
v1.80.10.rc.3

📦 litellm
✨ 21 features · 🐛 17 fixes · 🔧 29 symbols

Summary

This release adds several new providers and features, including Stability AI image generation, Azure Cohere reranking, and a VertexAI Agent Engine provider. It also fixes a wide range of bugs and refactors heavy imports to load lazily for better startup performance.
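The lazy-import refactor mentioned above follows a common Python pattern: defer the real import until a caller actually touches the module. The sketch below is illustrative of that general technique, not litellm's actual implementation; the `LazyModule` class is a hypothetical name.

```python
import importlib
import sys
import types


class LazyModule(types.ModuleType):
    """Module proxy that defers the real import until first attribute access.

    Creating the proxy is cheap; the heavy module is only loaded when a
    caller reads an attribute, which keeps initial import time low.
    """

    def __init__(self, name: str):
        super().__init__(name)
        self._real = None  # the genuine module, loaded on demand

    def __getattr__(self, attr: str):
        # Only called when normal attribute lookup fails, i.e. for
        # attributes of the wrapped module rather than of the proxy.
        if self._real is None:
            self._real = importlib.import_module(self.__name__)
        return getattr(self._real, attr)


mod = LazyModule("json")       # no real import work happens yet
result = mod.dumps({"a": 1})   # first attribute access triggers the import
```

The same effect can also be achieved at package level with PEP 562's module-level `__getattr__`; the proxy form shown here is simply self-contained.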

✨ New Features

  • Added Stability AI image generation support.
  • Added Azure Cohere 4 reranking models.
  • Extended Guardrails API to support LLM tool call response checks on /chat/completions, /v1/responses, and /v1/messages (including streaming).
  • Added OpenRouter models GPT 5.2, Mistral 3, and Devstral 2.
  • Introduced reasoning parameter for Fireworks AI models.
  • Added provider‑specific tools support in the responses API.
  • Implemented Guardrails content filter feature.
  • Added image_edit and aimage_edit support for custom LLMs.
  • Introduced new provider "Agent Gateway" with Pydantic AI agents integration.
  • Added new provider "VertexAI Agent Engine".
  • Added masking support and MCP call support for the Pillar provider.
  • Added support for Signed URLs with query parameters in image processing.
  • Added UI component for Milvus Vector Store.
  • Enabled downloading Prisma binaries at build time for security‑restricted environments.
  • Added custom headers support in the responses API.
  • Added support for agent skills in chat completions.
  • Added Venice.ai provider via providers.json.
  • Added extra_headers support for Gemini batch embeddings.
  • Propagated token usage when generating images with Gemini.
  • Added MCP auth header propagation.
  • Added support for JSON payloads (instead of form‑data) for Gemini image edit requests.
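Most of the providers above are addressed through litellm's `provider/model` string convention (e.g. a Vertex AI model is called with a `vertex_ai/` prefix). A minimal, illustrative sketch of that routing idea; the helper and its fallback behavior are assumptions for illustration, not litellm internals:

```python
def split_provider_model(model: str) -> tuple[str, str]:
    """Split a 'provider/model' string into (provider, model_name).

    Only the first '/' is treated as the separator, so model names that
    themselves contain slashes survive intact. Bare model names fall back
    to 'openai' here purely as an illustrative default.
    """
    if "/" in model:
        provider, _, name = model.partition("/")
        return provider, name
    return "openai", model


# Hypothetical prefixed model strings in the style of this release:
provider, name = split_provider_model("vertex_ai/gemini-pro")
```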

🐛 Bug Fixes

  • Fixed Gemini web search request count.
  • Fixed Perplexity cost calculation to use API‑provided cost.
  • Fixed Anthropic dynamic max_tokens based on model.
  • Passed credentials to PredictionServiceClient for Vertex AI custom endpoints.
  • Fixed `BaseModel` import in openai/responses/guardrail_translation.
  • Fixed cost calculation for gpt-image‑1 model.
  • Fixed MCP deepcopy error.
  • Set default max_tokens for Claude‑3‑7‑sonnet to 64K.
  • Added per‑token price for qwen3‑embedding‑8b input.
  • Switched Gemini image edit requests to JSON format.
  • Added missing headers to metadata for guardrails on pass‑through endpoints.
  • Skipped adding beta headers for Vertex AI where unsupported.
  • Added explicit `none` value to encoding_format instead of omitting it.
  • Fixed managed files endpoint handling.
  • Fixed model extraction from Vertex AI passthrough URLs in get_model_from_request().
  • Removed ttl field when routing to Bedrock.
  • Fixed header handling for guardrails on pass‑through endpoints.
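Per-token pricing fixes like the qwen3‑embedding‑8b one above follow the usual cost model: token count multiplied by a per-token rate from the model's pricing entry. A hedged sketch of that arithmetic; the function name is hypothetical and the rate below is a placeholder, not the real price:

```python
def embedding_input_cost(prompt_tokens: int, input_cost_per_token: float) -> float:
    """Compute input cost as tokens multiplied by the per-token price.

    Currency formatting and rounding are omitted; in practice the rate
    comes from a model pricing table, passed in explicitly here.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return prompt_tokens * input_cost_per_token


# Placeholder rate for illustration only (NOT an actual published price):
PLACEHOLDER_RATE = 0.5
cost = embedding_input_cost(3, PLACEHOLDER_RATE)
```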

🔧 Affected Symbols

`litellm/__init__.py`, `get_modified_max_tokens`, `LLMClientCache`, bedrock types, `types.utils`, dotprompt integration, default encoding from client decorator, heavy client decorator imports, heavy imports from `litellm.main`, `AmazonConverse`, `PredictionServiceClient`, Guardrails API classes, MCP auth header propagation logic, custom headers handling in responses API, `image_edit`, `aimage_edit`, StabilityAI provider class, AzureCohereRerank provider class, FireworksAI reasoning parameter, provider‑specific tools handling, AgentGateway provider class, VertexAIEngine provider class, Pillar masking functions, Signed URL processing utilities, Milvus vector store UI module, Prisma binary downloader, VeniceAI provider class, Gemini batch embeddings function, Gemini image generation token usage tracker