v1.82.6-nightly
📦 litellm
✨ 9 features · 🐛 34 fixes · 🔧 17 symbols
Summary
This release focuses heavily on stability and feature parity across various providers, including significant fixes for Anthropic reasoning summaries, Gemini image handling, and Vertex AI pricing/streaming. It also introduces new features like Akto Guardrails integration and control plane management.
Migration Steps
- If you use Anthropic and want to opt out of the default reasoning summary injection, use the new opt-out flag.
- If you rely on specific behavior of Vertex AI count_tokens for Claude, ensure vertex_count_tokens_location is set correctly.
- If you relied on the previous cost_per_second calculation for audio transcription models, note that the fix shipped here was later reverted, so either the original behavior or a subsequent fix now applies; verify the costs your deployment reports.
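As a hedged illustration of the vertex_count_tokens_location step, a proxy model entry might pin the setting like this (the exact key placement is an assumption based on litellm's usual litellm_params layout; the project and region values are placeholders):

```yaml
# Sketch only: assumes vertex_count_tokens_location is accepted under
# litellm_params alongside the other Vertex AI settings. Verify against
# your litellm version's proxy config docs before relying on it.
model_list:
  - model_name: claude-on-vertex
    litellm_params:
      model: vertex_ai/claude-sonnet            # placeholder model name
      vertex_project: my-gcp-project            # placeholder project ID
      vertex_location: us-east5                 # placeholder region
      vertex_count_tokens_location: us-east5    # region used for count_tokens calls
```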
✨ New Features
- Added Akto Guardrails integration to LiteLLM.
- Introduced support for gpt-5.4 mini and nano models.
- Added prompt management support for the Responses API.
- Implemented file_search to align emulated Responses behavior with native output.
- Added per-model-group deployment affinity for the router.
- Added control plane for multi-proxy worker management.
- Added Audit Log Export to External Callbacks.
- Added support for context circulation for server-side tool combination in Gemini.
- Added support for cache_control_injection_points for tool_config location in Bedrock.
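The last feature above can be sketched as a config fragment (a sketch only: the list structure mirrors litellm's existing cache_control_injection_points usage, and the tool_config location value is assumed from the entry's wording):

```yaml
# Sketch only: assumes cache_control_injection_points is set under
# litellm_params for a Bedrock model and now accepts a "tool_config"
# location. Check the litellm docs for the exact placement.
model_list:
  - model_name: claude-on-bedrock
    litellm_params:
      model: bedrock/anthropic.claude-sonnet    # placeholder model name
      cache_control_injection_points:
        - location: tool_config   # new location added in this release
```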
🐛 Bug Fixes
- Preserved thinking.summary when routing Anthropic requests to OpenAI Responses API.
- Resolved image token undercounting in usage metadata for Gemini.
- Aligned translate_thinking_for_model with default summary injection for Anthropic.
- Skipped #transform=inline for base64 data URLs for Fireworks.
- Avoided no running event loop during sync initialization for Langsmith.
- Supported images in tool_results for /v1/messages routing in Gemini.
- Corrected supported_regions for Vertex AI DeepSeek models in model-prices.
- Restored gpt-4-0314 pricing information.
- Fixed Redis cluster caching.
- Converted max_budget to float when set via environment variable in proxy.
- Mapped Anthropic 'refusal' finish reason to 'content_filter'.
- Fixed streaming finish_reason for gemini-3.1-flash-lite-preview on Vertex AI to be 'stop' instead of 'tool_calls'.
- Mapped Chat Completion file type to Responses API input_file.
- Respected vertex_count_tokens_location for Claude count_tokens on Vertex AI.
- Preserved cache directive on file-type content blocks for Anthropic.
- Preserved diarization segments in transcription response for Mistral.
- Passed model to context caching URL builder for custom api_base in Gemini.
- Auto-routed gpt-5.4+ tools+reasoning to Responses API for Azure.
- Fixed cost_per_second calculation for audio transcription models (this fix was later reverted in a follow-up PR).
- Preserved reasoning_content on Pydantic Message objects in multi-turn tool calls for Moonshot.
- Passed subpath auth for non-admin users in proxy.
- Checked rate limits before creating polling ID in polling mechanism.
- Respected api_base and aws_bedrock_runtime_endpoint in count_tokens endpoint for Bedrock.
- Converted task_type to camelCase taskType for Gemini embeddings API.
- Supported batch cancel via Vertex API for Vertex AI.
- Preserved annotations in Bing Search grounding responses for Azure AI Agents.
- Merged hidden_params into metadata for streaming requests in logging.
- Added team_member_budget_duration to NewTeamRequest in proxy.
- Mocked get_auth_header instead of get_api_key in anthropic file content test.
- Added additionalProperties: false for OpenAI strict mode in Anthropic adapter.
- Fixed global secret redaction via root logger + key-name-based pattern matching.
- Fixed key tags dropdown creation.