
v1.82.5.dev.1

📦 litellm
✨ 11 features · 🐛 33 fixes · 🔧 34 symbols

Summary

This release focuses on bug fixes and stability improvements across providers such as Anthropic, Gemini, and Vertex AI. It also introduces new features, including an Akto Guardrails integration and prompt management support for the Responses API.

Migration Steps

  1. If you use Anthropic and rely on the default reasoning summary injection, note that an opt-out flag for the default reasoning summary has been added; review your configuration if you depend on the previous behavior.
  2. If you use Vertex AI for Gemini count_tokens, verify that your configuration sets vertex_count_tokens_location where a specific region is required.
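
The second step above can be sketched as a proxy model entry. Only vertex_count_tokens_location comes from these release notes; the alias, model id, and region below are hypothetical placeholders:

```python
# Hypothetical litellm proxy model entry (Python dict form of the YAML config).
# Only "vertex_count_tokens_location" comes from this release; the model names
# and region are illustrative placeholders.
model_entry = {
    "model_name": "claude-on-vertex",        # hypothetical alias
    "litellm_params": {
        "model": "vertex_ai/claude-sonnet",  # hypothetical model id
        "vertex_location": "us-east5",
        # New in this release: pin where count_tokens requests are sent.
        "vertex_count_tokens_location": "us-east5",
    },
}

print(model_entry["litellm_params"]["vertex_count_tokens_location"])
```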

✨ New Features

  • Added Akto Guardrails integration to LiteLLM.
  • Introduced support for gpt-5.4 mini and nano models.
  • Added prompt management support for the Responses API.
  • Implemented file_search to align emulated Responses behavior with native output.
  • Added per-model-group deployment affinity for the router.
  • Added control plane for multi-proxy worker management.
  • Added Audit Log Export to External Callbacks.
  • Added context circulation support for server-side tool combination in Gemini.
  • Added cache_control_injection_points support for tool_config location in Bedrock.
  • Added Team MCP Server Manager Role (later reverted).
  • Added Gemini/Vertex AI prompt caching support.
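
The Gemini/Vertex AI prompt caching entry above presumably reuses litellm's existing Anthropic-style cache_control content blocks. The payload below is a sketch under that assumption, with placeholder text and a commented-out model name:

```python
# Sketch: marking a large shared prefix as cacheable via a "cache_control"
# content block (litellm's existing Anthropic-style convention; whether the
# new Gemini/Vertex AI support uses the same shape is an assumption).
messages = [
    {
        "role": "system",
        "content": [
            {
                "type": "text",
                "text": "Very long shared context ...",  # placeholder
                "cache_control": {"type": "ephemeral"},  # cache this block
            }
        ],
    },
    {"role": "user", "content": "Question that reuses the cached context"},
]

# litellm.completion(model="vertex_ai/gemini-1.5-pro", messages=messages)
print(messages[0]["content"][0]["cache_control"]["type"])
```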

🐛 Bug Fixes

  • Preserved thinking.summary when routing Anthropic calls to OpenAI Responses API.
  • Resolved image token undercounting in Gemini usage metadata.
  • Aligned translate_thinking_for_model with default summary injection for Anthropic.
  • Skipped #transform=inline for base64 data URLs in Fireworks.
  • Avoided a "no running event loop" error during sync initialization for Langsmith.
  • Supported images in tool_results for Gemini /v1/messages routing.
  • Corrected supported_regions for Vertex AI DeepSeek models pricing.
  • Restored gpt-4-0314 pricing information.
  • Fixed Redis cluster caching.
  • Converted max_budget to float when set via environment variable in proxy.
  • Mapped Anthropic 'refusal' finish reason to 'content_filter'.
  • Fixed Vertex AI streaming finish_reason for gemini-3.1-flash-lite-preview to be 'stop' instead of 'tool_calls'.
  • Mapped Chat Completion file type to Responses API input_file.
  • Respected vertex_count_tokens_location for Claude count_tokens in Vertex AI.
  • Preserved cache directive on file-type content blocks in Anthropic.
  • Preserved diarization segments in transcription response for Mistral.
  • Passed model to context caching URL builder for custom api_base in Gemini.
  • Auto-routed gpt-5.4+ tools+reasoning to Responses API in Azure.
  • Passed subpath auth for non-admin users in proxy.
  • Checked rate limits before creating polling ID in polling mechanism.
  • Respected api_base and aws_bedrock_runtime_endpoint in count_tokens endpoint for Bedrock.
  • Converted task_type to camelCase taskType for Gemini Embeddings API.
  • Supported batch cancel via Vertex API for Vertex AI.
  • Preserved annotations in Bing Search grounding responses for Azure AI Agents.
  • Merged hidden_params into metadata for streaming requests in logging.
  • Fixed cost_per_second calculation for audio transcription models (later reverted).
  • Preserved reasoning_content on Pydantic Message objects in multi-turn tool calls for Moonshot.
  • Added team_member_budget_duration to NewTeamRequest in proxy.
  • Fixed mock for get_auth_header instead of get_api_key in Anthropic file content test.
  • Added additionalProperties: false for OpenAI strict mode in Anthropic adapter.
  • Fixed alternating roles logic.
  • Fixed global secret redaction via root logger + key-name-based pattern matching.
  • Fixed AntD messages not rendering in the UI.
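
Of the fixes above, the max_budget coercion is easy to illustrate: environment variables always arrive as strings, so budget comparisons need an explicit float conversion. A minimal sketch, assuming a hypothetical variable name:

```python
import os

# Env vars are always strings; comparing "100" against a float would fail.
os.environ["LITELLM_MAX_BUDGET"] = "100"  # hypothetical variable name

raw = os.environ["LITELLM_MAX_BUDGET"]
max_budget = float(raw)  # the fix: coerce before any budget arithmetic

spend = 42.5
print(spend < max_budget)  # prints: True
```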

Affected Symbols