
v1.82.5.dev.1

📦 litellm
✨ 11 features · 🐛 33 fixes · 🔧 34 symbols

Summary

This release focuses on bug fixes and stability improvements across providers such as Anthropic, Gemini, and Vertex AI. It also introduces new features, including an Akto Guardrails integration and prompt management support for the Responses API.

Migration Steps

  1. If you use Anthropic and rely on the default reasoning summary injection, note that an opt-out flag for the default reasoning summary has been added; review your configuration if you depend on the previous behavior.
  2. If you use Vertex AI for Gemini count_tokens, verify that your configuration sets vertex_count_tokens_location where a specific region is required.
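
The second step above can be sketched as a proxy model entry. Only vertex_count_tokens_location comes from these release notes; the alias, model id, and region below are hypothetical placeholders:

```python
# Hypothetical litellm proxy model entry (Python dict form of the YAML config).
# Only "vertex_count_tokens_location" comes from this release; the model names
# and region are illustrative placeholders.
model_entry = {
    "model_name": "claude-on-vertex",        # hypothetical alias
    "litellm_params": {
        "model": "vertex_ai/claude-sonnet",  # hypothetical model id
        "vertex_location": "us-east5",
        # New in this release: pin where count_tokens requests are sent.
        "vertex_count_tokens_location": "us-east5",
    },
}

print(model_entry["litellm_params"]["vertex_count_tokens_location"])
```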

✨ New Features

  • Added Akto Guardrails integration to LiteLLM.
  • Introduced support for gpt-5.4 mini and nano models.
  • Added prompt management support for the Responses API.
  • Implemented file_search to align emulated Responses behavior with native output.
  • Added per-model-group deployment affinity for the router.
  • Added control plane for multi-proxy worker management.
  • Added Audit Log Export to External Callbacks.
  • Added context circulation support for server-side tool combination in Gemini.
  • Added cache_control_injection_points support for tool_config location in Bedrock.
  • Added Team MCP Server Manager Role (later reverted).
  • Added Gemini/Vertex AI prompt caching support.
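
The Gemini/Vertex AI prompt caching entry above presumably reuses litellm's existing Anthropic-style cache_control content blocks. The payload below is a sketch under that assumption, with placeholder text and a commented-out model name:

```python
# Sketch: marking a large shared prefix as cacheable via a "cache_control"
# content block (litellm's existing Anthropic-style convention; whether the
# new Gemini/Vertex AI support uses the same shape is an assumption).
messages = [
    {
        "role": "system",
        "content": [
            {
                "type": "text",
                "text": "Very long shared context ...",  # placeholder
                "cache_control": {"type": "ephemeral"},  # cache this block
            }
        ],
    },
    {"role": "user", "content": "Question that reuses the cached context"},
]

# litellm.completion(model="vertex_ai/gemini-1.5-pro", messages=messages)
print(messages[0]["content"][0]["cache_control"]["type"])
```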

🐛 Bug Fixes

  • Preserved thinking.summary when routing Anthropic calls to OpenAI Responses API.
  • Resolved image token undercounting in Gemini usage metadata.
  • Aligned translate_thinking_for_model with default summary injection for Anthropic.
  • Skipped #transform=inline for base64 data URLs in Fireworks.
  • Avoided a "no running event loop" error during sync initialization for Langsmith.
  • Supported images in tool_results for Gemini /v1/messages routing.
  • Corrected supported_regions for Vertex AI DeepSeek models pricing.
  • Restored gpt-4-0314 pricing information.
  • Fixed Redis cluster caching.
  • Converted max_budget to float when set via environment variable in proxy.
  • Mapped Anthropic 'refusal' finish reason to 'content_filter'.
  • Fixed Vertex AI streaming finish_reason for gemini-3.1-flash-lite-preview to be 'stop' instead of 'tool_calls'.
  • Mapped Chat Completion file type to Responses API input_file.
  • Respected vertex_count_tokens_location for Claude count_tokens in Vertex AI.
  • Preserved cache directive on file-type content blocks in Anthropic.
  • Preserved diarization segments in transcription response for Mistral.
  • Passed model to context caching URL builder for custom api_base in Gemini.
  • Auto-routed gpt-5.4+ tools+reasoning to Responses API in Azure.
  • Passed subpath auth for non-admin users in proxy.
  • Checked rate limits before creating polling ID in polling mechanism.
  • Respected api_base and aws_bedrock_runtime_endpoint in count_tokens endpoint for Bedrock.
  • Converted task_type to camelCase taskType for Gemini Embeddings API.
  • Supported batch cancel via Vertex API for Vertex AI.
  • Preserved annotations in Bing Search grounding responses for Azure AI Agents.
  • Merged hidden_params into metadata for streaming requests in logging.
  • Fixed cost_per_second calculation for audio transcription models (later reverted).
  • Preserved reasoning_content on Pydantic Message objects in multi-turn tool calls for Moonshot.
  • Added team_member_budget_duration to NewTeamRequest in proxy.
  • Fixed mock for get_auth_header instead of get_api_key in Anthropic file content test.
  • Added additionalProperties: false for OpenAI strict mode in Anthropic adapter.
  • Fixed alternating roles logic.
  • Fixed global secret redaction via root logger + key-name-based pattern matching.
  • Fixed AntD messages not rendering in the UI.
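
Of the fixes above, the max_budget coercion is easy to illustrate: environment variables always arrive as strings, so budget comparisons need an explicit float conversion. A minimal sketch, assuming a hypothetical variable name:

```python
import os

# Env vars are always strings; comparing "100" against a float would fail.
os.environ["LITELLM_MAX_BUDGET"] = "100"  # hypothetical variable name

raw = os.environ["LITELLM_MAX_BUDGET"]
max_budget = float(raw)  # the fix: coerce before any budget arithmetic

spend = 42.5
print(spend < max_budget)  # prints: True
```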

Affected Symbols