v1.81.3.rc
📦 litellmView on GitHub →
✨ 13 features🐛 40 fixes🔧 7 symbols
Summary
This release focuses heavily on bug fixes across various providers (Bedrock, Ollama, Google) and infrastructure components like logging and routing. Key features include support for Azure OpenAI v1 API, enhanced retry mechanisms, and updates to Gemini and Volcengine integrations.
Migration Steps
- Migrate Pillar Security to Generic Guardrail API (see docs for details)
✨ New Features
- Add in-product nudge for claude code feedback survey + new learning centre
- Add managed files support when load_balancing is True
- Add ChatGPT subscription support and responses bridge
- Add Redis-based migration lock with bug fixes
- Add health check scripts and parallel execution support
- feat: add retry_delay, exponential_backoff, and jitter to completion()
- feat: add support for keda in helm chart
- feat(azure): add support for Azure OpenAI v1 API
- feat(gemini): use responseJsonSchema for Gemini 2.0+ models
- [Feat] - Add self hosted Claude Code Plugin Marketplace
- [Feat] UI - Allow Adding Claude Code Plugins
- feat (volcengine) : Support Volcengine responses api
- [feat] mcp version up
🐛 Bug Fixes
- fix logfile and pidfile of supervisor for non root environment
- Fix audio cost per second override (Note: This was reverted later, see below)
- fix: add openai/dall-e base pricing entries
- fix(tools): prevent OOM with nested $defs in tool schemas
- chore: resolve ModuleNotFoundError for Microsoft Foundry Agents
- [fix] responses api non OpenAI models
- Fix: _handle_failure method getting called 2 times
- Fix: upload pdfs for file endpoint
- Fix: anthropic-beta is getting overriden and set to anthropic-beta
- Fix Output None for replicate handler
- fix(router): prevent retrying 4xx client errors
- fix(agentcore): simplify agentcore streaming
- fix(proxy): add /a2a/{agent_id}/.well-known/agent-card.json to agent_...
- Fix: vector store sync issues
- fix(responses): streaming with tool_choice allowed_tools
- fix(logging): prevent duplicate StandardLoggingPayload logs
- fix(langfuse_otel): ignore service logs and fix callback shadowing
- fix(utils.py): correctly extract messages from google genai contents
- [Fix] Bedrock stability model usage issues
- fix/bedrock-inconsistent-postcall-hook
- Fix HTML entity in survey description text
- Fix : test_responses_streaming_failure_triggers_failure_handlers
- [Fix] Claude Code x Bedrock Invoke fails with `advanced-tool-use-2025-11-20`
- fix(realtime): disable SSL for ws:// WebSocket connections
- fix: correct Groq gpt-oss pricing and add cache pricing
- fix(gcs_bucket): prevent unbounded queue growth due to slow API calls
- fix #19254 - [Bug]: litellm_params ignored by get_llm_provider function in completion() definition
- fix(bedrock): deduplicate tool calls in assistant history (#15178)
- [Fix] Fix Pass through routes to work with server root path
- Fix #19357 - [Bug]: Tool call fails when using Ollama backend
- fix: correct us.anthropic.claude-opus-4-5 In-region pricing
- Fix queue persistence to Redis
- fix: HTTP client memory leaks in Presidio, OpenAI, and Gemini
- Fix: bedrock invoke claude 4 optional params #19318
- Field-Existence Checks to Type Classes to Prevent Attribute Errors
- fix(bedrock): handle thinking with tool calls for Claude 4 models
- fix(responses): stream tool call events in completion bridge
- fix: preserve tool output ordering for gemini in responses bridge
- Fix extract_cacheable_prefix to handle string content with message-level cache_control
- Revert "Fix audio cost per second override"