v1.81.3.rc

📅 Jan 25, 2026📦 litellmView on GitHub →

✨ 13 features🐛 40 fixes🔧 7 symbols

Summary

This release focuses heavily on bug fixes across various providers (Bedrock, Ollama, Google) and infrastructure components like logging and routing. Key features include support for Azure OpenAI v1 API, enhanced retry mechanisms, and updates to Gemini and Volcengine integrations.

Migration Steps

Migrate Pillar Security to Generic Guardrail API (see docs for details)

✨ New Features

Add in-product nudge for claude code feedback survey + new learning centre
Add managed files support when load_balancing is True
Add ChatGPT subscription support and responses bridge
Add Redis-based migration lock with bug fixes
Add health check scripts and parallel execution support
feat: add retry_delay, exponential_backoff, and jitter to completion()
feat: add support for keda in helm chart
feat(azure): add support for Azure OpenAI v1 API
feat(gemini): use responseJsonSchema for Gemini 2.0+ models
[Feat] - Add self hosted Claude Code Plugin Marketplace
[Feat] UI - Allow Adding Claude Code Plugins
feat (volcengine) : Support Volcengine responses api
[feat] mcp version up

🐛 Bug Fixes

fix logfile and pidfile of supervisor for non root environment
Fix audio cost per second override (Note: This was reverted later, see below)
fix: add openai/dall-e base pricing entries
fix(tools): prevent OOM with nested $defs in tool schemas
chore: resolve ModuleNotFoundError for Microsoft Foundry Agents
[fix] responses api non OpenAI models
Fix: _handle_failure method getting called 2 times
Fix: upload pdfs for file endpoint
Fix: anthropic-beta is getting overriden and set to anthropic-beta
Fix Output None for replicate handler
fix(router): prevent retrying 4xx client errors
fix(agentcore): simplify agentcore streaming
fix(proxy): add /a2a/{agent_id}/.well-known/agent-card.json to agent_...
Fix: vector store sync issues
fix(responses): streaming with tool_choice allowed_tools
fix(logging): prevent duplicate StandardLoggingPayload logs
fix(langfuse_otel): ignore service logs and fix callback shadowing
fix(utils.py): correctly extract messages from google genai contents
[Fix] Bedrock stability model usage issues
fix/bedrock-inconsistent-postcall-hook
Fix HTML entity in survey description text
Fix : test_responses_streaming_failure_triggers_failure_handlers
[Fix] Claude Code x Bedrock Invoke fails with `advanced-tool-use-2025-11-20`
fix(realtime): disable SSL for ws:// WebSocket connections
fix: correct Groq gpt-oss pricing and add cache pricing
fix(gcs_bucket): prevent unbounded queue growth due to slow API calls
fix #19254 - [Bug]: litellm_params ignored by get_llm_provider function in completion() definition
fix(bedrock): deduplicate tool calls in assistant history (#15178)
[Fix] Fix Pass through routes to work with server root path
Fix #19357 - [Bug]: Tool call fails when using Ollama backend
fix: correct us.anthropic.claude-opus-4-5 In-region pricing
Fix queue persistence to Redis
fix: HTTP client memory leaks in Presidio, OpenAI, and Gemini
Fix: bedrock invoke claude 4 optional params #19318
Field-Existence Checks to Type Classes to Prevent Attribute Errors
fix(bedrock): handle thinking with tool calls for Claude 4 models
fix(responses): stream tool call events in completion bridge
fix: preserve tool output ordering for gemini in responses bridge
Fix extract_cacheable_prefix to handle string content with message-level cache_control
Revert "Fix audio cost per second override"

Affected Symbols

supervisor _handle_failure replicate handler completion()get_llm_provider function bedrock invoke extract_cacheable_prefix