v1.81.9.rc.1
📦 litellmView on GitHub →
✨ 17 features🐛 28 fixes🔧 15 symbols
Summary
This release focuses heavily on bug fixes across various providers (including Anthropic, Vertex AI, and GigaChat) and significant enhancements to the UI, including new budget management features and model cost map updates. Key improvements were made to tracing, caching performance, and general stability.
Migration Steps
- If using A2A agents deployed with localhost/internal URLs in agent cards (e.g., http://0.0.0.0:8001/), review configuration due to fixes in A2A Agent Gateway.
- If relying on specific behavior related to merging consecutive user messages for GigaChat, note that this is now disabled.
✨ New Features
- Added faster linting targets for development workflow.
- UI: Show Config Defined Search Tools.
- UI: Add support for MCP Semantic Filtering.
- OpenRouter: Added Qwen3-235B models.
- Support TTL(1h) field in prompt caching for Bedrock Claude 4.5 models.
- CLI arguments for RDS IAM auth.
- [Feat] add `claude-opus-4-6` to model cost map.
- Add Claude Opus 4.6 support.
- UI: Add soft_budget to Team Table + Create/Update Endpoints.
- Web Search: Added gpt-5-search-api model and documentation clarifications.
- Added ElevenLabs eleven_v3 and eleven_multilingual_v2 to model cost map.
- Full support for Opus 4.6 (Anthropic, Azure AI, Bedrock, Vertex AI).
- Team Soft Budget Email Alerts.
- UI: Admin Settings: Add option for Authentication for public AI Hub.
- Added au version of `claude-opus-4-6` to model cost map.
- Add http support to custom code guardrails + Unified guardrails for MCP + Agent guardrail support.
- MCP Gateway: Allow setting MCP Servers as Private/Public available on Internet.
🐛 Bug Fixes
- Fixed search tools not being found when using per-request routers.
- Fixed Langfuse OpenTelemetry trace issues.
- Preserved streaming content on guardrail-sampled chunks.
- Fixed unique constraint on Daily Tables + Logging when updates fail.
- Fixed mypy regression: TypedDict key error in fireworks_ai transformation.
- Fixed inconsistent response format in anthropic.messages.acreate() when using non-Anthropic providers.
- Fixed unused Any/cast imports in github_copilot transformation.
- Disabled merging of consecutive user messages for GigaChat provider.
- Fixed Vertex AI Gemini streaming content_filter handling.
- Fixed 404 Not Found on /api/event_logging/batch endpoint.
- Fixed UI daily spend date filtering for user timezone.
- Fixed Non Root Dockerfile: Kept package-lock.json.
- Fixed test isolation for test_watsonx_gpt_oss_prompt_transformation.
- Fixed test isolation for test_log_langfuse_v2_handles_null_usage_values.
- Guardrails API: Ensured OpenAI Moderations Guard works with OpenAI Embeddings.
- Fixed gcs_bucket_name passing issue.
- Fixed array type checks for model, agent, and MCP hub data.
- Aligned Claude Opus 4.6 metadata and limits.
- Added unsupported claude code beta headers in json.
- Fixed budget metrics caching bug and reduced CPU usage in Prometheus.
- Added INFO-level session reuse logging per request.
- UI: Fixed Model Info Page: Input and Output Labels.
- UI: Fixed Model Page: Column Resizing on Smaller Screens.
- Fixed OAuth2 'Capabilities: none' bug for upstream MCP servers.
- A2a Agent Gateway Fixes: Addressed issue with A2A agents deployed with localhost/internal URLs in agent cards.
- Fixed Keys and Teams Router Setting + Allowed Override of Router Settings.
- Fixed SSO: Extracted user roles from JWT access token for Keycloak compatibility.
- Fixed mypy issues: resolved missing return statements and type casting issues.
Affected Symbols
fireworks_ai transformationgithub_copilot transformationanthropic.messages.acreate()Vertex AI GeminiBedrock Claude 4.5 modelsGigaChat providerOpenAI Embeddingsgpt-5-search-apieleven_v3eleven_multilingual_v2Claude Opus 4.6watsonx_gpt_oss_prompt_transformationlog_langfuse_v2_handles_null_usage_valuesKeycloak SSOMCP Gateway