Change8

v1.81.9-nightly

📦 litellmView on GitHub →
15 features🐛 29 fixes🔧 14 symbols

Summary

This release focuses heavily on bug fixes across various providers (including Vertex AI, GigaChat, and Anthropic), significant UI enhancements for administration and usage tracking, and expanded model support, notably for Claude Opus 4.6 and new OpenRouter/ElevenLabs models.

Migration Steps

  1. If using A2A agents deployed with localhost/internal URLs in agent cards, review configuration due to fixes in A2A Agent Gateway.
  2. If relying on specific SSE formatting for message/stream endpoint related to A2A, note that the change to use text/event-stream SSE format was reverted.

✨ New Features

  • Added faster linting targets for development workflow.
  • UI: Show Config Defined Search Tools.
  • UI: Add support for MCP Semantic Filtering.
  • OpenRouter: Added Qwen3-235B models.
  • Support TTL(1h) field in prompt caching for Bedrock Claude 4.5 models.
  • Added `claude-opus-4-6` to model cost map.
  • Added Claude Opus 4.6 support.
  • UI: Add soft_budget to Team Table + Create/Update Endpoints.
  • Web Search: Added gpt-5-search-api model and documentation clarifications.
  • Added ElevenLabs eleven_v3 and eleven_multilingual_v2 to model cost map.
  • Team Soft Budget Email Alerts.
  • UI: Admin Settings: Add option for Authentication for public AI Hub.
  • Added au version of `claude-opus-4-6` to model cost map.
  • Add http support to custom code guardrails + Unified guardrails for MCP + Agent guardrail support.
  • MCP Gateway: Allow setting MCP Servers as Private/Public available on Internet.

🐛 Bug Fixes

  • Fixed search tools not being found when using per-request routers.
  • Fixed Langfuse OpenTelemetry trace issues.
  • Preserved streaming content on guardrail-sampled chunks.
  • Fixed Unique Constraint on Daily Tables + Logging When Updates Fail.
  • Fixed mypy regression: TypedDict key error in fireworks_ai transformation.
  • Fixed use of text/event-stream SSE format for message/stream endpoint (reverted in subsequent PR).
  • Fixed inconsistent response format in anthropic.messages.acreate() when using non-anthropic providers.
  • Removed unused Any/cast imports in github_copilot transformation.
  • Disabled merging of consecutive user messages for GigaChat provider.
  • Fixed Vertex AI Gemini streaming content_filter handling.
  • Fixed 404 Not Found on /api/event_logging/batch endpoint.
  • Adjusted daily spend date filtering for user timezone in UI.
  • Fixed Non Root Dockerfile: Kept package-lock.json.
  • Fixed test isolation for test_watsonx_gpt_oss_prompt_transformation.
  • Fixed test isolation for test_log_langfuse_v2_handles_null_usage_values.
  • Guardrails API: Ensured OpenAI Moderations Guard works with OpenAI Embeddings.
  • Ensured gcs_bucket_name passes through correctly.
  • Added array type checks for model, agent, and MCP hub data.
  • Aligned Claude Opus 4.6 metadata and limits.
  • Added unsupported claude code beta headers in json.
  • Fixed budget metrics parallelization, caching bug, and reduced CPU usage in Prometheus.
  • Warned when budget lookup fails; cache won't populate.
  • Fixed UI - Model Info Page: Input and Output Labels.
  • Fixed UI - Model Page: Column Resizing on Smaller Screens.
  • Fixed OAuth2 'Capabilities: none' bug for upstream MCP servers.
  • A2a Agent Gateway Fixes: Addressed issue with A2A agents deployed with localhost/internal URLs in agent cards.
  • Re-issued fix for Keys and Teams Router Setting + Allowed Override of Router Settings.
  • Fixed SSO: Extracted user roles from JWT access token for Keycloak compatibility.
  • Fixed mypy issues: resolved missing return statements and type casting issues.

Affected Symbols