Changelog

v1.80.15.rc.1

📦 litellm
✨ 18 features · 🐛 34 fixes · 🔧 32 symbols

Summary

This release introduces several new provider integrations, enhanced Prometheus metrics for monitoring, and numerous bug fixes across proxying, routing, and provider configurations. Performance improvements were also made to provider configuration lookups.

Migration Steps

  1. If you consume streaming responses from the proxy server, note that errors raised before the first chunk now return a JSON error response instead of SSE format (a client-side sketch follows).
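A minimal client-side sketch of the new behavior, assuming a proxy at `http://localhost:4000` (illustrative URL) and the OpenAI-compatible `/v1/chat/completions` route. Because an initial failure now arrives as a JSON body rather than an SSE event, a client can branch on the `Content-Type` header:

```python
import httpx

with httpx.stream(
    "POST",
    "http://localhost:4000/v1/chat/completions",  # illustrative proxy URL
    headers={"Authorization": "Bearer sk-1234"},  # placeholder key
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "hi"}],
        "stream": True,
    },
) as resp:
    if resp.headers.get("content-type", "").startswith("application/json"):
        # Initial error: a plain JSON error response, not an SSE stream.
        resp.read()
        print("request failed:", resp.json())
    else:
        # Normal case: consume server-sent events line by line.
        for line in resp.iter_lines():
            if line.startswith("data: "):
                print(line)
```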

✨ New Features

  • Calculate `total_tokens` manually when it is missing but derivable from input and output token counts (first sketch after this list).
  • Add a built-in migration lock to prevent concurrent `prisma migrate deploy` runs (note: this feature was reverted in this same release).
  • Add Prometheus metrics for request queue time and guardrails.
  • Add Anthropic cache control option to image tool call results.
  • Add Prometheus caching metrics for cache hits, misses, and tokens.
  • Add abliteration.ai provider support.
  • Add OpenRouter embeddings API support.
  • Support toggling tag matching between ANY and ALL for tag-based routing (second sketch after this list).
  • Add custom proxy base URL support to Playground UI.
  • Add Key and Team Router Setting functionality.
  • Add Focus export support.
  • Add Qualifire eval webhook.
  • Add support for Vertex AI API keys (third sketch after this list).
  • Add MCP registry support.
  • Add Bedrock as a backend API for token counting.
  • Add memory leak detection tests with CI integration.
  • Add Endpoint Activity to the Usage page in the UI.
  • Update the prices JSON for the Novita provider.
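A few of the items above lend themselves to short sketches. First, the `total_tokens` backfill is a simple derivation; this is illustrative logic, not litellm's actual implementation:

```python
def backfill_total_tokens(usage: dict) -> dict:
    """Derive total_tokens when a provider omits it but reports both counts."""
    if usage.get("total_tokens") is None:
        prompt = usage.get("prompt_tokens")
        completion = usage.get("completion_tokens")
        if prompt is not None and completion is not None:
            usage["total_tokens"] = prompt + completion
    return usage

print(backfill_total_tokens({"prompt_tokens": 12, "completion_tokens": 30}))
# {'prompt_tokens': 12, 'completion_tokens': 30, 'total_tokens': 42}
```

Second, the ANY/ALL tag-matching toggle. The shape below follows litellm's documented tag-based routing (`tags` in `litellm_params`, `enable_tag_filtering` on the `Router`); the `tag_filtering_mode` kwarg is a hypothetical name standing in for whatever flag this release actually introduces:

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-4o-mini",
            "litellm_params": {"model": "gpt-4o-mini", "tags": ["teamA", "prod"]},
        },
        {
            "model_name": "gpt-4o-mini",
            "litellm_params": {"model": "gpt-4o-mini", "tags": ["teamB"]},
        },
    ],
    enable_tag_filtering=True,
    # Hypothetical kwarg: "all" requires a deployment to carry every request
    # tag; "any" keeps the previous at-least-one-match behavior.
    tag_filtering_mode="all",
)

response = router.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hello"}],
    metadata={"tags": ["teamA", "prod"]},  # must all match under "all"
)
```

Third, Vertex AI API keys. This assumes the new path accepts a plain API key in place of service-account credentials; the model name and key are placeholders:

```python
import litellm

response = litellm.completion(
    model="vertex_ai/gemini-2.0-flash",  # illustrative model
    messages=[{"role": "user", "content": "hi"}],
    api_key="AIza-placeholder",  # Vertex AI / Google API key
)
```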

🐛 Bug Fixes

  • Revert the addition of the built-in migration lock.
  • Proxy: Return JSON error response instead of SSE format for initial streaming errors.
  • Fix the embeddings `call_type` in the guardrail pre-call hook.
  • Prevent duplicate User-Agent tags in request_tags.
  • Make `base_connection_pool_limit` default value consistent.
  • Braintrust: Pass `span_attributes` in async logging and skip tags on non-root spans.
  • Fix Gemini snake_case support for `google_search` tool parameters.
  • Proxy: Use async anthropic client to prevent event loop blocking.
  • Properly use litellm api keys.
  • Add an index on `LOWER(user_email)` for faster duplicate-email checks.
  • Mask extra header secrets in model info.
  • Add `xiaomi_mimo` to LlmProviders enum to fix router support.
  • Fix the labeling workflow by using a working regex pattern.
  • Proactively refresh RDS IAM tokens to prevent connection failures when tokens expire after 15 minutes.
  • Normalize Proxy Config Callback.
  • Fix CloudZero SQL execution.
  • Improve error messages and validation for wildcard routing with multiple credentials.
  • Add `thought_signatures` to `VertexGeminiConfig`, with a test.
  • Fix Nova model detection for the Bedrock provider.
  • Prevent expired keys from being leaked in plaintext in error responses.
  • Prevent Prisma migration workflow from running in forks.
  • Add `logprobs` support for the Azure OpenAI GPT-5.2 model.
  • Fix `response_format` leaking into `extra_body`.
  • Fix a missing field in litellm SDK embedding headers.
  • Fix google_genai streaming adapter provider handling.
  • Align `max_tokens` with `max_output_tokens` for consistency.
  • Watsonx Audio Transcription: filter model field.
  • Enforce organization-level max budget.
  • Fix `bedrock_cache`, metadata, and `max_model_budget` handling.
  • Fix `test_count_tokens_caching`.
  • Fix UI login case sensitivity.
  • Fix MCP error when using multiple servers.
  • Add custom CA certificate support to boto3 clients.
  • Fix `turn_off_message_logging` not redacting request messages in the `proxy_server_request` field when stored to the database (see the sketch after this list).
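A minimal sketch of the redaction flag behind the last fix, assuming litellm's module-level `turn_off_message_logging` setting (the proxy equivalent lives under `litellm_settings` in the config file):

```python
import litellm

# Redact message content from logging payloads. After this fix, redaction
# also covers the proxy_server_request field the proxy persists to its
# database, not only payloads sent to logging callbacks.
litellm.turn_off_message_logging = True

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "sensitive content"}],
)
# Logging integrations now receive a redacted placeholder instead of the
# raw message text.
```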

🔧 Affected Symbols

Anthropic · Prisma migrate deploy · User-Agent · base_connection_pool_limit · abliteration.ai · braintrust · span_attributes · google_search · gemini · anthropic client · litellm api keys · user_email · xiaomi_mimo · LlmProviders · VertexGeminiConfig · Bedrock · Nova model · Azure OpenAI GPT-5.2 · logprobs · response_format · extra_body · litellm sdk embedding headers · google_genai streaming adapter · max_tokens · max_output_tokens · Watsonx Audio Transcription · bedrock_cache · test_count_tokens_caching · mcp · boto3 clients · turn_off_message_logging · proxy_server_request