v1.80.16.dev6
📦 litellm
✨ 8 features · 🐛 39 fixes · ⚡ 1 deprecation · 🔧 23 symbols
Summary
This release introduces new model pricing, enhances guardrail functionality with a default failopen option, and includes numerous fixes across model integrations, pricing accuracy, and UI components. Performance bottlenecks under heavy load were also addressed.
Migration Steps
- If using Gemini, note that support for responseJsonSchema was temporarily reverted and is not available in this release.
✨ New Features
- Make the failopen option default to True on the grayswan guardrail.
- Add pricing for azure_ai/claude-opus-4-5.
- Add support for zero-cost models.
- Allow passing a scope ID for watsonx inferencing.
- Add Cerebras zai-glm-4.7 model support.
- Add retry policy support to responses API.
- Add contextual gap checks and word-form digits features.
- Add Azure Model Router on LiteLLM AI Gateway.
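The new grayswan fail-open default above would typically be set in the proxy config. A hedged sketch, using LiteLLM's standard guardrails/litellm_params config shape; the guardrail name and option keys here are illustrative, so check the release docs for the exact spelling:

```yaml
# Illustrative proxy config fragment (key names are assumptions, not verified API).
guardrails:
  - guardrail_name: "grayswan-guard"
    litellm_params:
      guardrail: grayswan        # provider identifier from this release
      mode: "pre_call"
      # As of this release the fail-open behavior defaults to True,
      # so this line is only needed to opt out:
      # failopen: false
```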
🐛 Bug Fixes
- Update OpenTelemetry semantic conventions to 1.38 (gen_ai attributes).
- Fix case-insensitive model cost map lookup.
- Fix image tokens spend logging for /images/generations.
- Fix header forwarding in bedrock passthrough.
- Fix model matching priority in configuration.
- Fix hoisting of thread grouping metadata (session_id, thread…) in langsmith.py.
- Properly handle custom guardrail parameters.
- Fix the UI to use the non-streaming method for the v1/a2a/message/send endpoint.
- Update novita model prices.
- Normalize OpenAI SDK BaseModel choices/messages to avoid Pydantic serializer warnings.
- Fix prompt deletion failing with Prisma FieldNotFoundError.
- Fix Swagger UI path execute error with server_root_path in OpenAPI schema.
- Correct context window sizes for GPT-5 model variants.
- Fix Ollama: set finish_reason to tool_calls and remove broken capability check.
- Sync DeepSeek chat/reasoner to V3.2 pricing.
- Fix exception mapping to handle exceptions without response parameter.
- Honor num_retries in litellm_params as set in the config.
- Correct cache_read pricing for gemini-2.5-pro models.
- Remove bottleneck causing high CPU usage & overhead under heavy load.
- Fix UI: Usage - Team ID and Team Name in Export Report.
- Do not fallback to token counter if disable_token_counter is enabled.
- Fix MCP rest auth checks.
- Fix duplicate telemetry generation in responses.
- Fix UI: Usage - Model Activity Chart Y Axis.
- Fix SCIM GET /Users error and enforce SCIM 2.0 compliance.
- Fix Anthropic during-call error handling.
- Fix guardrails to use clean error messages for blocked requests.
- Add handling for user-disabled mid-stream fallbacks.
- Fix responses content not being None.
- Fix Anthropic token counter with thinking.
- Fix Gemini Image Generation returning incorrect prompt_tokens_d….
- Correct max_input_tokens for GPT-5 models.
- Fix dynamic_rate_limiter_v3 TPM 25% limiting by ensuring priority.
- Fix Vertex AI: improve passthrough endpoint url parsing and construction.
- Fix Gemini: dereference $defs/$ref in tool response content.
- Fix preserving llm_provider-* headers in error responses.
- Fix keeping type field in Gemini schema when properties is empty.
- Fix Azure Grok prices.
- Fix Vertex AI: add type object to tool schemas missing type field.
Affected Symbols
grayswan guardrail, otel, langsmith.py, azure_ai/claude-opus-4-5, bedrock passthrough, OpenAI SDK BaseModel, watsonx inferencing, Prisma FieldNotFoundError, Swagger UI, Ollama, DeepSeek chat/reasoner, gemini-2.5-pro, OpenAI SDK BaseModel choices/messages, v1/a2a/message/send, novita models, GPT-5 model variants, num_retries in litellm_params, SCIM GET /Users, Anthropic, Gemini Image Generation, dynamic_rate_limiter_v3, vertex_ai passthrough endpoint, llm_provider-* headers
⚡ Deprecations
- Deprecate Cerebras zai-glm-4.6 model in favor of zai-glm-4.7.
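Migrating off the deprecated model is a one-line name change. A hedged config sketch using LiteLLM's standard model_list shape; the alias name is illustrative:

```yaml
# Illustrative proxy config fragment.
model_list:
  - model_name: zai-glm            # hypothetical alias
    litellm_params:
      model: cerebras/zai-glm-4.7  # was: cerebras/zai-glm-4.6
```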