
v1.80.16.dev6

📦 litellm
✨ 8 features · 🐛 39 fixes · ⚡ 1 deprecation · 🔧 23 symbols

Summary

This release introduces new model pricing, adds a fail-open option (default on) to the grayswan guardrail, and includes numerous fixes across model integrations, pricing accuracy, and UI components. It also removes a performance bottleneck that caused high CPU usage under heavy load.

Migration Steps

  1. If using Gemini, note that support for responseJsonSchema was temporarily reverted and is not available in this release.

✨ New Features

  • Add a failopen option (defaulting to True) to the grayswan guardrail.
  • Add pricing for azure_ai/claude-opus-4-5.
  • Add support for 0 cost models.
  • Allow passing scope id for watsonx inferencing.
  • Add Cerebras zai-glm-4.7 model support.
  • Add retry policy support to responses API.
  • Add contextual gap checks and word-form digits features.
  • Add Azure Model Router on LiteLLM AI Gateway.
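The fail-open behavior added to the grayswan guardrail above can be illustrated generically: if the guardrail service itself errors (as opposed to flagging the content), the request is allowed through rather than blocked. This is a minimal sketch of the semantics; the names below are hypothetical and are not litellm's actual API.

```python
def apply_guardrail(check, request, fail_open=True):
    """Run a guardrail check; if the check *errors* (not a content
    violation), allow the request when fail_open is True."""
    try:
        allowed = check(request)
    except Exception:
        # Guardrail infrastructure failed, not the content check itself.
        return fail_open
    return allowed

def flaky_check(request):
    # Simulates a guardrail-service outage.
    raise ConnectionError("guardrail service unreachable")

print(apply_guardrail(flaky_check, "hello", fail_open=True))   # request proceeds
print(apply_guardrail(flaky_check, "hello", fail_open=False))  # request blocked
```

With fail-open as the default, a guardrail outage degrades to "no guardrail" rather than taking down all traffic; deployments that prefer fail-closed can opt out.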

πŸ› Bug Fixes

  • Update OpenTelemetry semantic conventions to 1.38 (gen_ai attributes).
  • Fix case-insensitive model cost map lookup.
  • Fix image tokens spend logging for /images/generations.
  • Fix header forwarding in bedrock passthrough.
  • Fix model matching priority in configuration.
  • Hoist thread-grouping metadata (session_id, thread…) in langsmith.py.
  • Properly handle custom guardrail parameters.
  • Use a non-streaming method for the v1/a2a/message/send endpoint in the UI.
  • Update Novita model prices.
  • Normalize OpenAI SDK BaseModel choices/messages to avoid Pydantic serializer warnings.
  • Fix prompt deletion failing with Prisma FieldNotFoundError.
  • Fix Swagger UI path execute error with server_root_path in OpenAPI schema.
  • Correct context window sizes for GPT-5 model variants.
  • Fix Ollama: set finish_reason to tool_calls and remove broken capability check.
  • Sync DeepSeek chat/reasoner to V3.2 pricing.
  • Fix exception mapping to handle exceptions without response parameter.
  • Fix num_retries in litellm_params as per config.
  • Correct cache_read pricing for gemini-2.5-pro models.
  • Remove bottleneck causing high CPU usage & overhead under heavy load.
  • Fix UI: Usage - Team ID and Team Name in Export Report.
  • Do not fallback to token counter if disable_token_counter is enabled.
  • Fix MCP rest auth checks.
  • Fix duplicate telemetry generation in responses.
  • Fix UI: Usage - Model Activity Chart Y Axis.
  • Fix SCIM GET /Users error and enforce SCIM 2.0 compliance.
  • Fix Anthropic during-call error handling.
  • Fix guardrails to use clean error messages for blocked requests.
  • Add handling for user-disabled mid-stream fallbacks.
  • Fix responses content not being None.
  • Fix Anthropic token counter with thinking.
  • Fix Gemini Image Generation returning incorrect prompt_tokens_d….
  • Correct max_input_tokens for GPT-5 models.
  • Fix dynamic_rate_limiter_v3: correct 25% TPM limiting by ensuring priority.
  • Fix Vertex AI: improve passthrough endpoint url parsing and construction.
  • Fix Gemini: dereference $defs/$ref in tool response content.
  • Fix preserving llm_provider-* headers in error responses.
  • Fix keeping type field in Gemini schema when properties is empty.
  • Fix Azure Grok prices.
  • Fix Vertex AI: add type object to tool schemas missing type field.
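The case-insensitive model cost map lookup fixed above can be sketched in a few lines: normalize both the stored keys and the queried model name before matching, so a differently cased model string still resolves. The cost-map contents below are made up for the example, not litellm's actual pricing data.

```python
# Hypothetical cost map; real entries live in litellm's model cost map.
MODEL_COST_MAP = {
    "azure_ai/claude-opus-4-5": {"input_cost_per_token": 5e-06},
    "cerebras/zai-glm-4.7": {"input_cost_per_token": 0.0},  # 0-cost models are valid
}

def get_model_cost(model: str):
    # Lowercase both sides so "Azure_AI/Claude-Opus-4-5" matches the stored key.
    lowered = {k.lower(): v for k, v in MODEL_COST_MAP.items()}
    return lowered.get(model.lower())

print(get_model_cost("Azure_AI/Claude-Opus-4-5"))
```

Note the second entry: with support for 0-cost models (listed under features above), a cost of 0.0 must be treated as a valid price rather than as "no pricing found".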

🔧 Affected Symbols

⚡ Deprecations

  • Deprecate Cerebras zai-glm-4.6 model in favor of zai-glm-4.7.