Change8

v1.83.14.rc.1

📦 litellmView on GitHub →
8 features🐛 27 fixes🔧 16 symbols

Summary

This release introduces Docker image signing via cosign for enhanced security and adds support for new models like GPT-5.5/5.4 snapshots and Bedrock GLM-5. Numerous fixes address streaming issues, model pricing accuracy, and security hardening across proxy and authentication layers.

Migration Steps

  1. If you rely on Docker images, verify the image signature using the provided cosign commands to ensure authenticity.

✨ New Features

  • Added Docker image signing using cosign; verification instructions provided for pinned commit hash (recommended) or release tag.
  • Added GLM-5 and Minimax M2.5 models to Bedrock with regional aliases.
  • Added support for versioned GPT-5.4 mini/nano snapshots.
  • Added Day-0 support for GPT-5.5 and GPT-5.5 Pro.
  • Added LLM-as-a-Judge guardrail.
  • Added `use_chat_completions_api` flag for OpenAI/ models with custom api_base to route requests through the Responses API bridge.
  • Added `route_all_chat_openai_to_responses` global flag for OpenAI.
  • Added Send Invitation Email Toggle for Users in the UI.

🐛 Bug Fixes

  • Fixed preservation of `tool_use` input arguments in Anthropic adapter streaming.
  • Fixed preservation of `role='assistant'` in Azure streaming when `include_usage` is set.
  • Fixed mapping of Zhipu GLM non-standard finish_reason values.
  • Applied GPT-5 temperature validation for Responses API.
  • Fixed sorting of assistant content blocks in Bedrock so text precedes toolUse.
  • Fixed filtering of parameters from Gemini embedding requests.
  • Fixed reading of web search cost from model_info instead of hardcoding for Gemini.
  • Fixed inclusion of DOCUMENT modality tokens in Gemini cost calculation.
  • Fixed forwarding of dimensions parameter in Vertex AI multimodal embedding requests.
  • Migrated 38 models from legacy `max_tokens` to `max_input_tokens`/`max_output_tokens` in model pricing.
  • Updated Bedrock Claude Sonnet/Opus 4.6 pricing for tokens above 200k and set Sonnet 4.6 `max_input_tokens` to 1M.
  • Restored BYOK key injection for vector store endpoints with team-scoped deployments in the router.
  • Split MCP routes into inference vs management to unblock Admin UI on DISABLE_LLM_API_ENDPOINTS nodes.
  • Automatically added SSO team members to the organization upon move for proxy admin only.
  • Fixed respecting object-level permissions for managed vector store endpoints in the proxy.
  • Fixed normalization of bridged object fields in Responses API.
  • Fixed preserving Anthropic messages call type for /v1/messages logging.
  • Stripped `custom_tool_call` namespace for all providers in Responses API.
  • Fixed stripping of Gemini thought suffix from streaming tool_use ID in Anthropic adapter.
  • Isolated master_key/prisma_client module globals between Proxy tests.
  • Centralized common_checks to close authorization bypass in authentication.
  • Hardened OAuth authorize/token endpoints (BYOK + discoverable) in MCP.
  • Fixed logging of :embedContent and :batchEmbedContents responses in Vertex AI passthrough.
  • Applied team TPM/RPM + attribution for admins using x-litellm-team-id in JWT authentication.
  • Fixed Guardrail parameter handling in list and submission endpoints.
  • Fixed proxy DB tests by semantically sharding them and removing CCI/GHA test duplication.
  • Fixed proxy: single-team DB fallback when JWT has no team ID.

Affected Symbols