v1.83.14.rc.1
📦 litellmView on GitHub →
✨ 8 features🐛 27 fixes🔧 16 symbols
Summary
This release introduces Docker image signing via cosign for enhanced security and adds support for new models like GPT-5.5/5.4 snapshots and Bedrock GLM-5. Numerous fixes address streaming issues, model pricing accuracy, and security hardening across proxy and authentication layers.
Migration Steps
- If you rely on Docker images, verify the image signature using the provided cosign commands to ensure authenticity.
✨ New Features
- Added Docker image signing using cosign; verification instructions provided for pinned commit hash (recommended) or release tag.
- Added GLM-5 and Minimax M2.5 models to Bedrock with regional aliases.
- Added support for versioned GPT-5.4 mini/nano snapshots.
- Added Day-0 support for GPT-5.5 and GPT-5.5 Pro.
- Added LLM-as-a-Judge guardrail.
- Added `use_chat_completions_api` flag for OpenAI/ models with custom api_base to route requests through the Responses API bridge.
- Added `route_all_chat_openai_to_responses` global flag for OpenAI.
- Added Send Invitation Email Toggle for Users in the UI.
🐛 Bug Fixes
- Fixed preservation of `tool_use` input arguments in Anthropic adapter streaming.
- Fixed preservation of `role='assistant'` in Azure streaming when `include_usage` is set.
- Fixed mapping of Zhipu GLM non-standard finish_reason values.
- Applied GPT-5 temperature validation for Responses API.
- Fixed sorting of assistant content blocks in Bedrock so text precedes toolUse.
- Fixed filtering of parameters from Gemini embedding requests.
- Fixed reading of web search cost from model_info instead of hardcoding for Gemini.
- Fixed inclusion of DOCUMENT modality tokens in Gemini cost calculation.
- Fixed forwarding of dimensions parameter in Vertex AI multimodal embedding requests.
- Migrated 38 models from legacy `max_tokens` to `max_input_tokens`/`max_output_tokens` in model pricing.
- Updated Bedrock Claude Sonnet/Opus 4.6 pricing for tokens above 200k and set Sonnet 4.6 `max_input_tokens` to 1M.
- Restored BYOK key injection for vector store endpoints with team-scoped deployments in the router.
- Split MCP routes into inference vs management to unblock Admin UI on DISABLE_LLM_API_ENDPOINTS nodes.
- Automatically added SSO team members to the organization upon move for proxy admin only.
- Fixed respecting object-level permissions for managed vector store endpoints in the proxy.
- Fixed normalization of bridged object fields in Responses API.
- Fixed preserving Anthropic messages call type for /v1/messages logging.
- Stripped `custom_tool_call` namespace for all providers in Responses API.
- Fixed stripping of Gemini thought suffix from streaming tool_use ID in Anthropic adapter.
- Isolated master_key/prisma_client module globals between Proxy tests.
- Centralized common_checks to close authorization bypass in authentication.
- Hardened OAuth authorize/token endpoints (BYOK + discoverable) in MCP.
- Fixed logging of :embedContent and :batchEmbedContents responses in Vertex AI passthrough.
- Applied team TPM/RPM + attribution for admins using x-litellm-team-id in JWT authentication.
- Fixed Guardrail parameter handling in list and submission endpoints.
- Fixed proxy DB tests by semantically sharding them and removing CCI/GHA test duplication.
- Fixed proxy: single-team DB fallback when JWT has no team ID.