v1.87.0-rc.2
📦 litellm
✨ 10 features · 🐛 21 fixes · 🔧 22 symbols
Summary
This release focuses on stability, with bug fixes across several providers (Bedrock, Cohere, DeepSeek, Vertex AI). New features include Gemini 3.5 Flash support and user_email/user_alias labels on Prometheus user budget metrics. It also introduces instructions for verifying Docker image signatures.
Migration Steps
- Users running Docker images should verify image signatures using the provided cosign commands (pinned commit hash recommended).
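The cosign commands themselves are not reproduced in these notes. As a minimal sketch of what a pinned verification call could look like, the snippet below only assembles a `cosign verify --key <key> <image>` command line. The repository path, the `cosign.pub` key filename, the image name, and the `<commit-hash>` and `<tag>` placeholders are all assumptions for illustration; take the real values from the release page.

```python
import shlex


def pinned_key_url(repo: str, commit: str, path: str = "cosign.pub") -> str:
    """Build a raw-file URL pinned to one commit.

    The key location (repo layout and filename) is an assumption, not taken
    from the release notes.
    """
    return f"https://raw.githubusercontent.com/{repo}/{commit}/{path}"


def build_cosign_verify_command(image: str, key_ref: str) -> list[str]:
    """Return the argv for `cosign verify --key <key_ref> <image>`."""
    if not image or not key_ref:
        raise ValueError("image and key_ref are both required")
    return ["cosign", "verify", "--key", key_ref, image]


if __name__ == "__main__":
    # Placeholders: substitute the commit hash and image tag you actually run.
    key = pinned_key_url("BerriAI/litellm", "<commit-hash>")
    cmd = build_cosign_verify_command("ghcr.io/berriai/litellm:<tag>", key)
    # Print the command for review instead of executing it.
    print(shlex.join(cmd))
```

Pinning the key URL to a commit hash rather than a branch name means a later change to the branch cannot silently swap the key used for verification.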
✨ New Features
- Added support for Gemini 3.5 Flash (Day 0 support).
- Added support for Gemini managed agents.
- Added gemini-3.1-flash-lite model cost map.
- Added user_email and user_alias to Prometheus user budget metrics.
- Added Interactions API endpoint to playground with SSE streaming.
- Propagate team_id and team_alias to all child OTEL spans.
- Added native MCP OAuth support for Cursor.
- Migrated to Google Interactions API steps schema (May 2026).
- Added create parity for team passthrough routes and fixed edit loading in the UI.
- Added granian as an ASGI-compliant web server for more stable throughput.
🐛 Bug Fixes
- Proxy admins can now be gated for team allowed_passthrough_routes.
- Fixed Bedrock/Cohere to send embedding_types as JSON array, not string.
- Fixed caching to replay openai/responses bridge cache hits as chat streams.
- Sanitized batch metadata in Bedrock to prevent Pydantic ValidationError.
- Fixed DeepSeek to use the native /anthropic/v1/messages endpoint and sanitize tools.
- Fixed proxy to decode bytes and pass-through SSE for Google-native streamGenerateContent.
- Replaced shut-down gpt-4o-audio-preview with gpt-audio-1.5 in tests.
- Seeded Redis counter via SET NX in spend_counter to prevent cross-pod double-seed.
- Normalized batch file IDs before ManagedObjectTable write in proxy.
- Used forwarded model_id for native Azure container IDs in router.
- Restored log filter loading indicator in UI.
- Fixed JWT on tools/list and REST tools/call server resolution in MCP.
- Fixed Vertex AI to omit function_call id on Vertex Gemini 3.5+ tool turns.
- Fixed interactions to never drop streamed text deltas; always emit terminal completion.
- Exposed Prisma idle and connect timeouts and extra DB URL params in proxy.
- Serialized guardrail_response to JSON in OTEL traces.
- Hydrated wildcard discovery credentials in proxy.
- Stripped `context_management` from request body for vertex_gemma.
- Recalculated cost after router retry failures.
- Emitted guardrail span on violation, surfacing status + categories in OTEL.
- Tolerated transient HTTP 500 responses in the Google Maps grounding test for vertex_ai.
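The Redis seeding fix in the list above relies on `SET ... NX`, which writes a key only when it does not already exist. The sketch below illustrates that pattern and is not the project's actual `spend_counter` code: `FakeRedis` is a hypothetical in-memory stand-in that mimics the `set(..., nx=True)` and `incrbyfloat` calls of redis-py, and `seed_and_increment` and the key name are likewise invented for illustration.

```python
class FakeRedis:
    """In-memory stand-in implementing only the calls used below."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, nx=False):
        # Like redis-py: with nx=True, return None if the key already exists.
        if nx and key in self._data:
            return None
        self._data[key] = str(value)
        return True

    def incrbyfloat(self, key, amount):
        new_value = float(self._data.get(key, 0)) + amount
        self._data[key] = str(new_value)
        return new_value


def seed_and_increment(client, key, db_spend, delta):
    """Seed the counter from the database value at most once, then add delta."""
    # SET NX is a no-op when another pod has already seeded the key,
    # so a late seed cannot overwrite increments recorded in between.
    client.set(key, db_spend, nx=True)
    return client.incrbyfloat(key, delta)


if __name__ == "__main__":
    shared = FakeRedis()
    # Two "pods" read the same database spend (10.0) and report new spend.
    print(seed_and_increment(shared, "spend:key-1", 10.0, 1.5))
    print(seed_and_increment(shared, "spend:key-1", 10.0, 2.0))
```

With a plain `SET` in place of `SET NX`, the second pod's seed would reset the counter to the database value and discard the first pod's increment.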
Affected Symbols
bedrock/cohere, openai/responses, bedrock, deepseek, /anthropic/v1/messages, Google-native streamGenerateContent, bedrock/sagemaker, gpt-4o-audio-preview, gpt-audio-1.5, Redis, batch file IDs, ManagedObjectTable, Azure container IDs, MCP, Vertex Gemini 3.5+, Prisma, OTEL traces, vertex_gemma, router retry failures, guardrail_response, wildcard discovery credentials, google maps grounding test