v1.83.8-nightly
📦 litellm
✨ 10 features · 🐛 20 fixes · 🔧 8 symbols
Summary
This release introduces Docker image signing via cosign for verification and adds new features like BM25-based prompt compression (litellm.compress()) and advisor tool orchestration. Numerous bug fixes address issues across S3 signing, caching, UI redirects, and provider integrations like Vertex AI and Dashscope.
Migration Steps
- If you relied on JWT auth setting key_alias=user_id for Prometheus metrics, note that this behavior has been reverted.
- If you send embedding requests to OpenAI, note that the fix omitting a null encoding_format was applied and subsequently reverted; review both changes if you encountered issues.
✨ New Features
- Introduced Docker image signing using cosign for enhanced security verification.
- Added 7m and 10m latency histogram buckets for Prometheus metrics.
- Dashscope now preserves cache_control for explicit prompt caching.
- Exposed reasoning effort fields in get_model_info and added support for together_ai/gpt-oss-120b.
- Added PromptGuard guardrail integration.
- Implemented advisor tool orchestration loop for non-Anthropic providers.
- Expanded wandb model offerings to include kimi-k2.5 and minimax-m2.5.
- Added litellm.compress() for BM25-based prompt compression with retrieval tool.
- Implemented per-team opt-out for specific global guardrails.
- UI: Allowed editing router settings after team creation.
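
For readers unfamiliar with BM25-based prompt compression, the general idea can be sketched as follows: split the context into chunks, score each chunk against the query with Okapi BM25, and keep only the top-scoring chunks. This is a minimal illustration of the technique only. The function names `bm25_scores` and `compress_prompt`, the tokenizer, and the `k1`/`b` defaults are assumptions made for this sketch; it does not show the signature or behavior of litellm.compress().

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, chunks, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per chunk (hypothetical helper)."""
    docs = [tokenize(c) for c in chunks]
    n = len(docs)
    if n == 0:
        return []
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of chunks that contain each term.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def compress_prompt(query, chunks, keep=2):
    """Keep the `keep` highest-scoring chunks, in their original order."""
    scores = bm25_scores(query, chunks)
    top = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)[:keep]
    return [chunks[i] for i in sorted(top)]
```

Chunks that share no terms with the query score zero and are dropped first; a real implementation would likely pair this with a retrieval tool so that dropped chunks can still be fetched on demand.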
🐛 Bug Fixes
- Fixed is_tool_name_prefixed to validate against known server prefixes.
- Fixed S3 v2 requests to use prepared URL for SigV4-signed requests.
- Fixed a 'multiple values' TypeError in get_cache_key related to caching.
- Fixed presidio integration to use correct text positions in anonymize_text.
- Resolved UI login redirect loop when a reverse proxy adds HttpOnly to cookies.
- Reverted setting key_alias=user_id in JWT auth for Prometheus metrics.
- Fixed Vertex AI to normalize Gemini finish_reason enum through map_finis…
- Removed leading space from license public_key.pem.
- Fixed cache invalidation double-hashing the token during bulk update and key rotation.
- Fixed model_max_budget being silently broken for routed models.
- Fixed embedding requests to omit a null encoding_format for OpenAI requests; this change was subsequently reverted.
- Aligned budget reset times for legacy entities (Team Members, End Users) with standardized calendar.
- Tightened handling of environment references in request parameters.
- Gated post-custom-auth DB lookups behind an opt-in flag for security.
- Fixed blog dark mode where text was invisible on a dark background.
- Aligned /spend/logs filter handling with user scoping.
- Fixed UI to pre-select backend default for boolean guardrail provider fields.
- Isolated logs team filter dropdown from root teams state bleed.
- Fixed cost-map entry for vertex qwen3-235b-a22b-instruct-2507-maas by adding us-south1 region.
- Fixed field-level checks in user and key update endpoints.
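
As background on the 'multiple values' TypeError mentioned in the caching fix above, this class of error typically arises in Python when an argument is passed explicitly and also appears in an unpacked keyword dictionary. The sketch below reproduces the general pattern and one common remedy. The `get_cache_key` function here is a hypothetical stand-in with an assumed signature, not LiteLLM's actual implementation, and `call_unsafe`/`call_safe` are illustrative names.

```python
def get_cache_key(model, **kwargs):
    """Hypothetical stand-in; not LiteLLM's real signature."""
    parts = [f"model={model}"] + [f"{k}={kwargs[k]}" for k in sorted(kwargs)]
    return "|".join(parts)


def call_unsafe(model, params):
    # Raises TypeError when `params` also contains a "model" entry,
    # because `model` is then supplied twice.
    return get_cache_key(model, **params)


def call_safe(model, params):
    # Drop the duplicate key before unpacking so the explicit argument wins.
    cleaned = {k: v for k, v in params.items() if k != "model"}
    return get_cache_key(model, **cleaned)
```

Filtering the dictionary before unpacking is only one option; passing everything through a single keyword dictionary avoids the duplication as well.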