Changelog

v1.81.7-nightly

📦 litellm
✨ 16 features · 🐛 37 fixes · 🔧 33 symbols

Summary

This release focuses heavily on bug fixes across various providers (Gemini, AWS, Azure, Anthropic) and significant enhancements to the UI, including new usage reporting features and improved search capabilities. New models and integrations, such as the Claude Agent SDK, were also added.

Migration Steps

  1. If running LiteLLM in a non-root Docker container, run `prisma generate` as the `nobody` user.
  2. If using Anthropic models on Vertex AI, ensure image URLs in tool messages are converted to base64.
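Recent LiteLLM versions perform the base64 conversion for you; if you build tool messages by hand, the transformation can be sketched as below. The helper name is illustrative, and the block shape follows Anthropic's documented base64 image content format; fetching the bytes from the original URL is left to the caller.

```python
import base64

def image_url_to_base64_block(image_bytes: bytes, media_type: str = "image/png") -> dict:
    """Build an Anthropic-style base64 image content block from raw image bytes."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            # Anthropic expects the payload as a base64-encoded string, not raw bytes.
            "data": base64.b64encode(image_bytes).decode("ascii"),
        },
    }
```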

✨ New Features

  • Added /list endpoint to the Search API to list existing search tools in the router.
  • Implemented a Reusable Table Sort Component in the UI.
  • Added Error message search functionality to UI spend logs.
  • Added permission management for users and teams in LiteLLM Vector Stores.
  • Added new OpenRouter models: `xiaomi/mimo-v2-flash`, `z-ai/glm-4.…`.
  • Added OpenRouter Kimi K2.5 model support.
  • Added usage export breakdown by Teams and Keys in the UI.
  • Added /openai_passthrough route for OpenAI passthrough requests.
  • Added `/delete` endpoint support for Gemini.
  • Added cost tracking and usage object in `aretrieve_batch` call type.
  • Added usage breakdown by Model Per Key in the UI.
  • Added mock client factory pattern and mock support for PostHog, Helicone, and Braintrust integrations.
  • Added Realtime API benchmarks.
  • Added `async_post_call_response_headers_hook` to CustomLogger.
  • Added LiteLLM x Claude Agent SDK Integration.
  • Added cookbook section for using Claude Agent SDK + MCPs with LiteLLM.
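The mock client factory pattern mentioned above (used for the PostHog, Helicone, and Braintrust integrations) can be approximated in plain Python. The class and method names below are illustrative, not LiteLLM's actual API; the point is that tests receive one stable mock double per integration instead of a real network client.

```python
from unittest.mock import MagicMock

class MockClientFactory:
    """Illustrative sketch: hand out one MagicMock per integration name
    so tests can assert on calls without hitting real logging backends."""

    def __init__(self):
        self._mocks: dict[str, MagicMock] = {}

    def get_client(self, integration: str) -> MagicMock:
        # Reuse the same mock per integration so assertions see every call
        # made anywhere in the code under test.
        if integration not in self._mocks:
            self._mocks[integration] = MagicMock(name=f"{integration}_client")
        return self._mocks[integration]

factory = MockClientFactory()
client = factory.get_client("posthog")
client.capture(event="llm_call", properties={"model": "gpt-4o"})
client.capture.assert_called_once()
```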

🐛 Bug Fixes

  • Fixed guardrails issues related to streaming-response regex.
  • Fixed migration issue and stabilized image.
  • Filtered unsupported beta headers for AWS Bedrock Invoke API.
  • Allowed `tool_choice` for Azure GPT-5 chat models.
  • Fixed tool usage issue with Anthropic (#19800).
  • Ensured BadRequestError is inspected after all other policy types.
  • Used local tiktoken cache during lazy loading.
  • Subtracted implicit cached tokens from `text_tokens` for correct Gemini cost calculation.
  • Fixed Prompt Studio history to correctly load tools and system messages.
  • Fixed Gemini `gemini-robotics-er-1.5-preview` entry.
  • Converted image URLs to base64 in tool messages for Anthropic on Vertex AI.
  • Fixed router search tools v2 implementation.
  • Fixed `stream_chunk_builder` to preserve images from streaming chunks.
  • Added `libsndfile` to the main Dockerfile for ARM64 audio processing.
  • Added `datadog_llm_observability` to allowed services in `/health/services`.
  • Prevented provider-prefixed model leaks in the proxy.
  • Routed `hosted_vllm` through `base_llm_http_handler` to support `ssl_verify`.
  • Fixed model map formatting in validation test.
  • Fixed model map path in validation test.
  • Fixed `litellm_fix_robotic_model_map_entry`.
  • Fixed sorting for `/v2/model/info`.
  • Fixed `error_code` in Spend Logs metadata.
  • Fixed CI/CD issues and adjusted release-day processes.
  • Fixed CI pipeline router coverage failure.
  • Resolved high CPU when `router_settings` is in DB by avoiding `REGISTRY.collect()` in `PrometheusServicesLogger`.
  • Fixed support for file retrieval in GoogleAIStudioFilesHandle for Gemini.
  • Fixed extraction of input tokens details as a dictionary in `ResponseAPILoggingUtils`.
  • Fixed `max_input_tokens` for `gpt-5.2-codex`.
  • Fixed Batch and File user level permissions.
  • Fixed `aspectRatio` mapping in image edit.
  • Fixed `vllm` embedding format.
  • Removed unsupported header `prompt-caching-scope-2026-01-05` for Vertex AI.
  • Added event-driven coordination for global spend query to prevent cache stampede.
  • Ran `prisma generate` as nobody user in non-root container.
  • Added routing of XAI chat completions to responses when web search options are present.
  • Added disable flag for Anthropic Gemini cache translation.
  • Reverted logs view commits.
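The cache-stampede fix above uses event-driven coordination so that only one caller runs the expensive global spend query while concurrent callers wait for its result. A generic single-flight sketch of that idea, using `asyncio.Event` (all names are illustrative, not the actual LiteLLM implementation):

```python
import asyncio

class SingleFlight:
    """Let one coroutine compute a value while concurrent callers wait
    for the same result instead of re-running the query (stampede guard)."""

    def __init__(self):
        self._event: asyncio.Event | None = None
        self._result = None

    async def run(self, coro_fn):
        if self._event is not None:
            # Another caller is already computing; wait and reuse its result.
            await self._event.wait()
            return self._result
        self._event = asyncio.Event()
        try:
            self._result = await coro_fn()
            return self._result
        finally:
            # Wake all waiters, then reset so the next batch recomputes.
            self._event.set()
            self._event = None

async def demo():
    calls = 0

    async def expensive_query():
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # simulate a slow DB query
        return {"spend": 42}

    sf = SingleFlight()
    results = await asyncio.gather(*(sf.run(expensive_query) for _ in range(5)))
    return calls, results

calls, results = asyncio.run(demo())
```

With five concurrent callers, only the first actually runs `expensive_query`; the other four block on the event and return the shared result.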

Affected Symbols