Change8

v1.82.1-focus-dev

📦 litellm
🐛 14 fixes · 🔧 8 symbols

Summary

This release is primarily bug fixes across routing, caching, provider integrations (Fireworks, SageMaker, Vertex AI, Bedrock), and the Responses API streaming bridge. Several provider-specific URL and request-body issues were also resolved.

🐛 Bug Fixes

  • Fixed handling of ResponseApplyPatchToolCall in the completion bridge.
  • Router now breaks the retry loop on non-retryable errors.
  • Fixed invalid OpenAPI schema for /spend/calculate and /credentials endpoints in the proxy.
  • Preserved usage/cached_tokens in Responses API streaming bridge.
  • Injected default_in_memory_ttl in DualCache async_set_cache and async_set_cache_pipeline.
  • Applied server root path to mapped passthrough route matching.
  • Merged parallel function_call items into a single assistant message in Responses API.
  • Handled month overflow in duration_in_seconds calculation for multi-month durations.
  • Used correct divisor when averaging TTFT (Time To First Token) in lowest-latency routing.
  • Stripped duplicate /v1 from the models endpoint URL for Fireworks provider.
  • Added role-assumption support for SageMaker embedding endpoints.
  • Stripped LiteLLM-internal keys from extra_body before merging to Gemini request for Vertex AI.
  • Preserved reasoning_effort summary field for Responses API when using OpenAI provider.
  • Populated completion_tokens_details in Responses API for Bedrock provider.
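The month-overflow fix above concerns a common bug class: computing a multi-month duration by naively incrementing the month field produces invalid dates (e.g. Jan 31 + 1 month) or wraps past December. A minimal sketch of the correct arithmetic, using only the standard library; `add_months` is a hypothetical helper for illustration, not litellm's actual implementation:

```python
import calendar
from datetime import date

def add_months(start: date, months: int) -> date:
    """Add `months` to `start`, carrying overflow into the year and
    clamping the day to the target month's length."""
    total = start.month - 1 + months          # zero-based month index
    year = start.year + total // 12           # carry whole years
    month = total % 12 + 1                    # back to 1..12
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

# Jan 31 + 1 month clamps to the end of February (leap year here):
# add_months(date(2024, 1, 31), 1) -> date(2024, 2, 29)
# Nov 15 + 3 months carries into the next year:
# add_months(date(2024, 11, 15), 3) -> date(2025, 2, 15)
```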
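The TTFT-averaging fix is an instance of the classic running-mean divisor bug: when folding a new latency sample into an average, the divisor must be the updated sample count, not the old one. A generic sketch, assuming a simple incremental mean; the function name is illustrative and not litellm's routing code:

```python
def update_ttft_average(avg: float, count: int, sample: float) -> tuple[float, int]:
    """Fold `sample` into a running mean; divide by the *new* count."""
    count += 1
    # incremental mean: avg_new = avg_old + (sample - avg_old) / count_new
    return avg + (sample - avg) / count, count
```

Dividing by the stale count instead would overweight each new sample and skew lowest-latency routing toward recently measured deployments.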
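The Fireworks models-endpoint fix belongs to the URL-normalization bug class: appending `/v1/models` to an `api_base` that already ends in `/v1` yields `/v1/v1/models`. A minimal sketch of the normalization, with a hypothetical helper name rather than litellm's actual code:

```python
def build_models_url(api_base: str) -> str:
    """Join `api_base` with the models path without duplicating '/v1'."""
    base = api_base.rstrip("/")
    suffix = "/models" if base.endswith("/v1") else "/v1/models"
    return base + suffix
```

Both `https://api.fireworks.ai/inference` and `https://api.fireworks.ai/inference/v1` then resolve to the same `/v1/models` URL.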

Affected Symbols