
v2.26.0

Breaking Changes
📦 localai: 5 breaking changes · 9 new features · 🐛 5 bug fixes · 🔧 9 affected symbols

Summary

LocalAI v2.26.0 introduces significant backend consolidation, dropping deprecated TTS and LLM backends in favor of modern alternatives like GGUF and new TTS options. This release also adds support for Nvidia L4T devices and introduces lazy grammar features for llama.cpp.

⚠️ Breaking Changes

  • Several backends have been dropped and replaced for improved performance and compatibility.
  • The `Vall-e-x` and `Openvoice` backends have been dropped, as they are superseded by Kokoro and OuteTTS.
  • The `stablediffusion-NCN` backend was replaced with the `stablediffusion-ggml` implementation.
  • The deprecated `llama-ggml` backend has been dropped; LocalAI now only supports GGUF models.
  • Mamba, Transformers-Musicgen, and Sentencetransformers functionalities have been merged into the `transformers` backend. Configuration files referencing these old backend names might need updating to use `transformers`.

Migration Steps

  1. If you were using `Vall-e-x` or `Openvoice`, migrate to the Kokoro or OuteTTS backends.
  2. If you were using `stablediffusion-NCN`, switch to using `stablediffusion-ggml`.
  3. If you were using the deprecated `llama-ggml` backend, ensure your models are in GGUF format and update configurations to use the GGUF support.
  4. Review configurations for Mamba, Transformers-Musicgen, and Sentencetransformers; they may need to be aliased or updated to use the `transformers` backend explicitly.
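For step 4, the change is typically a one-line edit to the model's YAML file. The sketch below is illustrative, not LocalAI's exact schema: `my-embedder` and the model name are placeholders, and your config may carry additional keys that can stay as they are.

```yaml
# Before (deprecated backend name):
#   backend: sentencetransformers
# After: point the same model at the merged transformers backend.
name: my-embedder
backend: transformers
parameters:
  model: all-MiniLM-L6-v2
```

The same substitution applies to configs that previously named `mamba` or `transformers-musicgen` as their backend.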

✨ New Features

  • Added support for Nvidia L4T devices (e.g., Nvidia AGX Orin) via specific container images.
  • Introduced lazy grammar triggers for llama.cpp, enabling precise JSON generation based on token triggers.
  • Added function argument parsing using named regular expressions.
  • New TTS backends added: Kokoro and OuteTTS (with voice cloning).
  • New backend added: Fast-Whisper for faster whisper model inference.
  • Diffusers updated to support Sana pipelines and image generation option overrides.
  • Added Machine Tag and Inference Timing for performance tracking.
  • Introduced tokenization support for llama.cpp.
  • Bundled support for CPUs supporting the AVX512 instruction set.
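To illustrate the idea behind function argument parsing via named regular expressions, here is a minimal self-contained sketch. The `<tool_call>` wrapper, the pattern, and the `parse_function_call` helper are hypothetical stand-ins, not LocalAI's actual implementation; the point is that named groups (`?P<name>`, `?P<args>`) let a single configurable pattern pull structured fields out of raw model output.

```python
import re

# Hypothetical pattern: named groups capture the function name and the
# raw argument string from a tool-call span in the model's output.
PATTERN = re.compile(
    r"<tool_call>\s*(?P<name>\w+)\((?P<args>.*?)\)\s*</tool_call>",
    re.DOTALL,
)

def parse_function_call(text: str) -> dict:
    """Extract a function name and raw argument string from model output."""
    match = PATTERN.search(text)
    if match is None:
        return {}
    return {"name": match.group("name"), "args": match.group("args")}

print(parse_function_call('<tool_call>get_weather(city="Rome")</tool_call>'))
```

Because the fields are addressed by group name rather than position, the pattern itself can be swapped per model template without touching the parsing code.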

🐛 Bug Fixes

  • Multiple fixes to improve stability.
  • Enabled SYCL support for `stablediffusion-ggml`.
  • OpenAI-compatible stop reasons are now returned consistently.
  • Improved context shift handling for llama.cpp.
  • Fixed the gallery store not returning model overrides and additional config.

🔧 Affected Symbols

`vall-e-x`, `Openvoice`, `stablediffusion-NCN`, `stablediffusion-ggml`, `llama-ggml`, `Mamba`, `Transformers-Musicgen`, `Sentencetransformers`, `transformers`