v3.11.0
⚠ 2 breaking · ✨ 10 features · 🐛 7 fixes · 🔧 12 symbols
Summary
LocalAI 3.11.0 is a major update focused on audio and multimodal capabilities. It introduces Realtime Audio Conversations and expands the ASR and TTS backends. It also removes the unmaintained Bark backend and the deprecated ExLlama backend.
⚠️ Breaking Changes
- The ExLlama backend has been removed because it is deprecated in favor of newer loaders like ExLlamaV2 or llama.cpp.
- The Bark backend has been removed because the upstream project is unmaintained; users should switch to the new TTS alternatives.
Migration Steps
- If you were using the ExLlama backend, migrate to ExLlamaV2 or llama.cpp.
- If you were using the Bark TTS backend, migrate to one of the new TTS alternatives (e.g., VoxCPM, Qwen-TTS, Piper).
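Before upgrading, it can help to find which model configurations still name a removed backend. The following is a minimal sketch, assuming model configs are YAML text with a top-level `backend:` key and that the removed backends are identified as `exllama` and `bark`. Those identifiers and the helper `find_removed_backends` are illustrative assumptions, not taken from the release notes.

```python
import re

# Backend identifiers assumed for illustration; check the names your configs use.
REMOVED_BACKENDS = {"exllama", "bark"}

# Matches lines such as `backend: exllama` or `backend: "bark"  # comment`.
_BACKEND_LINE = re.compile(r"^\s*backend:\s*[\"']?([\w.-]+)[\"']?\s*(?:#.*)?$")


def find_removed_backends(config_text):
    """Return (line_number, backend) pairs for lines naming a removed backend."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        match = _BACKEND_LINE.match(line)
        # Compare the whole identifier so that e.g. `exllama2` is not flagged.
        if match and match.group(1).lower() in REMOVED_BACKENDS:
            hits.append((number, match.group(1)))
    return hits
```

Running the helper over the text of each config file gives the lines to edit; an empty list means that file needs no change for this release.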
✨ New Features
- Introduced native support for Realtime Audio Conversations, enabling fluid, low-latency voice interaction compatible with standard client implementations.
- Added a dedicated Web UI for music generation using the new Ace-Step (MusicGen) backend.
- Expanded ASR capabilities with four new backends: WhisperX (with Speaker Diarization), VibeVoice, Qwen-ASR, and Nvidia NeMo.
- Text-to-Speech (TTS) now supports a streaming mode for lower-latency responses (currently VoxCPM only).
- Added support for the vLLM Omni backend for high-performance inference.
- Added native speaker diarization (identifying which speaker said what) through the WhisperX backend.
- Expanded build support for CUDA 12/13, L4T (Jetson), SBSA, and improved Metal (Apple Silicon) integration using MLX backends.
- Added support for the VoxCPM TTS backend.
- Added support for Qwen-TTS models.
- Added most remaining Piper voices from Hugging Face to the gallery.
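As a rough illustration of using one of the new TTS backends, the sketch below builds a request for the OpenAI-style `/v1/audio/speech` endpoint and writes the response to disk chunk by chunk. The base URL, the model name `voxcpm`, and both helper functions are placeholders assumed for this example. Whether audio actually arrives incrementally depends on the backend and on how streaming is enabled, which these notes do not specify.

```python
import json
import urllib.request

# Assumptions: a LocalAI instance on localhost:8080 and an installed model
# named "voxcpm". Both are placeholders, not taken from the release notes.
BASE_URL = "http://localhost:8080"


def build_speech_request(text, model="voxcpm", base_url=BASE_URL):
    """Build a POST request for the OpenAI-style /v1/audio/speech endpoint."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(text, out_path, chunk_size=4096, **kwargs):
    """Send the request and write the audio to out_path as chunks arrive."""
    request = build_speech_request(text, **kwargs)
    total = 0
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            total += len(chunk)
    return total  # number of audio bytes written
```

Calling `save_speech("Hello from LocalAI", "hello.wav")` against a running server would write the returned audio to `hello.wav`; the output format depends on the server and model configuration.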
🐛 Bug Fixes
- Fixed UI issue where the selected image model was not displayed correctly.
- Fixed token count calculation to correctly account for reasoning in the UI.
- Dropped redundant GGUF VRAM estimation logic, relying on more accurate internal measurements.
- Fixed missing field in the initial OpenAI streaming response.
- Fixed realtime audio handling to include the noAction function in the prompt template and correctly handle tool_choice.
- Fixed filtering of GGUF and GGML files from the model list.
- Fixed a Makefile issue by removing a stray DEFAULT_GOAL setting related to qwen-asr.