v3.12.0

📅 Feb 20, 2026📦 localaiView on GitHub →

✨ 7 features🐛 16 fixes🔧 5 symbols

Summary

LocalAI 3.12.0 introduces significant multi-modal capabilities, a new Voxtral TTS backend, and multi-GPU support for Diffusers. This release also focuses heavily on stabilizing realtime interactions and patching several security and stability bugs.

Migration Steps

If you rely on content fetching endpoints that use external URLs, ensure they are properly validated to prevent SSRF, as validation was added.
Review realtime configurations if you were experiencing issues with voice handling or models without backends.

✨ New Features

Introduced multi-modal support, allowing text, images, and audio to be sent in conversations.
Added Voxtral as a new high-quality text-to-speech backend.
Improved Diffusers performance with experimental support for multi-GPU usage.
Enhanced compatibility and performance for legacy CPU architectures using the stablediffusion-ggml backend.
Updated the UI with a new theme (offering dark/light variants) and improved navigation.
Added experimental support for sd_embed-style prompt embedding in Diffusers.
Added support for sending text, image, and audio conversation items in realtime interactions.

🐛 Bug Fixes

Fixed an SSRF vulnerability by validating URLs in content fetching endpoints.
Resolved issues in realtime handling related to user-provided voice and allowing pipeline models to function without a backend.
Fixed sampling and websocket locking issues in realtime processing.
Corrected the sending of proper image data to the backend in realtime.
Prevented excessive logging during capability detection.
Pinned setuptools version for voxcpm to avoid build issues.
Fixed llama-cpp tensor_buft_override buffer population for correct fit calculations.
Pinned neutts-air to a known working commit.
Improved watchdown logic.
Fixed an issue where parameters were not passed correctly when using the embedded template in llama-cpp.
Improved support for thinking models and setting model parameters in realtime.
Limited buffer sizes in realtime to prevent potential Denial of Service (DoS) attacks.
Improved the UI view on mobile devices.
Fixed an issue where sd_embed was not always available in diffusers.
Prevented tracking models if they do not exist.
Fixed an issue where YAML v3 was not used correctly, preventing map merging with incompatible keys in the model gallery.

Affected Symbols

stablediffusion-ggml backend llama-cpp neutts-air diffusers watchdown