v3.12.0
📦 localaiView on GitHub →
✨ 7 features🐛 16 fixes🔧 5 symbols
Summary
LocalAI 3.12.0 introduces significant multi-modal capabilities, a new Voxtral TTS backend, and multi-GPU support for Diffusers. This release also focuses heavily on stabilizing realtime interactions and patching several security and stability bugs.
Migration Steps
- If you rely on content fetching endpoints that use external URLs, ensure they are properly validated to prevent SSRF, as validation was added.
- Review realtime configurations if you were experiencing issues with voice handling or models without backends.
✨ New Features
- Introduced multi-modal support, allowing text, images, and audio to be sent in conversations.
- Added Voxtral as a new high-quality text-to-speech backend.
- Improved Diffusers performance with experimental support for multi-GPU usage.
- Enhanced compatibility and performance for legacy CPU architectures using the stablediffusion-ggml backend.
- Updated the UI with a new theme (offering dark/light variants) and improved navigation.
- Added experimental support for sd_embed-style prompt embedding in Diffusers.
- Added support for sending text, image, and audio conversation items in realtime interactions.
🐛 Bug Fixes
- Fixed an SSRF vulnerability by validating URLs in content fetching endpoints.
- Resolved issues in realtime handling related to user-provided voice and allowing pipeline models to function without a backend.
- Fixed sampling and websocket locking issues in realtime processing.
- Corrected the sending of proper image data to the backend in realtime.
- Prevented excessive logging during capability detection.
- Pinned setuptools version for voxcpm to avoid build issues.
- Fixed llama-cpp tensor_buft_override buffer population for correct fit calculations.
- Pinned neutts-air to a known working commit.
- Improved watchdown logic.
- Fixed an issue where parameters were not passed correctly when using the embedded template in llama-cpp.
- Improved support for thinking models and setting model parameters in realtime.
- Limited buffer sizes in realtime to prevent potential Denial of Service (DoS) attacks.
- Improved the UI view on mobile devices.
- Fixed an issue where sd_embed was not always available in diffusers.
- Prevented tracking models if they do not exist.
- Fixed an issue where YAML v3 was not used correctly, preventing map merging with incompatible keys in the model gallery.