Changelog

v3.7.0

📦 localai · View on GitHub →
✨ 10 features · 🐛 7 fixes · 🔧 6 symbols

Summary

LocalAI 3.7.0 introduces robust Agentic MCP support integrated into the WebUI and a new high-quality neutts TTS backend. This release also features a major WebUI overhaul and numerous stability fixes, particularly around tool usage and JSON schema handling.

Migration Steps

  1. If you use the llama.cpp backend, upgrade your LocalAI installation: llama.cpp has been updated to the latest version to add support for Qwen 3 VL.

✨ New Features

  • Introduced Agentic MCP support with full WebUI integration, allowing agents to use external tools like web search and code execution via the standard /v1/chat/completions endpoint.
  • Added a brand-new neutts TTS backend powered by Neuphonic for high-quality, low-latency speech generation.
  • Implemented long-form TTS chunking for chatterbox, generating natural-sounding long audio by intelligently splitting the input text.
  • Completely overhauled the WebUI, resulting in a faster, cleaner interface with real-time updates and full YAML model control.
  • Added support for the OpenAI-compatible /v1/videos endpoint for text-to-video generation.
  • Enhanced Whisper.cpp compatibility by providing optimized CPU variants (avx, avx2, avx512, fallback) to prevent 'illegal instruction' crashes.
  • Implemented fuzzy and case-insensitive model search in the gallery (e.g., searching for 'gema' finds 'gemma').
  • Enabled direct import, editing, and deletion of models via clean YAML configuration in the WebUI.
  • Added support for Qwen 3 VL models via llama.cpp/gguf.
  • Introduced an autonomous CI agent that scans Hugging Face for new models and automatically suggests updates to the gallery via PRs.
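Agents are driven through the standard chat-completions request shape. As a minimal sketch, a request that exposes a single tool to the model could be built like this; the model name and the `web_search` tool definition are illustrative assumptions, not part of this release:

```python
import json

# Hypothetical body for POST /v1/chat/completions with one tool exposed.
# The model name and tool schema below are placeholders for illustration.
payload = {
    "model": "qwen3-vl",  # placeholder; use any installed model
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "web_search",  # hypothetical tool name
                "description": "Search the web for up-to-date information.",
                "parameters": {
                    "type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"],
                },
            },
        }
    ],
}

body = json.dumps(payload)  # serialized request body for the HTTP POST
```

The server decides whether to answer directly or emit a tool call; the response follows the usual OpenAI-compatible `tool_calls` convention.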

🐛 Bug Fixes

  • Fixed critical crashes, deadlocks, session-event handling, and JSON schema panics related to OpenAI API compliance.
  • Fixed WebUI crash on model load caused by 'can't evaluate field Name in type string' error when models lacked config files.
  • Guarded against race conditions during model loading/idle checks, resolving deadlocks under load.
  • Fixed issues where tools with missing or empty parameters were not handled gracefully.
  • Ensured that when 'strict_mode: true' is set, the model is forced to select a tool, preventing silent skips.
  • Fixed crashes related to handling nullable JSON schemas (e.g., '"type": ["string", "null"]').
  • Ensured that strict mode and grammar rules interact correctly, returning clear JSON errors instead of crashing when a required tool definition is invalid.
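The nullable-schema fix above concerns the union form `"type": ["string", "null"]`, which tool definitions produced by OpenAI clients commonly use. A minimal sketch of tolerant type resolution (the helper name and logic are ours, not LocalAI's internals):

```python
def resolve_type(schema: dict):
    """Return the first non-null JSON-schema type, or None if absent.

    Accepts both the plain form ('type': 'string') and the nullable
    union form ('type': ['string', 'null']) without raising.
    """
    t = schema.get("type")
    if t is None:
        return None  # schema omits 'type' entirely
    if isinstance(t, str):
        return t
    # List form, e.g. ["string", "null"]: pick the first non-null entry.
    for candidate in t:
        if candidate != "null":
            return candidate
    return None

print(resolve_type({"type": ["string", "null"]}))  # string
```

A guard like this lets grammar generation degrade to a clear error or a default instead of panicking on a list-valued `type`.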

🔧 Affected Symbols

  • whisper.cpp (CPU variants)
  • llama.cpp (updated for Qwen 3 VL)
  • WebUI (overhauled)
  • POST /mcp/v1/chat/completions (new endpoint)
  • POST /v1/videos (new endpoint)
  • Chat view (MCP toggle added)
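For reference, a text-to-video request to the new endpoint could be shaped as follows; the field names mirror the OpenAI-compatible convention and the model name is an assumption, so check your installed models before use:

```python
import json

# Hypothetical body for POST /v1/videos; the model name is a placeholder
# and the field names are assumed from the OpenAI-compatible convention.
request = {
    "model": "video-model",  # placeholder; use an installed video model
    "prompt": "A timelapse of clouds over a mountain range",
}

body = json.dumps(request)  # serialized request body for the HTTP POST
```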