v2.27.0
📦 localai · View on GitHub →
✨ 11 features · 🐛 7 fixes · 🔧 4 symbols
Summary
LocalAI v2.27.0 introduces a major WebUI overhaul, significant performance enhancements for GGUF and VLLM, and updates the default models shipped in AIO container images.
Migration Steps
- If using AIO images, note the new default models included in v2.27.0; a quick way to verify them is shown after this list.
- Review Docker run commands if you rely on specific image tags; 'latest' and 'latest-aio-cpu' have updated defaults.
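A quick way to verify the new defaults is to query the OpenAI-compatible /v1/models endpoint of a running instance. This is a minimal sketch assuming a local LocalAI server at http://localhost:8080 with no API key configured; adjust both for your deployment:

```python
# List the models an upgraded AIO image exposes via the OpenAI-compatible API.
# BASE_URL and the absence of an API key are assumptions for a default local setup.
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumed default LocalAI address

with urllib.request.urlopen(f"{BASE_URL}/v1/models") as resp:
    models = json.load(resp)

# The endpoint returns an OpenAI-style list: {"object": "list", "data": [...]}.
for entry in models.get("data", []):
    print(entry.get("id"))
```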
✨ New Features
- Complete WebUI Redesign with a fresh, modern interface and enhanced navigation.
- Model Gallery Improvements including better pagination and filtering.
- AIO Image Updates with new default models shipped (e.g., llama3.1, granite-embeddings, minicpm for CPU).
- Chat Interface Enhancements: cleaner layout, model-specific UI tweaks, and custom reply prefixes.
- Smart Model Detection: Automatically links to relevant model documentation based on usage.
- Performance Tweaks: GGUF models now auto-detect context size.
- Llama.cpp now handles batch embeddings and SIGTERM gracefully.
- VLLM Configuration Boost: Added options to disable logging, set dtype, and enforce per-prompt media limits.
- Support for new model architectures: Gemma 3, Mistral, Deepseek.
- Ability to specify a reply prefix for chat responses (see the sketch after this list).
- UI improvements to the index and models pages.
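The reply prefix, like the other chat-related options, is configured on the server side, so existing OpenAI-compatible clients keep working unchanged. As a minimal sketch, assuming a local instance at http://localhost:8080 and a hypothetical model named llama-3.1, a chat request looks like this:

```python
# Minimal chat-completion call against LocalAI's OpenAI-compatible API.
# The base URL and the model name "llama-3.1" are assumptions; any configured
# reply prefix is applied by the server, so no client-side change is needed.
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumed default LocalAI address

payload = {
    "model": "llama-3.1",  # hypothetical; use a name returned by /v1/models
    "messages": [{"role": "user", "content": "Summarize what a GGUF file is."}],
}

req = urllib.request.Request(
    f"{BASE_URL}/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.load(resp)

print(reply["choices"][0]["message"]["content"])
```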
🐛 Bug Fixes
- Resolved model icon display inconsistencies.
- Ensured proper handling of generated artifacts without API key restrictions.
- Optimized CLIP offloading (no longer implies GPU offload by default).
- Fixed Llama.cpp process termination handling (SIGTERM).
- Fixed the initialization order so the llama-cpp-avx512 variant is tried before the avx2 variant.
- Pinned the transformers version to fix the coqui backend.
- Unified use-case identification for models.
🔧 Affected Symbols
llama.cpp, CLIP, coqui, transformers