LocalAI
🤖 The free, Open Source alternative to OpenAI, Claude, and others. Self-hosted and local-first. A drop-in replacement for OpenAI, running on consumer-grade hardware; no GPU required. Runs gguf, transformers, diffusers, and many more. Features: text generation, MCP, audio, video, images, voice cloning, and distributed, P2P, decentralized inference.
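Because LocalAI is a drop-in replacement for the OpenAI API, existing OpenAI client code typically only needs its base URL changed. A minimal sketch in Python, assuming a LocalAI instance on its default port 8080 and a model already installed locally (the model name below is a placeholder, not something shipped by default):

```python
from openai import OpenAI

# Point the standard OpenAI client at the local server; LocalAI exposes
# an OpenAI-compatible API, and by default no API key is enforced.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="my-local-model",  # placeholder: use a model configured in LocalAI
    messages=[{"role": "user", "content": "Say hello from LocalAI."}],
)
print(response.choices[0].message.content)
```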
Release History
v3.9.0 (Breaking, 2 fixes, 7 features): LocalAI 3.9.0 introduces significant stability and resource management features, including an Agent Jobs panel for scheduling and an automatic Smart Memory Reclaimer with LRU model eviction. This release also drops x86_64 Mac support and updates data storage paths.
v3.8.0 (6 fixes, 8 features): LocalAI 3.8.0 puts a major focus on user experience, with a universal model importer, a complete UI overhaul, and hot-reloadable system settings. It also significantly improves agentic workflows via live streaming and fixes critical OpenAI SSE compatibility issues.
v3.7.0 (7 fixes, 10 features): LocalAI 3.7.0 introduces robust Agentic MCP support integrated into the WebUI and a new high-quality neutts TTS backend. This release also features a major WebUI overhaul and numerous stability fixes, particularly around tool usage and JSON schema handling.
v3.6.0 (1 fix, 2 features): This release introduces support for Nvidia L4T devices and multilingual capabilities in chatterbox, alongside a fix for token limits in llama.cpp reranking models. It also includes numerous dependency updates, particularly for llama.cpp and whisper.cpp.
v3.5.4 (1 fix): This release primarily focuses on internal dependency updates for whisper.cpp and llama.cpp, along with a bug fix to standardize option checking across backends.
v3.5.3 (1 fix, 3 features): This release introduces several new models to the gallery and fixes a bug related to float detection in diffusers. It also updates the underlying llama.cpp dependency.
v3.5.2: This release reverts a dependency bump for CUDA images and updates the underlying llama.cpp commit, alongside documentation version updates.
v3.5.1 (7 fixes, 4 features): This release introduces a new launcher welcome page and support for the HF_ENDPOINT environment variable. It also includes numerous bug fixes and dependency updates, and adds support for diarization in Whisper models.
v3.5.0 (5 fixes, 14 features): LocalAI 3.5.0 significantly expands backend support with MLX and Purego rewrites of Whisper and Stablediffusion, and introduces an Alpha Launcher for easier management. This release also brings numerous WebUI enhancements and stability fixes across various hardware platforms.
v3.4.0 (1 fix, 7 features): LocalAI 3.4.0 significantly expands the platform with new TTS and AI backends (KittenTTS, kokoro, dia), enhanced WebUI image generation controls, and support for reasoning effort in chat completions (see the first sketch after this list). It also improves backend installation flexibility and fixes a llama.cpp rope defaulting issue.
v3.3.2 (1 fix, 2 features): This release introduces new flexibility for backend installation and configuration, alongside dependency updates and documentation improvements.
v3.3.1 (Breaking, 1 fix, 3 features): This minor release introduces support for Flux Kontext and Flux Krea for image editing and generation. It also resolves critical bugs related to Intel GPU images and renames the corresponding container image tags.
v3.3.0 (2 fixes, 3 features): LocalAI 3.3.0 introduces object detection capabilities via a new API and improves backend download reliability with defined mirrors. This release also includes various bug fixes across container images, backends, and installation scripts.
v3.2.3 (1 fix): This release primarily focuses on bug fixes, specifically addressing CUDA image tag naming, and includes an update to the ggml-org/llama.cpp dependency. Documentation for backend detection override was also added.
v3.2.2 (3 fixes, 1 feature): This release introduces mirror support for the backend gallery and includes several bug fixes related to string trimming, Vulkan image suffixes, and CI image capabilities.
v3.2.1 (2 fixes): This patch release primarily focuses on bug fixes related to installation and backend service communication, alongside updating several underlying C++ dependencies.
v3.2.0 (Breaking, 1 fix, 8 features): LocalAI 3.2.0 introduces a major architectural shift by separating inference backends from the core binary, resulting in a leaner application and enabling independent backend management. This release also adds automatic hardware detection for backend installation and expands model support significantly.
v3.1.1 (4 fixes, 1 feature): This patch release introduces automatic backend installation for models in the gallery and resolves several bugs related to GPU vendor identification and URI handling in the backend gallery.
v3.1.0 (Breaking, 1 fix, 3 features): LocalAI 3.1 introduces support for Gemma 3n models and streamlines the container image structure by removing bundled sources, significantly reducing image size. This release also features meta-packages for easier backend installation and highlights the integrated LocalAI ecosystem.
v3.0.0 (Breaking, 3 fixes, 9 features): LocalAI 3.0 introduces a major overhaul with the Backend Gallery for dynamic OCI-based backend management, adds WebSocket streaming, and enhances model capabilities with dynamic VRAM handling and multimodal support. This release marks a significant step towards a more modular and powerful local inference platform.
v2.29.0 (Breaking, 6 features): LocalAI v2.29.0 introduces a major overhaul to container images, slimming down defaults and introducing an `-extras` suffix for optional dependencies, alongside new model support and experimental video generation capabilities.
v2.28.0 (2 fixes, 5 features): LocalAI v2.28.0 introduces SYCL support for image generation, updates the diffusers library, and features significant enhancements around the newly reborn LocalAGI agent framework, which is now written in Go.
v2.27.0 (7 fixes, 11 features): LocalAI v2.27.0 introduces a major WebUI overhaul, significant performance enhancements for GGUF and VLLM, and updated default models shipped in the AIO container images.
v2.26.0 (Breaking, 5 fixes, 9 features): LocalAI v2.26.0 introduces significant backend consolidation, dropping deprecated TTS and LLM backends in favor of modern alternatives like GGUF and new TTS options. This release also adds support for Nvidia L4T devices and introduces lazy grammar features for llama.cpp.
v2.25.0 (2 fixes, 6 features): This release introduces several new features, including streaming token usage (see the second sketch after this list), GGUF template reading, and enhanced llama.cpp cache configuration. It also includes numerous updates to the model gallery and a fix for the llava clip patch.
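As referenced from the v3.4.0 entry above: reasoning effort in chat completions most likely follows the OpenAI `reasoning_effort` parameter. A hedged sketch, assuming LocalAI mirrors that field name and that a reasoning-capable model is installed (the model name is a placeholder):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="my-reasoning-model",  # placeholder name, not a real default
    messages=[{"role": "user", "content": "Outline a 3-step plan."}],
    # "low" | "medium" | "high", per the OpenAI chat-completions convention;
    # that LocalAI accepts this exact field is an assumption here.
    reasoning_effort="low",
)
print(response.choices[0].message.content)
```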
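And for the streaming token usage introduced in v2.25.0: in the OpenAI-compatible convention this is requested via `stream_options`, with the counts arriving in a final chunk. A sketch under the same assumptions (local instance on port 8080, placeholder model name):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

stream = client.chat.completions.create(
    model="my-local-model",  # placeholder: use a model configured in LocalAI
    messages=[{"role": "user", "content": "Say hello from LocalAI."}],
    stream=True,
    # Ask for a final chunk that carries the token usage counts.
    stream_options={"include_usage": True},
)
for chunk in stream:
    if chunk.choices:  # the usage-only chunk has an empty choices list
        print(chunk.choices[0].delta.content or "", end="", flush=True)
    if chunk.usage:
        print(f"\n[prompt={chunk.usage.prompt_tokens}, "
              f"completion={chunk.usage.completion_tokens}]")
```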