Change8

Ollama

AI & LLMs

Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.

Latest: v0.30.5-rc0100 releases8 breaking changes1 common errorsView on GitHub

Release History

v0.30.5-rc01 fix1 feature
17h ago

This release introduces documentation for Cline CLI integration and fixes an issue with Hermes installation on Windows, alongside an update to the underlying llama.cpp version.

v0.30.51 fix
17h ago

This release addresses a critical crash related to the gemma4:12b model and includes an integration fix for Hermes on Windows.

v0.30.41 fix
Jun 3, 2026

This release includes an update to the underlying llama.cpp version and fixes an issue related to cleaning up the llama-server process on Windows.

v0.30.4-rc01 fix
Jun 3, 2026

This release updates the underlying llama.cpp version and includes a fix for properly terminating the llama-server process on Windows during cleanup.

v0.30.31 feature
Jun 3, 2026

This release introduces support for the new gemma4-12b model within the models module.

v0.30.2-rc07 fixes3 features
Jun 3, 2026

This release introduces support for Cline CLI and Qwen integration, alongside various stability improvements and fixes related to llama-server and model loading.

v0.30.27 fixes3 features
Jun 3, 2026

This release introduces support for Cline CLI and Qwen integration, alongside numerous stability improvements and fixes related to llama-server and model loading.

v0.30.1-rc04 fixes3 features
Jun 2, 2026

This release introduces new features like Cline CLI support and Qwen code integration, alongside several bug fixes related to model limits, server counts, and markdown handling. It also updates the underlying llama.cpp version.

v0.24.0-rc12 features
May 14, 2026

This release introduces memory trace logging for MLX and integrates Codex application support via the launch mechanism.

v0.24.0-rc02 features
May 14, 2026

This release introduces memory trace logging for MLX and integrates Codex application support into the launch mechanism.

v0.24.01 fix4 features
May 14, 2026

The OpenAI Codex App is now available on Ollama, integrating local and cloud models for coding workflows. This release also introduces a built-in browser and code review mode, alongside MLX sampler improvements for Apple Silicon.

v0.23.4-rc01 fix1 feature
May 13, 2026

This release introduces support for vision models with image inputs when launching opencode via ollama and fixes an issue with Claude tool result formatting for local image paths.

v0.23.41 fix1 feature
May 13, 2026

This release introduces support for vision models with image inputs when launching opencode via ollama and fixes an issue with Claude tool result formatting for local image paths.

v0.30.0-rc15Breaking3 features
May 13, 2026

This pre-release updates the architecture to directly support llama.cpp, enabling GGUF compatibility and leveraging MLX for Apple Silicon acceleration. Users are encouraged to test performance and stability.

v0.30.0Breaking5 features
May 13, 2026

Ollama 0.30 introduces significant performance and compatibility improvements by integrating llama.cpp, broadening hardware support, and adding new model capabilities.

v0.30.0-rc32Breaking1 fix3 features
May 13, 2026

This release transitions Ollama's architecture to directly support llama.cpp, enabling GGUF compatibility and MLX acceleration on Apple Silicon. A known issue is that `nomic-embed-text` now enforces lowercase input.

v0.30.0-rc31Breaking1 fix3 features
May 13, 2026

This release shifts the core architecture to directly support llama.cpp and the GGUF format, introduces MLX acceleration for Apple Silicon, and fixes case handling for the nomic-embed-text model.

v0.30.0-rc273 features
May 13, 2026

This pre-release updates the architecture to directly support llama.cpp and the GGUF format, while introducing MLX acceleration for Apple Silicon inference. Feedback is requested on performance and stability.

v0.30.0-rc23Breaking3 features
May 13, 2026

This pre-release version overhauls the architecture to directly support llama.cpp, enabling GGUF compatibility and leveraging MLX for Apple Silicon acceleration. Users are encouraged to provide feedback on performance and stability.

v0.30.0-rc22Breaking3 features
May 13, 2026

This pre-release updates the core architecture to directly support llama.cpp, enabling GGUF compatibility and leveraging MLX for Apple Silicon acceleration. Users are encouraged to provide feedback on performance and stability.

v0.30.0-rc21Breaking3 features
May 13, 2026

This pre-release version overhauls the architecture to use llama.cpp directly, enabling GGUF support and leveraging MLX for Apple Silicon acceleration. Users are encouraged to provide feedback on performance and stability.

v0.30.0-rc203 features
May 13, 2026

This pre-release updates the architecture to directly support llama.cpp and the GGUF format, while introducing MLX acceleration for Apple Silicon inference.

v0.30.0-rc173 features
May 13, 2026

This pre-release updates the architecture to directly support llama.cpp, enabling GGUF compatibility and utilizing MLX for Apple Silicon acceleration. Feedback is requested on performance and memory utilization.

v0.23.3-rc1
May 12, 2026

This release focuses on hardening update flows and refining model push behavior within the MLX integration.

v0.23.32 fixes
May 12, 2026

This release focuses on stability and improvements within the MLX backend, including refined model pushing and fixes for inference timeouts and metallib leakage.

v0.23.2-rc02 features
May 7, 2026

This release focuses on improvements to the Ollama server caching and the desktop launch experience, including plan-aware model gating and disabling Claude Desktop launch.

v0.23.2Breaking3 features
May 7, 2026

This release introduces significant performance improvements via API response caching and refines the launch integration management workflow. The default behavior of `ollama launch` has been updated regarding Claude Desktop integration.

v0.23.11 fix1 feature
May 5, 2026

This release introduces Gemma 4 MTP speculative decoding support for Macs, significantly boosting performance for the Gemma 4 31B model, alongside general threading fixes and a Go version bump.

v0.23.1-rc01 fix1 feature
May 5, 2026

This release introduces Gemma 4 MTP speculative decoding support for Macs, significantly boosting performance for the Gemma 4 31B model on coding tasks, alongside underlying threading fixes and a Go version bump.

v0.23.0-rc01 fix2 features
May 3, 2026

This release introduces new features like sourcing featured models from an experimental endpoint and adds launch support for the Claude application. It also includes a fix for OpenClaw gateway timeouts on Windows.

v0.23.02 fixes3 features
May 3, 2026

This release introduces support for Claude Desktop integration with Ollama and enhances stability by fixing Windows gateway timeouts and hardening Metal initialization.

v0.22.12 fixes2 features
Apr 28, 2026

This release introduces model batching support and TensorRT Model Optimizer import for the mlx backend. It also includes several bug fixes related to tokenization and desktop application startup behavior.

v0.22.1-rc02 fixes2 features
Apr 28, 2026

This release introduces model batching support and fixes several issues related to tokenization and desktop application startup behavior. It also includes support for NVIDIA TensorRT Model Optimizer import.

v0.22.1-rc12 fixes2 features
Apr 28, 2026

This release introduces model batching support and adds NVIDIA TensorRT Model Optimizer import capability. Several minor bugs related to tokenization and desktop app session handling were also resolved.

v0.22.0-rc11 fix1 feature
Apr 28, 2026

This release introduces support for NVIDIA TensorRT Model Optimizer import within mlx and fixes an issue related to multi-regex BPE offset handling in the tokenizer. It also includes performance improvements by batching the sampler across multiple sequences in mlxrunner.

v0.22.02 features
Apr 28, 2026

This release introduces two new models: NVIDIA's Nemotron 3 Omni and Poolside's Laguna XS.2.

v0.21.3-rc01 fix1 feature
Apr 24, 2026

This release introduces flexibility in the API by allowing "max" for the think parameter and improves OpenAI response mapping for reasoning effort.

v0.21.22 features
Apr 23, 2026

This release introduces structured outputs and ollama cloud support, alongside updating the web search mechanism to use bundled OpenClaw.

v0.21.2-rc02 features
Apr 23, 2026

This release introduces structured outputs and ollama cloud support, alongside updating the web search mechanism to use bundled OpenClaw.

v0.21.13 fixes1 feature
Apr 22, 2026

This release introduces kimi CLI integration and includes several performance and correctness fixes for MLX models and server formatting logic.

v0.21.1-rc13 fixes1 feature
Apr 22, 2026

This release introduces kimi CLI integration and includes several performance and correctness fixes across MLX models and server formatting logic.

v0.21.02 fixes2 features
Apr 16, 2026

This release introduces Copilot CLI integration and support for the hermes model within the launch command, alongside several fixes to configuration handling during launch.

v0.21.0-rc12 fixes2 features
Apr 16, 2026

This release introduces Copilot CLI integration and support for the hermes model within the launch command, alongside several fixes related to launch configuration handling.

v0.20.8-rc03 fixes3 features
Apr 14, 2026

This release introduces Gemma4 support on the MLX backend and updates the ROCm version to 7.2.1 on Linux. It also includes various fixes and improvements for MLX operations and Gemma4 rendering.

v0.20.71 fix
Apr 13, 2026

This release primarily updates the ROCm dependency to version 7.2.1 on Linux and fixes a quality regression in specific Gemma model configurations.

v0.20.61 fix2 features
Apr 12, 2026

This release focuses on improving Gemma 4 and parallel tool calling capabilities, alongside general application bug fixes and documentation updates for the Hermes Agent.

v0.20.6-rc14 fixes1 feature
Apr 10, 2026

This release introduces documentation for Hermes agent integration and includes several bug fixes related to model parsing, Gemma4 handling, and UI validation upon model switching.

v0.20.6-rc02 fixes
Apr 10, 2026

This release includes documentation updates, fixes for parallel tool call indexing, and UI adjustments for image attachment validation upon model change. The Gemma4 renderer was also updated.

v0.20.5-rc14 fixes2 features
Apr 9, 2026

This release focuses on improving the command line interface, refining launch configurations for specific models (glm-5.1, gemma4), and enhancing setup for openclaw and opencode integration. It also includes several minor bug fixes across the application.

v0.20.51 fix3 features
Apr 9, 2026

This release introduces OpenClaw channel setup for integrating messaging platforms like WhatsApp and Telegram via `ollama launch openclaw`, enables flash attention for Gemma 4, and fixes a bug in the /save command.

v0.20.5-rc02 fixes2 features
Apr 9, 2026

This release introduces setup for openclaw channels and improves command-line interaction, alongside refining safetensors handling and error reporting.

v0.20.5-rc21 fix3 features
Apr 9, 2026

This release introduces OpenClaw channel setup via `ollama launch openclaw` and enables flash attention and tool call repair for Gemma 4 models. A bug fix was also implemented for the `/save` command.

v0.20.4-rc21 fix2 features
Apr 7, 2026

This release focuses on performance improvements for MLX (M5 with NAX) and Gemma4 (flash attention), alongside minor fixes for model creation.

v0.20.4-rc12 fixes2 features
Apr 7, 2026

This release focuses on performance improvements for MLX (M5 with NAX) and Gemma4 (flash attention), alongside fixes for model creation paths and safetensor loading.

v0.20.42 features
Apr 7, 2026

This release focuses on performance improvements for M5 models via NAX integration and enables flash attention support for gemma4.

v0.20.31 fix2 features
Apr 7, 2026

This release includes improvements to Gemma 4 Tool Calling, adds the latest models to the Ollama App, and fixes issues with launching the OpenClaw TUI.

v0.20.21 feature
Apr 4, 2026

This release updates the default application home view to use the new chat interface by default. It includes minor changes related to the application's user interface.

v0.20.14 fixes1 feature
Apr 3, 2026

This patch release introduces new benchmarking capabilities and resolves several parsing and build issues related to gemma4 and ROCm builds.

v0.20.1-rc24 fixes2 features
Apr 3, 2026

This release introduces performance improvements via flash attention for gemma4 and fixes several parsing and build issues related to argument handling and ROCm compilation.

v0.20.01 fix2 features
Apr 2, 2026

This release introduces the new Gemma 4 model family variants (E2B, E4B, 26B, 31B) and enhances tokenizer capabilities with SentencePiece-style BPE support.

v0.20.0-rc11 fix2 features
Apr 2, 2026

This release introduces the new Gemma 4 model family variants (E2B, E4B, 26B, 31B) and enhances tokenizer capabilities with SentencePiece-style BPE support.

v0.19.0-rc03 fixes2 features
Mar 27, 2026

This release introduces changes to the launch command behavior, improves VS Code path detection, and includes several CI/build hardening updates.

v0.19.0-rc13 fixes2 features
Mar 27, 2026

This release introduces changes to the launch command behavior, updates the TUI title handling, and improves CI build processes for MLX and CUDA.

v0.19.04 fixes4 features
Mar 27, 2026

This release introduces improvements to KV cache handling, adds a web search plugin to `ollama launch pi`, and resolves several model loading and parsing bugs across different architectures.

v0.19.0-rc23 fixes1 feature
Mar 27, 2026

This release introduces a warning for small context lengths and improves launch logic for VS Code integration, alongside various CI and TUI updates.

v0.18.4-rc02 fixes
Mar 26, 2026

This release focuses on stability improvements, including fixing a memory leak in mlx and adjusting settings for the Grok model on ggml. It also updates VS Code documentation and hides the VS Code launch option.

v0.18.3-rc13 fixes3 features
Mar 25, 2026

This release introduces debug request logging and improves MLX performance with better cache sharing and new format imports. Several stability fixes were also implemented across the desktop app, MLX runner, and CI.

v0.18.35 fixes4 features
Mar 25, 2026

This release introduces debug request logging, improves KV cache sharing in mlxrunner, and fixes several stability issues including desktop app loading hangs and mlxrunner deadlocks.

v0.18.22 fixes2 features
Mar 18, 2026

This release introduces checks for npm and git installation prerequisites for OpenClaw and significantly speeds up local Claude Code execution. Several minor bugs related to model launching and package registration were also fixed.

v0.18.2-rc01 fix3 features
Mar 18, 2026

This release introduces significant performance and feature enhancements for the MLX backend, including model eviction, quantized embeddings, and fast SwiGLU. It also includes a fix for the web_search legacy path in the cloud proxy.

v0.18.2-rc11 fix3 features
Mar 18, 2026

This release introduces significant performance and feature enhancements for MLX backend, including model eviction, quantized embeddings, and fast SwiGLU. It also includes a fix for the web_search legacy path in the cloud proxy.

v0.18.12 fixes1 feature
Mar 17, 2026

This release focuses on improving the benchmarking tool and adding stability fixes for launch commands, particularly concerning headless mode and systemd availability.

v0.18.1-rc12 fixes1 feature
Mar 17, 2026

This release focuses on improving the benchmarking tool and adding stability fixes for launch commands, particularly concerning headless mode and systemd availability.

v0.18.03 fixes2 features
Mar 14, 2026

This release introduces documentation for `reasoning_effort` support in the OpenAI-compatible API and includes several fixes related to cloud model handling and launch command integration.

v0.18.0-rc23 fixes1 feature
Mar 14, 2026

This release introduces documentation for reasoning_effort support in the OpenAI-compatible API and fixes several issues related to cloud model handling and launch command integration.

v0.17.8-rc48 fixes2 features
Mar 10, 2026

This release focuses on stability and performance improvements, including fixes for GLM tool calls, localhost handling, and updates to MLX and ROCm support. It also addresses an issue where resetting defaults disabled auto-updates.

v0.17.8-rc15 fixes1 feature
Mar 10, 2026

This release focuses on stability and fixes, including repairs to GLM tool parsing, localhost handling, and cloud proxy stream disconnects, alongside build improvements for Windows and MLX.

v0.17.8-rc27 fixes2 features
Mar 10, 2026

This release focuses on stability and performance improvements, including fixes for GLM tool parsing, localhost handling, and updates to MLX and ROCm support. It also refactors the MLX runner sampler interface.

v0.17.8-rc37 fixes2 features
Mar 10, 2026

This release focuses on stability and performance improvements across parsers, cloud proxy handling, and MLX backend optimizations. It also includes fixes for Docker builds and application defaults.

v0.17.7-rc22 features
Mar 5, 2026

This release introduces improvements to thinking level mapping and adds context length support for compaction via `ollama launch`.

v0.17.7-rc02 features
Mar 5, 2026

This release loosens the server's thinking level constraint and adds support for Qwen 3.5 context length during launch.

v0.17.72 features
Mar 5, 2026

This release introduces improvements to thinking level mapping and adds context length support for compaction via `ollama launch`.

v0.17.62 fixes
Mar 4, 2026

This release focuses on bug fixes, specifically addressing prompt rendering issues for GLM-OCR and improving tool calling for Qwen 3.5 models.

v0.17.54 fixes1 feature
Mar 2, 2026

This release focuses on stability and performance improvements for Qwen 3.5 models, particularly when running across multiple devices or using the MLX engine, and introduces peak memory reporting.

v0.17.41 feature
Feb 27, 2026

This release introduces the inclusion of tool call indices within parallel tool calls for enhanced tracking and functionality.

v0.17.31 fix
Feb 27, 2026

This patch release fixes a bug related to the correct parsing of tool calls for Qwen 3 and Qwen 3.5 models when they are emitted during the thinking process.

v0.17.21 fix
Feb 26, 2026

This release addresses a critical bug where the Windows application would crash on startup if an update was pending.

v0.17.1-rc02 fixes3 features
Feb 24, 2026

This release introduces support for the nemotron architecture and includes several performance and logging improvements, particularly for MLX-based operations.

v0.17.13 fixes3 features
Feb 24, 2026

This release introduces support for the Nemotron architecture and includes several performance and stability improvements, particularly around MLX memory usage and logging. It also updates the mlx-c bindings.

v0.17.1-rc22 fixes3 features
Feb 24, 2026

This release introduces support for the nemotron architecture and includes several performance and logging improvements, particularly for MLX-based operations. It also updates underlying MLX-C bindings.

v0.17.1-rc12 fixes3 features
Feb 24, 2026

This release introduces support for the nemotron architecture and includes several performance and logging improvements, particularly for MLX-based operations. It also updates underlying MLX-C bindings.

v0.17.0-rc12 features
Feb 21, 2026

This release introduces UI exposure of the server context length and implements OpenClaw onboarding, alongside internal consolidation of the tokenizer.

v0.17.02 features
Feb 21, 2026

This release introduces automatic installation and configuration of OpenClaw via Ollama, enabling easier use with open models, and enables websearch functionality when using cloud models.

v0.16.33 fixes5 features
Feb 19, 2026

This release introduces support for several new model architectures (Gemma 3, Llama 3, Qwen 3) in mlxrunner and adds the new `ollama launch` CLI command. Several minor bug fixes related to mlx model display and scheduling were also implemented.

v0.16.22 fixes3 features
Feb 14, 2026

This release introduces the ability to disable cloud models via a new setting or environment variable, and fixes rendering issues in PowerShell along with bugs affecting experimental image models.

v0.16.2-rc02 fixes2 features
Feb 14, 2026

This release introduces web search capabilities for Claude cloud models and adds an environment variable to easily disable cloud models for privacy. It also fixes rendering issues in PowerShell and restores functionality for experimental image generation models.

v0.16.12 fixes1 feature
Feb 12, 2026

This release improves the installation experience on macOS and Windows and adds support for respecting the OLLAMA_LOAD_TIMEOUT variable for image generation models.

v0.16.01 fix3 features
Feb 12, 2026

Ollama 0.16.0-rc2 introduces the powerful GLM-5 model and a new `ollama` command for simplified application launching. It also includes MLX runner improvements and a new keybinding for prompt editing.

v0.16.0-rc21 fix5 features
Feb 12, 2026

This release introduces significant improvements to the command-line interface (CLI) and Text User Interface (TUI) experience, adds MLX runner support with safetensors quantization, and includes new login/logout aliases.

v0.16.0-rc11 fix11 features
Feb 12, 2026

This release introduces significant UX improvements across the CLI and TUI, adds new features like external prompt editing and hidden login/logout aliases, and enhances model support with MLX integration and safetensors quantization.

Common Errors

Related AI & LLMs Packages

Subscribe to Updates

Get notified when new versions are released

RSS Feed