
Ollama


Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.

Latest: v0.17.4 · 100 releases · 7 breaking changes · 1 common error

Release History

v0.17.4 (1 feature)
Feb 27, 2026

This release adds tool call indices to parallel tool calls for easier tracking.
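As a minimal sketch of why indices matter, a client can use them to keep parallel calls in order. The response shape below is an assumption for illustration, not the verbatim Ollama API schema:

```python
# Sketch: reading tool call indices from a chat response.
# The response layout and field names here are assumptions for
# illustration, not the exact Ollama API schema.
response = {
    "message": {
        "role": "assistant",
        "tool_calls": [
            {"index": 0, "function": {"name": "get_weather", "arguments": {"city": "Oslo"}}},
            {"index": 1, "function": {"name": "get_time", "arguments": {"tz": "UTC"}}},
        ],
    }
}

# Indices let callers match results (or streamed fragments) to the right call.
calls = sorted(response["message"]["tool_calls"], key=lambda c: c["index"])
names = [c["function"]["name"] for c in calls]
print(names)  # ['get_weather', 'get_time']
```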

v0.17.3 (1 fix)
Feb 27, 2026

This patch release fixes parsing of tool calls emitted during the thinking process by Qwen 3 and Qwen 3.5 models.

v0.17.2 (1 fix)
Feb 26, 2026

This release addresses a critical bug where the Windows application would crash on startup if an update was pending.

v0.17.1 (3 fixes, 3 features)
Feb 24, 2026

This release introduces support for the Nemotron architecture and includes several performance and stability improvements, particularly around MLX memory usage and logging. It also updates the mlx-c bindings.

v0.17.1-rc0 (2 fixes, 3 features)
Feb 24, 2026

This release introduces support for the nemotron architecture and includes several performance and logging improvements, particularly for MLX-based operations.

v0.17.1-rc1 (2 fixes, 3 features)
Feb 24, 2026

This release introduces support for the nemotron architecture and includes several performance and logging improvements, particularly for MLX-based operations. It also updates underlying MLX-C bindings.

v0.17.1-rc2 (2 fixes, 3 features)
Feb 24, 2026

This release introduces support for the nemotron architecture and includes several performance and logging improvements, particularly for MLX-based operations. It also updates underlying MLX-C bindings.

v0.17.0 (2 features)
Feb 21, 2026

This release introduces automatic installation and configuration of OpenClaw via Ollama, enabling easier use with open models, and enables web search when using cloud models.

v0.17.0-rc1 (2 features)
Feb 21, 2026

This release introduces UI exposure of the server context length and implements OpenClaw onboarding, alongside internal consolidation of the tokenizer.

v0.16.3 (3 fixes, 5 features)
Feb 19, 2026

This release introduces support for several new model architectures (Gemma 3, Llama 3, Qwen 3) in mlxrunner and adds the new `ollama launch` CLI command. Several minor bug fixes related to mlx model display and scheduling were also implemented.

v0.16.2 (2 fixes, 3 features)
Feb 14, 2026

This release introduces the ability to disable cloud models via a new setting or environment variable, and fixes rendering issues in PowerShell along with bugs affecting experimental image models.

v0.16.2-rc0 (2 fixes, 2 features)
Feb 14, 2026

This release introduces web search capabilities for Claude cloud models and adds an environment variable to easily disable cloud models for privacy. It also fixes rendering issues in PowerShell and restores functionality for experimental image generation models.

v0.16.1 (2 fixes, 1 feature)
Feb 12, 2026

This release improves the installation experience on macOS and Windows and adds support for respecting the OLLAMA_LOAD_TIMEOUT variable for image generation models.
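`OLLAMA_LOAD_TIMEOUT` accepts a Go-style duration string; treating the "10m" format as an assumption from Ollama's other duration settings, it might be set before launching the server like this:

```python
# Sketch: raising the model-load timeout for slow-loading image models.
# The "10m" Go-style duration format is an assumption for illustration.
import os

os.environ["OLLAMA_LOAD_TIMEOUT"] = "10m"  # allow up to 10 minutes for loading

# A child process started after this point (e.g. `ollama serve`)
# would inherit the variable from this environment.
print(os.environ["OLLAMA_LOAD_TIMEOUT"])
```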

v0.16.0-rc1 (1 fix, 11 features)
Feb 12, 2026

This release introduces significant UX improvements across the CLI and TUI, adds new features like external prompt editing and hidden login/logout aliases, and enhances model support with MLX integration and safetensors quantization.

v0.16.0-rc2 (1 fix, 5 features)
Feb 12, 2026

This release introduces significant improvements to the command-line interface (CLI) and Text User Interface (TUI) experience, adds MLX runner support with safetensors quantization, and includes new login/logout aliases.

v0.16.0 (1 fix, 3 features)
Feb 12, 2026

Ollama 0.16.0 introduces the powerful GLM-5 model and a new `ollama` command for simplified application launching. It also includes MLX runner improvements and a new keybinding for prompt editing.

v0.15.6 (2 fixes, 1 feature)
Feb 7, 2026

This release improves the launch experience by automatically downloading missing models and fixes context handling bugs for droid and claude commands.

v0.15.5-rc5 (3 features)
Feb 3, 2026

This release introduces two new models, GLM-OCR and Qwen3-Coder-Next, and enhances core functionality with sub-agent support and VRAM-aware context length defaulting.

v0.15.5 (2 fixes, 8 features)
Feb 3, 2026

This release introduces two new powerful models, GLM-OCR and Qwen3-Coder-Next, and significantly enhances `ollama launch` with argument passing and sub-agent support. It also implements VRAM-based dynamic context length setting.
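The VRAM-based context length policy is not spelled out in these notes; a purely hypothetical heuristic of the same shape (thresholds invented for illustration) might look like:

```python
# Hypothetical heuristic for VRAM-aware default context length.
# The thresholds and values below are invented for illustration;
# Ollama's actual policy is not specified in these release notes.
def default_context_length(vram_gb: float) -> int:
    if vram_gb >= 24:
        return 32768   # plenty of headroom: long context
    if vram_gb >= 12:
        return 16384   # mid-range GPU: moderate context
    return 8192        # small GPU or CPU-only: conservative default

print(default_context_length(16))  # 16384
```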

v0.15.5-rc0 (3 features)
Feb 3, 2026

This release introduces the GLM-OCR model and updates default context sizes based on VRAM availability. It also adds support for GLM-4.7-Flash on the MLX engine.

v0.15.5-rc1 (2 features)
Feb 3, 2026

This release introduces the GLM-OCR model and adds support for GLM-4.7-Flash on the MLX engine, while also updating default context sizes based on VRAM.

v0.15.5-rc2 (5 features)
Feb 3, 2026

This release introduces two new models, GLM-OCR and Qwen3-Coder-Next, and enhances core functionality with sub-agent support and VRAM-aware context length defaulting.

v0.15.5-rc3 (5 features)
Feb 3, 2026

This release introduces two new models, GLM-OCR and Qwen3-Coder-Next, and enhances functionality with sub-agent support for `ollama launch` and VRAM-aware default context length settings.

v0.15.5-rc4 (5 features)
Feb 3, 2026

This release introduces two new models, GLM-OCR and Qwen3-Coder-Next, and enhances functionality with sub-agent support for launch commands and VRAM-aware context length defaulting.

v0.15.4 (1 feature)
Feb 1, 2026

This release updates the behavior of the `ollama launch openclaw` command to ensure the OpenClaw onboarding flow is executed if necessary.

v0.15.3 (1 feature)
Feb 1, 2026

This release renames the 'clawdbot' launch command to 'openclaw' and updates how `ollama launch` utilizes the OLLAMA_HOST environment variable. Tool calling for Ministral models has also been improved.

v0.15.2 (1 feature)
Jan 27, 2026

This release introduces a new command for easily launching Clawdbot integrated with Ollama models.

v0.15.1 (1 fix, 1 feature)
Jan 24, 2026

This release includes documentation updates, notably regarding 'ollama launch', and a performance fix by adding -O3 optimization to CGO flags.

v0.15.1-rc0 (1 fix, 1 feature)
Jan 24, 2026

This release includes documentation updates, notably regarding 'ollama launch', and a performance fix by adding -O3 optimization to CGO flags.

v0.15.0 (1 fix, 2 features)
Jan 21, 2026

This release introduces the `ollama config` command and adds image editing capabilities to x/imagegen. It also includes improvements to the CLI handling of model loading and output rendering.

v0.15.0-rc3 (1 fix, 2 features)
Jan 21, 2026

This release introduces the `ollama config` command and adds image editing capabilities to x/imagegen. It also includes improvements to CLI handling during model loading.

v0.15.0-rc1
Jan 21, 2026

This release focuses on internal cleanup of the manifest and model paths, and temporarily removes the qwen_image and qwen_image_edit models from x/imagegen.

v0.15.0-rc0
Jan 21, 2026

This release focuses on internal cleanup, specifically refining the manifest and modelpath handling, and removing the qwen_image and qwen_image_edit models from x/imagegen.

v0.14.3-rc1 (1 feature)
Jan 16, 2026

This release introduces an enhancement for the macOS application, allowing it to terminate gracefully during system shutdown.

v0.14.3 (5 fixes, 5 features)
Jan 16, 2026

This release introduces several new powerful image generation and LLM models, including Z-Image Turbo and GLM-4.7-Flash. It also includes several bug fixes related to macOS shutdown, model management, and API usage.

v0.14.3-rc3 (5 fixes, 2 features)
Jan 16, 2026

This release introduces the GLM-4.7-Flash model and enables image generation via the /api/generate endpoint, alongside several stability and command fixes.

v0.14.3-rc0 (1 feature)
Jan 16, 2026

This release improves application termination behavior on macOS, allowing the app to handle system shutdown more gracefully.

v0.14.3-rc2 (5 fixes, 2 features)
Jan 16, 2026

This release introduces the GLM-4.7-Flash model and enables image generation via the /api/generate endpoint, alongside several stability and command fixes.

v0.14.2-rc1 (1 feature)
Jan 16, 2026

This release focuses on documentation updates, including integrations for Onyx and Marimo, and introduces multi-line input support in the CLI.

v0.14.2 (1 feature)
Jan 16, 2026

This release focuses on documentation updates, including new integrations for Onyx and Marimo, and introduces multi-line input support in the CLI.

v0.14.1 (1 fix)
Jan 14, 2026

This patch release addresses a critical bug affecting macOS auto-updates by fixing signature verification failures. It also welcomes two new contributors.

v0.14.0 (4 fixes, 5 features)
Jan 10, 2026

This release introduces experimental support for image generation models and enhances API compatibility with Anthropic's message format. It also includes stability improvements for VRAM estimation and introduces the `REQUIRES` command for Modelfiles.
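A `REQUIRES` directive in a Modelfile might look like the fragment below; the exact syntax is an assumption based on the summary above, not quoted from Ollama's documentation:

```
# Hypothetical Modelfile sketch; the REQUIRES syntax shown is assumed,
# not verbatim from the Ollama Modelfile reference.
FROM llama3
REQUIRES 0.14.0
```

The idea is that a server older than the declared version would refuse to load the model rather than fail in a less obvious way.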

v0.14.0-rc2
Jan 10, 2026

v0.14.0-rc3 (3 fixes, 5 features)
Jan 10, 2026

This release introduces experimental support for image generation models via MLX and enhances Anthropic API compatibility. It also adds model version requirements via the Modelfile and improves VRAM measurement accuracy.

v0.14.0-rc4 (4 fixes, 5 features)
Jan 10, 2026

This release introduces experimental support for image generation models, adds Anthropic API compatibility, and includes several stability improvements related to VRAM estimation and error handling.

v0.14.0-rc7 (4 fixes, 5 features)
Jan 10, 2026

This release introduces experimental support for image generation models via MLX, adds Anthropic API compatibility, and includes several stability improvements related to VRAM handling and error reporting.

v0.14.0-rc8 (4 fixes, 5 features)
Jan 10, 2026

This release introduces experimental support for image generation models, adds Anthropic API compatibility, and improves VRAM estimation accuracy. It also includes a new `REQUIRES` command for Modelfiles to specify Ollama version requirements.

v0.14.0-rc9 (4 fixes, 5 features)
Jan 10, 2026

This release introduces experimental support for image generation models, adds Anthropic API compatibility, and includes several stability improvements related to VRAM estimation and error handling.

v0.14.0-rc10 (3 fixes, 5 features)
Jan 10, 2026

This release introduces experimental support for image generation models via MLX and enhances Anthropic API compatibility. It also adds the `REQUIRES` command to Modelfiles for version declaration and improves VRAM estimation accuracy.

v0.14.0-rc11 (4 fixes, 5 features)
Jan 10, 2026

This release introduces experimental support for image generation models, adds Anthropic API compatibility, and includes several stability improvements related to VRAM estimation and error handling.

v0.13.5 (1 fix, 3 features)
Dec 18, 2025

This release introduces support for Google's FunctionGemma model and migrates BERT architecture models to the Ollama engine. It also improves tool parsing for DeepSeek-V3.1 and fixes bugs related to nested tool properties.

v0.13.4 (2 fixes, 3 features)
Dec 13, 2025

This release introduces support for Nemotron 3 Nano and Olmo 3 models, enables Flash Attention by default, and provides critical fixes for Gemma 3 model architectures.

v0.13.3 (1 fix, 5 features)
Dec 9, 2025

This release introduces support for Devstral-Small-2, rnj-1, and nomic-embed-text-v2 models, while improving embedding truncation logic and fixing image input issues for qwen2.5vl.

v0.13.2 (2 fixes, 2 features)
Dec 4, 2025

This release introduces support for the Qwen3-Next model series and enables Flash Attention by default for vision models. It also includes critical fixes for multi-GPU CUDA detection and DeepSeek-v3.1 thinking behavior.

v0.13.1 (4 fixes, 6 features)
Nov 27, 2025

This release introduces support for Ministral-3 and Mistral-Large-3 models, adds tool calling for cogito-v2.1, and includes several fixes for CUDA detection and error reporting.

v0.13.0 (2 fixes, 7 features)
Nov 19, 2025

This release introduces support for DeepSeek-OCR, Cogito-V2.1, and DeepSeek-V3.1 architecture, alongside a new performance benchmarking tool and significant engine optimizations for KV caching and GPU detection.

v0.12.11 (3 fixes, 6 features)
Nov 12, 2025

Ollama 0.12.11 introduces logprobs support for API responses and adds opt-in Vulkan acceleration for expanded GPU compatibility.
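A hedged sketch of what a client might do with logprobs in a response; the field layout below is an assumption for illustration, not the documented schema:

```python
# Sketch: reading per-token log probabilities from a response.
# The "logprobs" field layout is an assumption for illustration.
import math

response = {
    "response": "Hi",
    "logprobs": [
        {"token": "Hi", "logprob": -0.105},
    ],
}

# Convert log probabilities back to probabilities for inspection.
probs = {e["token"]: math.exp(e["logprob"]) for e in response["logprobs"]}
print(round(probs["Hi"], 2))  # ~0.90
```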

v0.12.10 (3 fixes, 5 features)
Nov 5, 2025

This release enables embedding model support via the CLI, adds tool call IDs to the chat API, and improves Vulkan performance and hardware detection.

v0.12.9 (1 fix)
Oct 31, 2025

This release addresses a performance regression specifically impacting users running Ollama on CPU-only hardware.

v0.12.8 (4 fixes, 3 features)
Oct 30, 2025

This release focuses on performance optimizations for qwen3-vl, including default Flash Attention support, and fixes several issues related to model thinking modes and image processing.

v0.12.7 (7 fixes, 8 features)
Oct 29, 2025

This release introduces support for Qwen3-VL and MiniMax-M2 models, adds file attachments and thinking level adjustments to the app, and provides updated API documentation alongside several embedding and backend bug fixes.

v0.12.6 (5 fixes, 3 features)
Oct 15, 2025

This release introduces search support for tool-calling models, enables Flash Attention for Gemma 3, and adds experimental Vulkan support for broader GPU compatibility alongside several model-specific bug fixes.

v0.12.5 (Breaking, 2 fixes, 2 features)
Oct 10, 2025

This release introduces structured output support for thinking models and improves app startup behavior, while removing support for older macOS versions and specific AMD GPU architectures.

v0.12.4 (Breaking, 5 fixes, 3 features)
Oct 3, 2025

This release enables Flash Attention by default for Qwen 3 models and improves VRAM detection, while dropping support for older macOS versions and specific AMD GPU architectures.

v0.12.3 (3 fixes, 3 features)
Sep 26, 2025

This release adds support for DeepSeek-V3.1-Terminus and Kimi-K2-Instruct-0905 models while fixing critical bugs related to tool call parsing, Unicode rendering, and model loading crashes.

v0.12.2 (1 fix, 4 features)
Sep 24, 2025

This release introduces a new Web Search API for real-time information retrieval and expands the new engine's capabilities to support Qwen3 architectures and multi-regex pretokenizers.

v0.12.1 (4 fixes, 2 features)
Sep 21, 2025

This release adds support for Qwen3 Embedding and tool calling for Qwen3-Coder, alongside several bug fixes for Gemma3 models, Linux sign-in, and function calling parsing.

v0.12.0 (3 fixes, 3 features)
Sep 18, 2025

This release introduces cloud models in preview, expanding hardware support for larger models, and adds native support for Bert and Qwen 3 architectures within Ollama's engine.

v0.11.11 (Breaking, 5 fixes, 6 features)
Sep 11, 2025

This release adds CUDA 13 support, introduces a dimensions field for embeddings, and improves memory estimation and app UI. It also removes support for loading split vision models in the Ollama engine.
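A hedged sketch of the dimensions field in use; the request shape and the assumption that the server truncates a Matryoshka-style embedding are both illustrative, not taken from Ollama's documentation:

```python
# Sketch: requesting a reduced-dimension embedding.
# The payload shape and the truncation behavior shown locally are
# assumptions for illustration, not the documented Ollama API.
payload = {
    "model": "embeddinggemma",
    "input": "hello world",
    "dimensions": 256,  # ask for a 256-dim vector instead of the full size
}

# For a Matryoshka-style embedding, reducing dimensions amounts to
# keeping the leading components of the full vector:
full_embedding = [0.1] * 768
truncated = full_embedding[: payload["dimensions"]]
print(len(truncated))  # 256
```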

v0.11.10 (1 feature)
Sep 4, 2025

This release introduces support for the EmbeddingGemma model, providing a high-performance open embedding model for Ollama users.

v0.11.9 (2 fixes, 1 feature)
Sep 2, 2025

This release focuses on performance optimizations through CPU/GPU overlapping and stability fixes for AMD GPUs and Unix-based installations.

v0.11.8 (2 features)
Aug 27, 2025

This release enables flash attention by default for gpt-oss models and improves their overall loading performance.

v0.11.7 (4 fixes, 4 features)
Aug 25, 2025

This release introduces the DeepSeek-V3.1 model and the preview of Turbo mode for running large models. Several bugs related to model loading, thinking tags, and tool call parsing have also been resolved.

v0.11.6 (2 fixes, 3 features)
Aug 20, 2025

This release focuses on UI improvements for the Ollama app, including faster chat switching and better layouts, alongside performance optimizations for flash attention and BPE encoding.

v0.11.5 (2 fixes, 6 features)
Aug 15, 2025

This release introduces significant memory management improvements for GPU scheduling and multi-GPU setups, alongside performance optimizations for gpt-oss models and reduced installation sizes.

v0.11.4 (1 fix, 2 features)
Aug 7, 2025

This release improves OpenAI API compatibility by supporting simultaneous content and tool calls, ensuring tool name propagation, and consistently providing reasoning in responses.

v0.11.3 (1 fix, 1 feature)
Aug 6, 2025

This release fixes a VRAM leak in gpt-oss during multi-device execution and improves Windows stability by statically linking C++ libraries.

v0.11.2 (2 fixes)
Aug 5, 2025

This patch release focuses on stability improvements for gpt-oss, specifically fixing crashes related to KV cache quantization and a missing variable definition.

v0.11.0 (1 fix, 8 features)
Aug 5, 2025

Ollama v0.11 introduces support for OpenAI's gpt-oss models (20B and 120B) featuring native MXFP4 quantization, agentic capabilities, and configurable reasoning effort.

v0.10.1 (2 fixes)
Jul 31, 2025

This patch release focuses on bug fixes for international character input and log output accuracy.

v0.10.0 (Breaking, 3 fixes, 5 features)
Jul 18, 2025

Ollama v0.10.0 introduces a new desktop app, significant performance optimizations for gemma3n and multi-GPU setups, and critical fixes for tool calling and API image support.

v0.9.6 (1 fix, 1 feature)
Jul 8, 2025

This release introduces the ability to specify tool names in chat messages and includes a UI fix for the launch screen.

v0.9.5 (Breaking, 2 fixes, 4 features)
Jul 2, 2025

Ollama 0.9.5 introduces a native macOS app with faster startup, network exposure capabilities, and customizable model storage directories. It also raises the minimum macOS requirement to version 12.

v0.9.4 (Breaking, 3 fixes, 3 features)
Jun 27, 2025

This release introduces network exposure and custom model directories, while significantly optimizing the macOS application as a native app with a smaller footprint. It also includes fixes for tool calling and Gemma 3n model quantization.

v0.9.3 (1 fix, 2 features)
Jun 25, 2025

This release adds support for the multilingual Gemma 3n model family and introduces automatic context length limiting to improve model stability.

v0.9.2 (3 fixes)
Jun 18, 2025

This patch release focuses on bug fixes for tool calling, generation errors, and tokenization issues across specific model architectures.

v0.9.1 (3 fixes, 7 features)
Jun 9, 2025

This release introduces tool calling for DeepSeek-R1 and Magistral, alongside a major preview of native macOS and Windows applications featuring network exposure and custom model directories.

v0.9.0 (5 features)
May 29, 2025

Ollama v0.9.0 introduces 'thinking' support, allowing models like DeepSeek R1 and Qwen 3 to separate reasoning from output via a new API field and CLI toggles.
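A minimal sketch of consuming a response with separated reasoning; the field names ("thinking", "content") are assumptions for illustration, not quoted from the API documentation:

```python
# Sketch: separating a model's reasoning from its final answer.
# The "thinking"/"content" field names are assumptions for illustration.
chat_response = {
    "message": {
        "role": "assistant",
        "thinking": "The user asked for 2 + 2; compute the sum.",
        "content": "4",
    }
}

msg = chat_response["message"]
reasoning = msg.get("thinking", "")  # the model's reasoning trace
answer = msg["content"]              # the user-facing output only
print(answer)  # 4
```

Separating the two fields lets a UI hide or collapse the reasoning while still showing the final answer.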

v0.8.0 (2 features)
May 27, 2025

Ollama v0.8.0 introduces streaming support for tool calls and improves engine logging with more detailed memory estimation data.
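With streaming, a tool call arrives as partial chunks the client must reassemble. The chunk format below is an assumption for illustration, not the exact wire format:

```python
# Sketch: accumulating a streamed tool call from partial chunks.
# The chunk structure is an assumption for illustration.
import json

chunks = [
    {"function": {"name": "search", "arguments": '{"query": '}},
    {"function": {"name": "", "arguments": '"ollama"}'}},
]

name = ""
arg_text = ""
for chunk in chunks:
    name = name or chunk["function"]["name"]    # name arrives in the first chunk
    arg_text += chunk["function"]["arguments"]  # argument JSON streams in pieces

args = json.loads(arg_text)  # parse only once the stream is complete
print(name, args)
```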

v0.7.1 (2 fixes, 4 features)
May 21, 2025

This release introduces support for Qwen 3 and Qwen 2 architectures while providing critical stability fixes for multimodal models and memory management.

v0.7.0 (Breaking, 6 fixes, 5 features)
May 13, 2025

Ollama v0.7.0 introduces a new multimodal engine supporting vision models like Llama 4 and Gemma 3, along with WebP support and various performance improvements and bug fixes.

v0.6.8 (3 fixes, 3 features)
May 3, 2025

This release focuses on performance optimizations for Qwen 3 MoE models and critical stability fixes, including memory leak resolutions and improved OOM handling.

v0.6.7 (4 fixes, 4 features)
Apr 26, 2025

Ollama v0.6.7 introduces support for Llama 4, Qwen 3, and Phi 4 reasoning models while increasing the default context window and fixing several inference and path-handling bugs.

v0.6.6 (4 fixes, 7 features)
Apr 17, 2025

This release adds support for IBM Granite 3.3 and DeepCoder models, introduces an experimental high-performance downloader, and fixes critical memory leaks for Gemma 3 and Mistral Small 3.1.

v0.6.5 (2 features)
Apr 6, 2025

This release introduces support for the Mistral Small 3.1 vision model and optimizes loading performance for Gemma 3 on network-backed filesystems.

v0.6.4 (4 fixes, 2 features)
Apr 2, 2025

Ollama v0.6.4 focuses on stability improvements for Gemma 3 and DeepSeek models, adds vision capability metadata to the API, and introduces AMD RDNA4 support for Linux users.

v0.6.3 (2 fixes, 5 features)
Mar 22, 2025

This release introduces performance optimizations and improved loading for Gemma 3, alongside critical bug fixes for model execution errors and enhancements to the CLI tools.

v0.6.2 (2 fixes, 3 features)
Mar 18, 2025

This release introduces multi-image support and memory optimizations for Gemma 3, adds support for AMD Strix Halo GPUs, and fixes issues with model quantization and saving.

v0.6.1 (2 fixes, 4 features)
Mar 14, 2025

This release introduces support for the Command A 111B model, improves memory management for gemma3, and adds new CLI features including verbose model information and navigation hotkeys.

v0.6.0 (1 fix, 1 feature)
Mar 11, 2025

This release introduces support for Google's Gemma 3 model family across various sizes and resolves execution errors for Snowflake Arctic embedding models.
