v0.30.11-rc0

📅 Jun 25, 2026 📦 ollama

✨ 5 features 🐛 11 fixes 🔧 6 symbols

Summary

This release adds auto-installation of coding tools such as Claude Code and opencode through launch, along with stability and performance improvements across Vulkan, MLX, and model loading. Key fixes include correcting the inverted iGPU/dGPU Vulkan classification on Windows hybrid graphics and unifying speculative decoding in mlxrunner.

Migration Steps

Update the llama.cpp dependency to the version included in this release.

✨ New Features

Added thinking capability detection to opencode via launch.
Added auto-installation of Claude Code via launch.
Added auto-installation of opencode via launch when it is missing.
Added default window attention metadata for Qwen2.5VL.
Redesigned documentation landing and integrations overview.
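
To illustrate the thinking capability detection item above, here is a minimal sketch of how a client might check whether a model advertises thinking support. It assumes a local Ollama server on the default port and assumes that the `/api/show` response includes a `capabilities` list in which `"thinking"` can appear. The helper names `fetch_model_info` and `supports_thinking` are hypothetical, and this is not necessarily how launch or opencode performs the check.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434"


def fetch_model_info(model: str, base_url: str = OLLAMA_URL) -> dict:
    """POST the model name to /api/show and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{base_url}/api/show",
        data=json.dumps({"model": model}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def supports_thinking(show_response: dict) -> bool:
    """Return True if the response lists a "thinking" capability.

    Assumption: capabilities arrive as a list of strings; a missing or
    null field is treated as "no capabilities reported".
    """
    capabilities = show_response.get("capabilities") or []
    return "thinking" in capabilities
```

With a running server, `supports_thinking(fetch_model_info("some-model"))` would perform the check end to end; the model name here is a placeholder.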

🐛 Bug Fixes

Fixed inverted iGPU/dGPU Vulkan classification on Windows hybrid graphics.
Unified and tuned speculative decoding in mlxrunner.
Detected model drift when Codex App UI switches.
Added sm_86 architecture to cuda_v13_windows preset.
Sized mmproj offload by projector memory.
Preserved generation headroom for shifted prompts.
Used host Vulkan loader on Windows.
Fixed CUDA JIT packaging in mlx.
Fixed ollama ps double-counting mmap'd weights on partial offload.
Aligned server generate endpoint with native chat templates.
Added CC 87 support for CUDA v13 on Jetson.
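
To make the iGPU/dGPU classification fix above concrete, the sketch below maps Vulkan's `VkPhysicalDeviceType` values to integrated and discrete labels and prefers the discrete GPU on a hybrid system. The enum values come from the Vulkan specification. The function names and the device names in the usage example are hypothetical, and this is an illustration of the idea rather than Ollama's actual detection code.

```python
from enum import IntEnum
from typing import Optional


class VkPhysicalDeviceType(IntEnum):
    """Numeric values as defined by the Vulkan specification."""
    OTHER = 0
    INTEGRATED_GPU = 1
    DISCRETE_GPU = 2
    VIRTUAL_GPU = 3
    CPU = 4


def classify(device_type: int) -> str:
    """Label a device as "iGPU", "dGPU", or "other" from its type value."""
    try:
        kind = VkPhysicalDeviceType(device_type)
    except ValueError:
        return "other"  # value not defined in the enum above
    if kind is VkPhysicalDeviceType.INTEGRATED_GPU:
        return "iGPU"
    if kind is VkPhysicalDeviceType.DISCRETE_GPU:
        return "dGPU"
    return "other"


def pick_preferred(devices: list[tuple[str, int]]) -> Optional[str]:
    """Return the first discrete GPU's name, else the first integrated one."""
    for wanted in ("dGPU", "iGPU"):
        for name, device_type in devices:
            if classify(device_type) == wanted:
                return name
    return None


# Hypothetical hybrid laptop: one integrated and one discrete adapter.
hybrid = [("integrated-adapter", 1), ("discrete-adapter", 2)]
preferred = pick_preferred(hybrid)
```

An inverted classification would swap the two labels, so a hybrid system would report its integrated adapter as the discrete one.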

Affected Symbols

opencode
Claude Code
Codex App UI
Qwen2.5VL
ollama ps
generate endpoint