v0.30.11-rc1

📅 Jun 25, 2026 · 📦 ollama

✨ 6 features · 🐛 8 fixes · 🔧 11 symbols

Summary

This release adds auto-installation of coding tools such as Claude Code and opencode through the launch command, alongside stability and performance fixes for GPU handling (Vulkan device classification, CUDA presets) and for model loading and generation.

Migration Steps

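Users on Windows with hybrid graphics should verify that the integrated and discrete GPUs are now classified correctly under Vulkan, since the previous classification was inverted.
Users relying on specific mmproj offload behavior may need to re-evaluate memory settings, since mmproj offload is now sized by projector memory.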
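One way to re-check memory placement after upgrading is to compare the total and VRAM sizes that the server reports for each loaded model. The following is a minimal sketch, not part of the release: it assumes a local server at the default address and reads the `size` and `size_vram` fields of the `/api/ps` response. The helper names `offload_split` and `fetch_running_models` are hypothetical.

```python
import json
import urllib.request

# Assumption: the server listens on the default local address.
OLLAMA_PS_URL = "http://localhost:11434/api/ps"


def offload_split(entry):
    """Return (gpu_bytes, cpu_bytes, gpu_percent) for one /api/ps model entry."""
    total = int(entry.get("size", 0))
    vram = int(entry.get("size_vram", 0))
    vram = min(vram, total)  # guard against a VRAM figure larger than the total
    cpu = total - vram
    percent = round(100 * vram / total) if total else 0
    return vram, cpu, percent


def fetch_running_models(url=OLLAMA_PS_URL):
    """Fetch the list of currently loaded models from a running server."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp).get("models", [])
```

With a model loaded, iterating over `fetch_running_models()` and passing each entry to `offload_split` gives the GPU share before and after the upgrade, which makes a change in projector sizing or in the partial-offload accounting visible.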

✨ New Features

Added thinking capability detection for opencode via the launch command.
Enabled auto-installation of Claude Code via the launch command.
Enabled auto-installation of opencode, when it is missing, via the launch command.
Added sm_86 architecture to cuda_v13_windows preset for llama.
Set default values for qwen2.5vl window attention metadata.
Aligned server generate endpoint with native chat templates.

🐛 Bug Fixes

Fixed inverted iGPU/dGPU Vulkan classification on Windows hybrid graphics.
Unified and tuned speculative decoding in mlxrunner.
Added detection of model drift when the Codex App UI switches.
Fixed sizing of mmproj offload by projector memory.
Preserved generation headroom for shifted prompts.
Fixed ollama ps double-counting mmap'd weights on partial offload.
Updated mlx and fixed CUDA JIT packaging.
Added compute capability 8.7 (CC 87) support for CUDA v13 on Jetson.
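The `sm_86` and CC 87 entries in these notes use the same NVIDIA naming scheme: `sm_XY` is the architecture name for compute capability X.Y. The sketch below only illustrates that mapping. The helper names are hypothetical, and the example architecture list is illustrative rather than the actual contents of any preset.

```python
def arch_name(major, minor):
    """Format a compute capability such as (8, 6) as an architecture name."""
    return f"sm_{major}{minor}"


def parse_arch(name):
    """Split a plain name such as 'sm_87' into (major, minor).

    The last digit is the minor version and the digits before it are the major.
    Suffixed variants (for example a trailing letter) are not handled here.
    """
    digits = name.removeprefix("sm_")
    if not digits.isdigit() or len(digits) < 2:
        raise ValueError(f"unsupported architecture name: {name!r}")
    return int(digits[:-1]), int(digits[-1])


def is_listed(major, minor, arch_list):
    """Check whether a compute capability appears in a list of architecture names."""
    return arch_name(major, minor) in arch_list


# Illustrative list only; not the real preset contents.
example_archs = ["sm_86", "sm_87"]
```

For example, `is_listed(8, 7, example_archs)` is true for a device with compute capability 8.7, and `parse_arch("sm_86")` gives `(8, 6)`.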

Affected Symbols

opencode, Claude Code, Vulkan, mlxrunner, Codex App UI, llama, mmproj, ollama ps, mlx, server generate, jetson