v0.15.5
📦 ollama
Summary
This release introduces two powerful new models, GLM-OCR and Qwen3-Coder-Next, and significantly enhances `ollama launch` with argument passing and sub-agent support. It also chooses default context lengths dynamically based on available VRAM.
✨ New Features
- Introduction of GLM-OCR, a multimodal OCR model for complex document understanding.
- Introduction of Qwen3-Coder-Next, a coding-focused language model optimized for agentic coding workflows.
- The `ollama launch` command now accepts arguments, e.g., `ollama launch claude -- --resume`.
- Sub-agent support is now enabled for `ollama launch` for tasks like planning and deep research.
- Ollama now automatically sets context limits for certain models when using `ollama launch opencode`.
- Ollama now chooses a default context length based on available VRAM: 4K for under 24 GiB, 32K for 24–48 GiB, and 262K for 48 GiB or more.
- GLM-4.7-Flash model is now supported on Ollama's experimental MLX engine.
- `ollama signin` now opens a browser window to facilitate the sign-in process.
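The VRAM-based default described above can be sketched as a simple tiering function. This is an illustrative sketch only; the function name and exact threshold handling are assumptions, not Ollama's internal implementation:

```python
def default_context_length(vram_gib: float) -> int:
    """Hypothetical helper: map available VRAM (in GiB) to a default
    context length, following the tiers described in the release notes."""
    if vram_gib < 24:
        return 4_096      # 4K context for under 24 GiB
    if vram_gib < 48:
        return 32_768     # 32K context for 24-48 GiB
    return 262_144        # 262K context for 48 GiB or more
```

For example, a 16 GiB GPU would fall in the 4K tier, while a 48 GiB GPU would get the full 262K context under this scheme.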
🐛 Bug Fixes
- Fixed an off-by-one error when using `num_predict` in the API.
- Resolved an issue where tokens from a preceding sequence were incorrectly returned when `num_predict` was reached.