v0.15.5
📦 ollama
Summary
This release introduces two powerful new models, GLM-OCR and Qwen3-Coder-Next, and significantly enhances `ollama launch` with argument passing and sub-agent support. It also chooses default context lengths dynamically based on available VRAM.
✨ New Features
- Introduction of GLM-OCR, a multimodal OCR model for complex document understanding.
- Introduction of Qwen3-Coder-Next, a coding-focused language model optimized for agentic coding workflows.
- The `ollama launch` command now accepts arguments, e.g., `ollama launch claude -- --resume`.
- Sub-agent support is now enabled for `ollama launch` for tasks like planning and deep research.
- Ollama now automatically sets context limits for certain models when using `ollama launch opencode`.
- Ollama now chooses a default context length based on available VRAM: 4K for under 24 GiB, 32K for 24–48 GiB, and 262K for 48 GiB or more.
- GLM-4.7-Flash model is now supported on Ollama's experimental MLX engine.
- `ollama signin` now opens a browser window to facilitate the sign-in process.
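The VRAM-based default described above can be sketched as a simple tiering function. This is an illustrative sketch only; the function name and exact threshold handling are assumptions, not Ollama's internal implementation:

```python
def default_context_length(vram_gib: float) -> int:
    """Hypothetical helper: map available VRAM (in GiB) to a default
    context length, following the tiers described in the release notes."""
    if vram_gib < 24:
        return 4_096      # 4K context for under 24 GiB
    if vram_gib < 48:
        return 32_768     # 32K context for 24-48 GiB
    return 262_144        # 262K context for 48 GiB or more
```

For example, a 16 GiB GPU would fall in the 4K tier, while a 48 GiB GPU would get the full 262K context under this scheme.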
🐛 Bug Fixes
- Fixed an off-by-one error when using `num_predict` in the API.
- Resolved an issue where tokens from a preceding sequence were incorrectly returned when `num_predict` was reached.