v0.19.0
📦 ollama
Summary
This release introduces improvements to KV cache handling, adds a web search plugin to `ollama launch pi`, and resolves several model loading and parsing bugs across different architectures.
✨ New Features
- Ollama's app no longer incorrectly shows "model is out of date".
- `ollama launch pi` now includes a web search plugin that uses Ollama's web search.
- Improved KV cache hit rate when using the Anthropic-compatible API.
- MLX runner now creates periodic snapshots during prompt processing.
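The release notes do not describe how the cache or the snapshots are implemented. As a rough illustration only, the sketch below shows why periodic snapshots can raise the KV cache hit rate: a new prompt can reuse cached state only up to the last snapshot that lies inside the prefix it shares with an earlier prompt. The function names, the fixed `interval` policy, and the token lists are all hypothetical and are not taken from Ollama or the MLX runner.

```python
from typing import List, Sequence


def common_prefix_len(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of leading tokens that two token sequences share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def snapshot_points(prompt_len: int, interval: int) -> List[int]:
    """Token counts at which a snapshot is taken (hypothetical fixed-interval policy)."""
    return list(range(interval, prompt_len + 1, interval))


def reusable_tokens(cached: Sequence[int], new: Sequence[int], interval: int) -> int:
    """Tokens of `new` that can skip prompt processing by restoring a snapshot of `cached`."""
    shared = common_prefix_len(cached, new)
    # Only snapshots that fall entirely inside the shared prefix are usable.
    usable = [p for p in snapshot_points(len(cached), interval) if p <= shared]
    return usable[-1] if usable else 0


cached_prompt = list(range(10))               # tokens of an earlier request
new_prompt = [0, 1, 2, 3, 4, 5, 6, 99, 100]   # diverges after 7 shared tokens

# Snapshots every 4 tokens: the snapshot at token 4 can be restored.
print(reusable_tokens(cached_prompt, new_prompt, interval=4))
# A single snapshot at the end (token 10) lies past the divergence point: nothing is reused.
print(reusable_tokens(cached_prompt, new_prompt, interval=10))
```

In this toy model a shorter interval recovers more of the shared prefix at the cost of storing more snapshots. The trade-off actually chosen by the MLX runner is not stated in these notes.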
🐛 Bug Fixes
- Fixed tool call parsing issue with Qwen3.5 where tool calls were incorrectly output during thinking.
- Fixed KV cache snapshot memory leak in the MLX runner.
- Fixed issue where flash attention was incorrectly enabled for `grok` models.
- Fixed issue preventing `qwen3-next:80b` from loading in Ollama.