v0.19.0

📦 ollama
4 features · 🐛 4 fixes · 🔧 3 symbols

Summary

This release improves KV cache handling, adds a web search plugin to `ollama launch pi`, and fixes several model loading and parsing bugs across architectures.

✨ New Features

  • Ollama's app no longer incorrectly shows "model is out of date".
  • `ollama launch pi` now includes a web search plugin that uses Ollama's web search.
  • Improved the KV cache hit rate when using the Anthropic-compatible API.
  • MLX runner now creates periodic snapshots during prompt processing.
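
To illustrate why KV cache hit rate matters, here is a minimal sketch of prefix reuse. It is not Ollama's implementation: the function names and token IDs are hypothetical, and it assumes the common design in which a cache can only reuse the leading tokens that an incoming prompt shares with a previously processed one.

```python
from typing import Sequence


def shared_prefix_len(cached: Sequence[int], incoming: Sequence[int]) -> int:
    """Return how many leading tokens the two sequences have in common."""
    n = 0
    for a, b in zip(cached, incoming):
        if a != b:
            break
        n += 1
    return n


def cache_hit_rate(cached: Sequence[int], incoming: Sequence[int]) -> float:
    """Fraction of the incoming prompt that a prefix cache could reuse."""
    if not incoming:
        return 0.0
    return shared_prefix_len(cached, incoming) / len(incoming)


# Hypothetical token IDs: the second request extends the first one.
first_turn = [101, 7, 42, 42, 9]
second_turn = first_turn + [13, 55, 2]

# Same conversation, but with a different first token (for example, a
# changed header at the start of the prompt).
changed_start = [999] + second_turn[1:]

if __name__ == "__main__":
    print(cache_hit_rate(first_turn, second_turn))
    print(cache_hit_rate(first_turn, changed_start))
```

In this toy model, a request that extends an earlier prompt reuses all of the earlier tokens, while a request that differs at the very first token reuses none, so keeping the start of the prompt stable between requests is what raises the hit rate.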

🐛 Bug Fixes

  • Fixed a tool call parsing issue with Qwen3.5 where tool calls were incorrectly output during thinking.
  • Fixed a KV cache snapshot memory leak in the MLX runner.
  • Fixed an issue where flash attention was incorrectly enabled for `grok` models.
  • Fixed an issue that prevented `qwen3-next:80b` from loading in Ollama.

Affected Symbols