
v0.13.2

📦 ollama · View on GitHub →
✨ 2 features · 🐛 2 fixes · 🔧 5 symbols

Summary

This release introduces support for the Qwen3-Next model series and enables Flash Attention by default for vision models. It also includes critical fixes for multi-GPU CUDA detection and DeepSeek-v3.1 thinking behavior.

✨ New Features

  • Added support for the Qwen3-Next model series (see the sketch after this list).
  • Flash Attention is now enabled by default for vision models (mistral-3, gemma3, qwen3-vl, etc.) to improve memory utilization and performance.
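
A minimal sketch of trying the new model support from Python, using the ollama client library. The `qwen3-next` tag is an assumption, not confirmed by these notes; check `ollama list` or the model library for the exact name. Flash Attention for vision models is now a server-side default; historically it could also be toggled with the `OLLAMA_FLASH_ATTENTION` environment variable when starting `ollama serve`.

```python
# pip install ollama
import ollama

# "qwen3-next" is a hypothetical tag for the new model series;
# confirm the exact name with `ollama list` or the model library.
response = ollama.chat(
    model="qwen3-next",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response["message"]["content"])
```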

🐛 Bug Fixes

  • Fixed GPU detection on multi-GPU CUDA machines.
  • Fixed an issue where deepseek-v3.1 would continue emitting thinking output even when thinking was disabled in the app (see the sketch below).
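
A minimal sketch of the corrected thinking toggle, assuming the `deepseek-v3.1` tag and the `think` parameter exposed by recent versions of the ollama Python client for thinking-capable models. With this fix, disabling thinking should actually suppress the model's reasoning output.

```python
import ollama

# "deepseek-v3.1" is the model named in the fix; the `think` flag is
# available in recent ollama client versions for thinking-capable models.
response = ollama.chat(
    model="deepseek-v3.1",
    messages=[{"role": "user", "content": "What is 17 * 24?"}],
    think=False,  # after this fix, no thinking output should be produced
)
print(response["message"]["content"])  # answer only, no reasoning trace
```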

🔧 Affected Symbols

mistral-3 · gemma3 · qwen3-vl · deepseek-v3.1 · CUDA GPU detection