v0.13.2
📦 ollama
✨ 2 features · 🐛 2 fixes · 🔧 5 symbols
Summary
This release introduces support for the Qwen3-Next model series and enables Flash Attention by default for vision models. It also includes critical fixes for multi-GPU CUDA detection and DeepSeek-v3.1 thinking behavior.
✨ New Features
- Added support for Qwen3-Next model series.
- Flash attention is now enabled by default for vision models (mistral-3, gemma3, qwen3-vl, etc.) to improve memory utilization and performance; see the usage sketch after this list.
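Flash attention is a server-side default, so existing clients benefit without any request changes. As a minimal sketch (assuming a local Ollama server on the default port 11434, the standard `/api/chat` endpoint, and a pulled `qwen3-vl` model; the image path is a placeholder), a vision request to one of the affected models might look like this:

```python
import base64
import requests

# Read and base64-encode an image for the vision model.
# "photo.png" is a placeholder path.
with open("photo.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("ascii")

# Ollama's chat endpoint. Flash attention now applies by default
# on the server side for vision models like qwen3-vl, so no extra
# request parameter is needed here.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "qwen3-vl",
        "messages": [
            {
                "role": "user",
                "content": "Describe this image.",
                "images": [image_b64],
            }
        ],
        "stream": False,
    },
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```

Ollama has historically exposed an `OLLAMA_FLASH_ATTENTION` environment variable on the server; whether it can override the new vision-model default is not stated in these notes.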
🐛 Bug Fixes
- Fixed GPU detection on multi-GPU CUDA machines.
- Fixed an issue where deepseek-v3.1 would continue emitting thinking output even when thinking was disabled in the app; see the API sketch after this list.
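For the deepseek-v3.1 fix, the relevant control in the API is the `think` field on `/api/chat`, available in recent Ollama versions. A minimal sketch, assuming a local server with deepseek-v3.1 pulled; with this release, setting the field to false should actually suppress thinking output:

```python
import requests

# With this release's fix, "think": False should stop deepseek-v3.1
# from emitting thinking content.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "deepseek-v3.1",
        "messages": [{"role": "user", "content": "What is 17 * 24?"}],
        "think": False,  # disable thinking for this request
        "stream": False,
    },
)
resp.raise_for_status()
msg = resp.json()["message"]
print(msg["content"])
# Before the fix, thinking content could still appear even when
# disabled; after the fix it should be absent or empty.
print(msg.get("thinking"))
```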
🔧 Affected Symbols
- mistral-3
- gemma3
- qwen3-vl
- deepseek-v3.1
- CUDA GPU detection