v0.12.0
📦 ollama · View on GitHub →
✨ 3 features · 🐛 3 fixes
Summary
This release introduces cloud models in preview, which run larger models on fast, datacenter-grade hardware, and adds native support for the Bert and Qwen 3 architectures in Ollama's engine.
Migration Steps
- To run a cloud model, use the command: 'ollama run qwen3-coder:480b-cloud'.
✨ New Features
- Cloud models are now available in preview, letting you run larger models on fast, datacenter-grade hardware.
- Models utilizing the Bert architecture can now run on Ollama's engine.
- Models utilizing the Qwen 3 architecture can now run on Ollama's engine.
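The new architecture support above means Bert-family embedding models can be served through Ollama's engine. As a minimal sketch, the request below targets the documented /api/embed endpoint; the model name (`nomic-embed-text`, a Bert-family embedding model) and the default local port are assumptions, not part of this release note.

```python
import json

# Sketch: an embeddings request for a Bert-architecture model now
# supported natively by Ollama's engine. Assumes a local Ollama server
# on the default port 11434 and that a Bert-family model has already
# been pulled; both are assumptions for illustration.
payload = {
    "model": "nomic-embed-text",  # example Bert-family model (assumption)
    "input": "Ollama now runs Bert and Qwen 3 architectures natively.",
}
body = json.dumps(payload)

# To actually send it (requires a running server):
#   curl http://localhost:11434/api/embed -d "$body"
print(body)
```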
🐛 Bug Fixes
- Resolved an issue where older NVIDIA GPUs were not detected when newer drivers were installed.
- Fixed an issue where models were not imported correctly when using 'ollama create'.
- Ollama now skips parsing the initial '<think>' tag if present in the prompt for the /api/generate endpoint.
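The '&lt;think&gt;' fix above can be illustrated with a request body for the /api/generate endpoint. The endpoint and its `model`, `prompt`, and `stream` fields come from Ollama's documented API; the model name is a placeholder assumption.

```python
import json

# Sketch: a /api/generate request whose prompt begins with a literal
# '<think>' tag. As of this release, Ollama skips parsing that leading
# tag instead of misreading it as the start of the model's own thinking
# output.
payload = {
    "model": "qwen3",  # placeholder model name (assumption)
    "prompt": "<think>reason step by step</think> What is 2 + 2?",
    "stream": False,
}
body = json.dumps(payload)

# To actually send it (requires a running server):
#   curl http://localhost:11434/api/generate -d "$body"
print(body)
```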