v0.12.0
📦 ollama · View on GitHub →
✨ 3 features · 🐛 3 fixes
Summary
This release introduces cloud models in preview, which run larger models on fast, datacenter-grade hardware, and adds native support for the Bert and Qwen 3 architectures in Ollama's engine.
Migration Steps
- To run a cloud model, use the command: 'ollama run qwen3-coder:480b-cloud'.
✨ New Features
- Cloud models are now available in preview, letting you run larger models on fast, datacenter-grade hardware.
- Models utilizing the Bert architecture can now run on Ollama's engine.
- Models utilizing the Qwen 3 architecture can now run on Ollama's engine.
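The new architecture support above means Bert-family embedding models can be served through Ollama's engine. As a minimal sketch, the request below targets the documented /api/embed endpoint; the model name (`nomic-embed-text`, a Bert-family embedding model) and the default local port are assumptions, not part of this release note.

```python
import json

# Sketch: an embeddings request for a Bert-architecture model now
# supported natively by Ollama's engine. Assumes a local Ollama server
# on the default port 11434 and that a Bert-family model has already
# been pulled; both are assumptions for illustration.
payload = {
    "model": "nomic-embed-text",  # example Bert-family model (assumption)
    "input": "Ollama now runs Bert and Qwen 3 architectures natively.",
}
body = json.dumps(payload)

# To actually send it (requires a running server):
#   curl http://localhost:11434/api/embed -d "$body"
print(body)
```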
🐛 Bug Fixes
- Resolved an issue where older NVIDIA GPUs were not detected when newer drivers were installed.
- Fixed an issue where models were not imported correctly when using 'ollama create'.
- Ollama now skips parsing the initial '<think>' tag if present in the prompt for the /api/generate endpoint.
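The '&lt;think&gt;' fix above can be illustrated with a request body for the /api/generate endpoint. The endpoint and its `model`, `prompt`, and `stream` fields come from Ollama's documented API; the model name is a placeholder assumption.

```python
import json

# Sketch: a /api/generate request whose prompt begins with a literal
# '<think>' tag. As of this release, Ollama skips parsing that leading
# tag instead of misreading it as the start of the model's own thinking
# output.
payload = {
    "model": "qwen3",  # placeholder model name (assumption)
    "prompt": "<think>reason step by step</think> What is 2 + 2?",
    "stream": False,
}
body = json.dumps(payload)

# To actually send it (requires a running server):
#   curl http://localhost:11434/api/generate -d "$body"
print(body)
```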