v0.12.10
📦 ollama
✨ 5 features🐛 3 fixes🔧 7 symbols
Summary
This release enables embedding model support via the CLI, adds tool call IDs to the chat API, and improves Vulkan performance and hardware detection.
✨ New Features
- ollama run now supports embedding models for generating vector embeddings from text or stdin.
- The /api/chat API now returns tool call IDs.
- Flash attention support enabled for Vulkan (requires building from source).
- Added Vulkan memory detection for Intel GPUs using DXGI+PDH.
- Interactive mode now displays login instructions when switching to cloud models.
🐛 Bug Fixes
- Fixed errors when running qwen3-vl:235b and qwen3-vl:235b-instruct models.
- Fixed application hanging issues caused by CPU discovery.
- Fixed issues with reading stale VRAM data.
🔧 Affected Symbols
ollama run/api/chatqwen3-vl:235bqwen3-vl:235b-instructVulkanDXGIPDH