v0.12.10

📅 Nov 5, 2025📦 ollama

✨ 5 features🐛 3 fixes🔧 7 symbols

Summary

This release enables embedding model support via the CLI, adds tool call IDs to the chat API, and improves Vulkan performance and hardware detection.

✨ New Features

ollama run now supports embedding models for generating vector embeddings from text or stdin.
The /api/chat API now returns tool call IDs.
Flash attention support enabled for Vulkan (requires building from source).
Added Vulkan memory detection for Intel GPUs using DXGI+PDH.
Interactive mode now displays login instructions when switching to cloud models.

🐛 Bug Fixes

Fixed errors when running qwen3-vl:235b and qwen3-vl:235b-instruct models.
Fixed application hanging issues caused by CPU discovery.
Fixed issues with reading stale VRAM data.

🔧 Affected Symbols

ollama run/api/chatqwen3-vl:235bqwen3-vl:235b-instructVulkanDXGIPDH