Change8

v0.12.10

📦 ollama
5 features🐛 3 fixes🔧 7 symbols

Summary

This release enables embedding model support via the CLI, adds tool call IDs to the chat API, and improves Vulkan performance and hardware detection.

✨ New Features

  • ollama run now supports embedding models for generating vector embeddings from text or stdin.
  • The /api/chat API now returns tool call IDs.
  • Flash attention support enabled for Vulkan (requires building from source).
  • Added Vulkan memory detection for Intel GPUs using DXGI+PDH.
  • Interactive mode now displays login instructions when switching to cloud models.

🐛 Bug Fixes

  • Fixed errors when running qwen3-vl:235b and qwen3-vl:235b-instruct models.
  • Fixed application hanging issues caused by CPU discovery.
  • Fixed issues with reading stale VRAM data.

🔧 Affected Symbols

ollama run/api/chatqwen3-vl:235bqwen3-vl:235b-instructVulkanDXGIPDH