
v0.7.0

ollama: 1 breaking change, 5 features, 6 fixes, 6 affected symbols

Summary

Ollama v0.7.0 introduces a new multimodal engine supporting vision models like Llama 4 and Gemma 3, along with WebP support and various performance improvements and bug fixes.

⚠️ Breaking Changes

  • The API now returns 405 Method Not Allowed instead of 404 Not Found when an endpoint is called with an HTTP method it does not support.

Migration Steps

  1. Update Ollama to v0.7.0 to access the new multimodal engine.
  2. If you call the API, update error handling to expect 405 Method Not Allowed, not 404, when a request uses an unsupported HTTP method.
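
As a minimal sketch of the second step, a client could classify status codes so that 405 gets its own branch rather than falling through to a generic error. The function name and the category strings are hypothetical, chosen for illustration; they are not part of Ollama or any client library.

```python
from http import HTTPStatus


def describe_api_error(status_code: int) -> str:
    """Map an HTTP status code from an API call to a short category.

    Hypothetical helper for illustration only.
    """
    if 200 <= status_code < 300:
        return "ok"
    if status_code == HTTPStatus.METHOD_NOT_ALLOWED:
        # 405: what v0.7.0 returns for an unsupported HTTP method.
        return "method_not_allowed"
    if status_code == HTTPStatus.NOT_FOUND:
        # 404: releases before v0.7.0 also used this for unsupported
        # methods, so callers may want to keep handling it.
        return "not_found"
    return "other_error"
```

Keeping the 404 branch lets the same client run against servers on either side of the upgrade.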

✨ New Features

  • Support for multimodal models via a new engine.
  • Support for Meta Llama 4, Google Gemma 3, Qwen 2.5 VL, and Mistral Small 3.1.
  • Support for WebP images as input to multimodal models.
  • Improved performance of importing safetensors models via 'ollama create'.
  • Improved prompt processing speeds for Qwen3 MoE on macOS.
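
To illustrate WebP input, the sketch below builds a request body for the `/api/generate` endpoint, which accepts base64-encoded images in an `images` array. The helper name, the `gemma3` model tag, and the file name are assumptions for the example; use whichever multimodal model you have pulled.

```python
import base64
import json


def build_generate_payload(model: str, prompt: str, image_bytes: bytes) -> str:
    """Return a JSON request body with one base64-encoded image attached.

    Hypothetical helper for illustration only.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        # The API expects each image as a base64 string.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for a single JSON response instead of a stream.
        "stream": False,
    }
    return json.dumps(payload)


# Example usage (assumes a local file named photo.webp):
# with open("photo.webp", "rb") as f:
#     body = build_generate_payload("gemma3", "Describe this image.", f.read())
```

The resulting string can be sent as the body of a POST request to a running Ollama server, typically at http://localhost:11434/api/generate.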

🐛 Bug Fixes

  • Fixed blank terminal window appearing on Windows when running models.
  • Fixed error when running llama4 on NVIDIA GPUs.
  • Reduced log level of 'key not found' message.
  • Ollama now correctly removes quotes from image paths in 'ollama run'.
  • Fixed error when providing large JSON schemas in structured output requests.
  • Fixed issue where processes would continue to run after a model was unloaded.

🔧 Affected Symbols

  • ollama run
  • ollama create
  • llama4
  • Qwen3 MoE
  • structured output
  • API