v0.11.11

Breaking Changes

📅 Sep 11, 2025 · 📦 ollama

⚠ 1 breaking · ✨ 6 features · 🐛 5 fixes · 🔧 6 symbols

Summary

This release adds CUDA 13 support, introduces a dimensions field for embeddings, and improves memory estimation and app UI. It also removes support for loading split vision models in the Ollama engine.

⚠️ Breaking Changes

Split vision models are no longer supported in the Ollama engine.

Migration Steps

If you use split vision models, move to a supported model format. Split vision models no longer load in the Ollama engine.

✨ New Features

Added support for CUDA 13.
Added 'dimensions' field to embedding requests.
Added zoom in and zoom out (Cmd +/-) in the Ollama app.
Added ability to copy assistant messages in the Ollama app.
Enabled new memory estimates in the Ollama engine by default.
Improved scrolling performance in the Ollama app for long prompts.
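
As a rough illustration of the new 'dimensions' field, the sketch below builds an embedding request body and posts it to a local server. It assumes the field is sent as a top-level integer next to `model` and `input` on the `/api/embed` endpoint at the default local address. The helper names `build_embed_request` and `embed` and the model name in the usage note are placeholders, not part of these release notes.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default address.
OLLAMA_EMBED_URL = "http://localhost:11434/api/embed"


def build_embed_request(model, text, dimensions=None):
    """Build the JSON body for an embedding request (hypothetical helper).

    The 'dimensions' key is only included when a value is given, so the
    request falls back to the model's native embedding size otherwise.
    """
    payload = {"model": model, "input": text}
    if dimensions is not None:
        if dimensions <= 0:
            raise ValueError("dimensions must be a positive integer")
        payload["dimensions"] = dimensions
    return payload


def embed(model, text, dimensions=None, url=OLLAMA_EMBED_URL):
    """Send the request and return the list of embedding vectors."""
    body = json.dumps(build_embed_request(model, text, dimensions)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["embeddings"]
```

With a server running, a call such as `embed("embeddinggemma", "hello world", dimensions=256)` would be expected to return vectors of the requested length, provided the chosen model supports reduced output sizes. The model name here is only an example.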

🐛 Bug Fixes

Fixed error when importing safetensors files.
Fixed error occurring when batch size exceeded context length.
Fixed validation issues for Flash Attention and KV cache quantization.
Improved memory usage for gpt-oss in the Ollama app.
Improved memory estimates for hybrid and recurrent models.
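
The Flash Attention and KV cache quantization fix concerns how these two settings are checked against each other. The sketch below is not Ollama's implementation. It only illustrates the kind of check involved, assuming that the settings come from the `OLLAMA_FLASH_ATTENTION` and `OLLAMA_KV_CACHE_TYPE` environment variables, that the cache types are `f16`, `q8_0`, and `q4_0`, and that a quantized cache requires Flash Attention. The function name and the fallback behavior are hypothetical.

```python
import os

# Assumption: the KV cache types a server would accept.
VALID_KV_CACHE_TYPES = ("f16", "q8_0", "q4_0")


def effective_kv_cache_type(env=None):
    """Return the KV cache type that would take effect (hypothetical check).

    A quantized cache type is only kept when Flash Attention is enabled;
    otherwise this sketch falls back to the unquantized 'f16' type.
    """
    env = os.environ if env is None else env
    flash_attention = env.get("OLLAMA_FLASH_ATTENTION", "0").lower() in ("1", "true")
    cache_type = env.get("OLLAMA_KV_CACHE_TYPE", "f16")

    if cache_type not in VALID_KV_CACHE_TYPES:
        raise ValueError(f"unsupported KV cache type: {cache_type!r}")
    if cache_type != "f16" and not flash_attention:
        # Quantized cache requested without Flash Attention: fall back.
        return "f16"
    return cache_type
```

Passing a plain dictionary instead of reading `os.environ` makes it easy to try combinations, for example `effective_kv_cache_type({"OLLAMA_FLASH_ATTENTION": "1", "OLLAMA_KV_CACHE_TYPE": "q8_0"})`.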

🔧 Affected Symbols

embed, dimensions, safetensor, FlashAttention, KVCache, gpt-oss