v0.20.4-rc1

📅 Apr 7, 2026 📦 ollama

✨ 2 features, 🐛 2 fixes, 🔧 3 symbols

Summary

This release improves MLX backend performance on M5 using NAX and enables flash attention for Gemma4 models. It also cleans up experimental paths during model creation and fixes model creation from an existing safetensor model.

✨ New Features

Improve M5 performance with NAX for the MLX backend.
Enable flash attention for Gemma4 models.

🐛 Bug Fixes

Clean up experimental paths during model creation.
Fix model creation when starting from an existing safetensor model.

Affected Symbols

mlx, gemma4, create