v0.20.4-rc1
📦 ollama
✨ 2 features · 🐛 2 fixes · 🔧 3 symbols
Summary
This release improves M5 performance in the MLX backend using NAX and enables flash attention for Gemma4 models. It also cleans up experimental paths in model creation and fixes creating a model from an existing safetensor model.
✨ New Features
- Improve M5 performance with NAX for the MLX backend.
- Enable flash attention for Gemma4 models.
🐛 Bug Fixes
- Clean up experimental paths during model creation.
- Fix model creation when starting from an existing safetensor model.