v0.20.4
📦 ollama
Summary
This release improves M5 performance by using NAX in the mlx backend and enables flash attention for gemma4 models.
✨ New Features
- Improve M5 performance with NAX in mlx backend.
- Enable flash attention for gemma4 models.