b9847
📦 llama-cpp
🐛 2 fixes 🔧 1 symbol
Summary
This release fixes a critical bug in Gemma E4B MTP FlashAttention on CUDA and removes an unused template declaration from the CUDA implementation. Pre-compiled binaries are also provided for various operating systems and hardware configurations.
🐛 Bug Fixes
- Fixed Gemma E4B MTP FlashAttention on CUDA.
- Removed an unused template declaration in the CUDA implementation.