Change8

b9847

📦 llama-cpp
🐛 2 fixes · 🔧 1 symbol

Summary

This release fixes a critical bug in Gemma E4B MTP FlashAttention on CUDA and removes an unused template declaration from the CUDA implementation. It also provides pre-compiled binaries for a range of operating systems and hardware configurations.

🐛 Bug Fixes

  • Fixed Gemma E4B MTP FlashAttention on CUDA.
  • Removed unused template declaration in CUDA implementation.

Affected Symbols