b9106
📦 llama-cpp
Summary
This release adds support for asymmetric fused attention to the Vulkan backend's scalar, mmq, and coopmat1 code paths. It also ships a comprehensive set of pre-built binaries covering numerous platforms and hardware configurations.
✨ New Features
- The Vulkan backend now supports asymmetric fused attention (FA) in the scalar, quantized matrix multiplication (mmq), and cooperative-matrix (coopmat1) code paths.
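
To exercise the new Vulkan FA paths from source rather than the pre-built binaries, a typical build looks like the following. This is a sketch assuming the standard CMake workflow and the `GGML_VULKAN` build option; consult the repository's build documentation for platform-specific prerequisites:

```shell
# Build llama.cpp with the Vulkan backend enabled.
# Assumes the Vulkan SDK (headers and glslc shader compiler) is installed.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release
```

Which FA path (scalar, mmq, or coopmat1/coopmat2) is used at runtime depends on the capabilities reported by the Vulkan driver and device.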