b8168
📦 llama-cpp
🐛 1 fix
Summary
This release fixes a bug affecting fp16 Flash Attention on Windows systems with AMD RDNA2 or older GPUs when using the Vulkan backend. It also ships updated pre-built binaries for multiple operating systems and hardware targets.
🐛 Bug Fixes
- Fixed fp16 Flash Attention on Windows systems with AMD RDNA2 or older GPUs via the Vulkan backend (a quick verification sketch follows below).
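For users on affected hardware, one way to exercise the fixed code path is to run a Vulkan build with GPU offload and flash attention enabled. The sketch below is illustrative and not part of the release notes: the model path is a placeholder, and the exact `--flash-attn` syntax has varied across llama.cpp releases.

```sh
# Hedged smoke test, assuming a Vulkan build of llama.cpp.
# Build with the Vulkan backend (requires the Vulkan SDK):
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release

# Run with layers offloaded to the GPU and flash attention enabled.
# model.gguf is a placeholder; the flash-attention flag syntax has
# varied across releases (a plain -fa switch in older builds,
# on/off/auto values in newer ones).
./build/bin/llama-cli -m model.gguf -ngl 99 --flash-attn on -p "Hello"
```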