b8168
📦 llama-cpp
🐛 1 fix
Summary
This release fixes a bug affecting fp16 Flash Attention on Windows systems with AMD RDNA2 or older GPUs when using the Vulkan backend. It also ships updated pre-built binaries for multiple operating systems and hardware targets.
🐛 Bug Fixes
- Fixed fp16 Flash Attention on Windows systems with AMD RDNA2 or older GPUs via the Vulkan backend (a quick verification sketch follows below).
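For users on affected hardware, one way to exercise the fixed code path is to run a Vulkan build with GPU offload and flash attention enabled. The sketch below is illustrative and not part of the release notes: the model path is a placeholder, and the exact `--flash-attn` syntax has varied across llama.cpp releases.

```sh
# Hedged smoke test, assuming a Vulkan build of llama.cpp.
# Build with the Vulkan backend (requires the Vulkan SDK):
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release

# Run with layers offloaded to the GPU and flash attention enabled.
# model.gguf is a placeholder; the flash-attention flag syntax has
# varied across releases (a plain -fa switch in older builds,
# on/off/auto values in newer ones).
./build/bin/llama-cli -m model.gguf -ngl 99 --flash-attn on -p "Hello"
```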