b8714
📦 llama-cpp
🐛 1 fix · 🔧 1 symbol
Summary
This release extends the KV cache quantization checks so they account for flash attention being explicitly enabled, not only set to auto mode. It also ships pre-compiled binaries for a range of operating systems and hardware configurations.
🐛 Bug Fixes
- Extended the KV cache quantization checks to cover explicitly enabled flash attention, not just auto mode (see the sketch below).
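
A minimal C++ sketch of the idea behind this fix, not the library's actual code: all type and function names below (`flash_attn_mode`, `cache_type`, `kv_quant_check`) are hypothetical stand-ins. The assumption is that a quantized KV cache requires flash attention, so the validation must accept both auto mode and an explicitly enabled flash attention, rather than auto alone.

```cpp
#include <cstdio>

// Hypothetical stand-ins for the real llama.cpp settings;
// names are illustrative, not the library's identifiers.
enum class flash_attn_mode { disabled, enabled, autodetect };
enum class cache_type      { f16, q8_0, q4_0 };

static bool is_quantized(cache_type t) {
    return t != cache_type::f16;
}

// Sketch of the extended check: a quantized KV cache is only valid
// when flash attention can actually be used, so the check must pass
// for both auto mode and explicitly enabled flash attention.
static bool kv_quant_check(cache_type type_k, cache_type type_v,
                           flash_attn_mode fa) {
    const bool kv_quantized = is_quantized(type_k) || is_quantized(type_v);
    if (!kv_quantized) {
        return true; // an f16 cache works with or without flash attention
    }
    // Conceptually, before the fix only the autodetect branch passed here,
    // so explicitly enabled flash attention was wrongly rejected.
    return fa == flash_attn_mode::autodetect ||
           fa == flash_attn_mode::enabled;
}

int main() {
    // A quantized V cache with flash attention explicitly enabled
    // should now be accepted by the check.
    const bool ok = kv_quant_check(cache_type::f16, cache_type::q8_0,
                                   flash_attn_mode::enabled);
    std::printf("quantized KV cache + flash attention enabled: %s\n",
                ok ? "accepted" : "rejected");
    return 0;
}
```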