b8971
📦 llama-cpp — View on GitHub →
🐛 2 fixes · 🔧 1 symbol
Summary
This release focuses on bug fixes in the ggml-webgpu backend, covering FlashAttention support detection and kv_tile fitting. It also ships pre-compiled binaries for a range of operating systems and hardware configurations.
🐛 Bug Fixes
- Fixed FlashAttention support check in ggml-webgpu for devices lacking subgroup support.
- Set the path to none in ggml-webgpu when the kv_tile does not fit (see the illustrative sketch after this list).
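For illustration only, here is a minimal C++ sketch of the general idea behind both fixes: gate FlashAttention on device subgroup support and select no path when the kv_tile would not fit in workgroup memory. All names (`webgpu_device_caps`, `select_attn_path`, `kv_tile_bytes`, etc.) are hypothetical and do not correspond to the actual ggml-webgpu code.

```cpp
// Hypothetical sketch -- names are illustrative, not the real ggml-webgpu API.
#include <cstdint>
#include <optional>

struct webgpu_device_caps {
    bool     has_subgroups;         // device reports subgroup support
    uint32_t max_workgroup_storage; // bytes of workgroup (shared) memory
};

enum class attn_path { flash };

// Rough size of one kv tile for a given head dimension and tile length.
static uint32_t kv_tile_bytes(uint32_t head_dim, uint32_t tile_len) {
    return head_dim * tile_len * sizeof(float) * 2; // K tile + V tile
}

// Pick an attention path: FlashAttention only when the device has subgroup
// support AND the kv tile fits in workgroup memory; otherwise return no path
// (std::nullopt), so the caller falls back to the generic implementation.
static std::optional<attn_path> select_attn_path(const webgpu_device_caps & caps,
                                                 uint32_t head_dim,
                                                 uint32_t tile_len) {
    if (!caps.has_subgroups) {
        return std::nullopt; // FlashAttention support check fails
    }
    if (kv_tile_bytes(head_dim, tile_len) > caps.max_workgroup_storage) {
        return std::nullopt; // kv_tile does not fit: path set to none
    }
    return attn_path::flash;
}
```

The key design point reflected in both fixes is that the backend should degrade gracefully: an unsupported device capability or an oversized tile results in no FlashAttention path being chosen rather than an invalid dispatch.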