b9144
📦 llama-cpp
🐛 1 fix · 🔧 1 symbol
Summary
This release restricts the ggml-webgpu subgroup-matrix path to cases where the head dimensions are divisible by sg_mat_k / sg_mat_n, and provides updated binaries for macOS, Linux, Android, Windows, and openEuler.
🐛 Bug Fixes
- ggml-webgpu: Only use subgroup-matrix path when head dimensions are divisible by sg_mat_k / sg_mat_n.
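
The divisibility condition in the fix above can be illustrated with a minimal sketch. This is not the ggml-webgpu source: the function name `can_use_subgroup_matrix`, its parameters, and the pairing of the query/key head dimension with `sg_mat_k` and the value head dimension with `sg_mat_n` are assumptions made for illustration only.

```cpp
#include <cassert>

// Hypothetical helper (not part of ggml): decides whether a
// subgroup-matrix kernel path is eligible for the given head dimensions.
// Assumption: head_dim_qk must tile evenly by sg_mat_k and head_dim_v
// must tile evenly by sg_mat_n; the real backend may pair them differently.
bool can_use_subgroup_matrix(int head_dim_qk, int head_dim_v,
                             int sg_mat_k, int sg_mat_n) {
    // Reject invalid tile sizes so the modulo below is well defined.
    if (sg_mat_k <= 0 || sg_mat_n <= 0) {
        return false;
    }
    // Both head dimensions must be exact multiples of the tile sizes;
    // otherwise the caller should fall back to a non-subgroup-matrix path.
    return (head_dim_qk % sg_mat_k == 0) && (head_dim_v % sg_mat_n == 0);
}

// Usage example with illustrative 16x16 tile sizes.
void example() {
    assert(can_use_subgroup_matrix(128, 128, 16, 16));  // 128 = 8 * 16
    assert(!can_use_subgroup_matrix(40, 40, 16, 16));   // 40 % 16 == 8
}
```

Under these assumptions, a head dimension such as 40 with a tile size of 16 would take the fallback path rather than the subgroup-matrix path.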