b9144

📅 May 14, 2026 · 📦 llama-cpp

🐛 1 fix · 🔧 1 symbol

Summary

This release fixes path selection in ggml-webgpu: the subgroup-matrix path is now used only when the head dimensions are divisible by sg_mat_k / sg_mat_n. It also provides updated binaries for macOS, Linux, Android, Windows, and openEuler.

🐛 Bug Fixes

ggml-webgpu: Only use subgroup-matrix path when head dimensions are divisible by sg_mat_k / sg_mat_n.
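
The divisibility condition described above can be sketched as a small guard. This is a minimal illustration, not the actual ggml-webgpu code: the function name `can_use_subgroup_matrix_path`, the parameters `head_dim_qk` and `head_dim_v`, and the pairing of each head dimension with `sg_mat_k` or `sg_mat_n` are all assumptions made for the example. Only the names `sg_mat_k` and `sg_mat_n` come from the release note.

```cpp
#include <cstdint>

// Hypothetical helper (not part of ggml-webgpu): decides whether a
// subgroup-matrix kernel could be used for the given head dimensions.
//
// Assumptions for illustration only:
//   - head_dim_qk is checked against sg_mat_k
//   - head_dim_v  is checked against sg_mat_n
// The real backend may pair the dimensions differently.
bool can_use_subgroup_matrix_path(uint32_t head_dim_qk,
                                  uint32_t head_dim_v,
                                  uint32_t sg_mat_k,
                                  uint32_t sg_mat_n) {
    // A zero tile size means no usable subgroup-matrix configuration.
    if (sg_mat_k == 0 || sg_mat_n == 0) {
        return false;
    }
    // Tiles must cover each head dimension exactly; a remainder would
    // leave a partial tile that the fast path cannot handle.
    return head_dim_qk % sg_mat_k == 0 && head_dim_v % sg_mat_n == 0;
}
```

Under these assumptions, a head dimension of 128 with 16-wide tiles would qualify, while a head dimension of 72 would not and would presumably be routed to another path.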

Affected Symbols

ggml-webgpu