
b9144

📦 llama-cpp
🐛 1 fix · 🔧 1 symbol

Summary

This release fixes path selection in ggml-webgpu: the subgroup-matrix path is now used only when the head dimensions are divisible by sg_mat_k / sg_mat_n. It also provides updated binaries for macOS, Linux, Android, Windows, and openEuler.

🐛 Bug Fixes

  • ggml-webgpu: Only use subgroup-matrix path when head dimensions are divisible by sg_mat_k / sg_mat_n.
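The divisibility condition in the fix above can be sketched as a small guard. This is an illustrative sketch only, not the actual ggml-webgpu code: the struct, the function name, and the parameters `head_dim_qk` and `head_dim_v` are hypothetical. The pairing of the first head dimension with `sg_mat_k` and the second with `sg_mat_n` is also an assumption, since the release note does not say which dimension is checked against which tile size.

```cpp
#include <cstdint>

// Hypothetical container for the subgroup-matrix tile sizes named in the
// release note (sg_mat_k, sg_mat_n). Not the real ggml-webgpu type.
struct SubgroupMatrixTile {
    uint32_t sg_mat_k;  // tile size along the reduction (K) dimension
    uint32_t sg_mat_n;  // tile size along the output (N) dimension
};

// Returns true only when both head dimensions are exact multiples of the
// corresponding tile size. ASSUMPTION: head_dim_qk is checked against
// sg_mat_k and head_dim_v against sg_mat_n; the source does not confirm
// this mapping.
bool can_use_subgroup_matrix_path(uint32_t head_dim_qk,
                                  uint32_t head_dim_v,
                                  const SubgroupMatrixTile & tile) {
    // A zero tile size means the feature is unavailable; this also
    // avoids a modulo-by-zero below.
    if (tile.sg_mat_k == 0 || tile.sg_mat_n == 0) {
        return false;
    }
    return head_dim_qk % tile.sg_mat_k == 0 &&
           head_dim_v  % tile.sg_mat_n == 0;
}
```

When the guard returns false, a caller would presumably select a path that does not use subgroup matrices, so that partial tiles are never read or written.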

Affected Symbols