b8853

📅 Apr 20, 2026 📦 llama-cpp

🐛 2 fixes 🔧 2 symbols

Summary

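This release fixes a critical assertion failure in the SYCL reorder MMVQ dispatchers that was triggered when a model's vocabulary size is not a multiple of 16; the assert is replaced with padding logic. It also makes the subgroup size dependency explicit in the MMVQ launch helpers by using WARP_SIZE instead of a hardcoded 16.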

🐛 Bug Fixes

Fixed an assertion failure in the SYCL reorder MMVQ dispatchers (Q4_0, Q8_0, Q4_K, Q6_K) that occurred when a model's vocabulary size was not divisible by 16. The assert has been replaced with padding logic.
Replaced the hardcoded subgroup size of 16 with WARP_SIZE in the four reorder_mul_mat_vec launch helpers for SYCL kernels (Q4_0, Q8_0, Q4_K, Q6_K) to make the dependency on subgroup size explicit.
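
The two fixes above can be illustrated with a minimal sketch of the general padding technique. This is not the code from the release: `kWarpSize`, `pad_to_multiple`, and `count_active_rows` are hypothetical names introduced here, and the value 16 is assumed from the previously hardcoded subgroup size. The idea is to round the row count up to a multiple of the subgroup size for the launch, then have each work-item skip rows that fall in the padding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the backend's subgroup-size constant (assumed 16).
constexpr int64_t kWarpSize = 16;

// Round n up to the next multiple of `multiple` (requires multiple > 0).
constexpr int64_t pad_to_multiple(int64_t n, int64_t multiple) {
    return ((n + multiple - 1) / multiple) * multiple;
}

// Simulate a padded launch on the host: iterate over the padded row range
// and count the work-items that map to a real output row.
int64_t count_active_rows(int64_t nrows) {
    const int64_t padded = pad_to_multiple(nrows, kWarpSize);
    int64_t active = 0;
    for (int64_t row = 0; row < padded; ++row) {
        if (row >= nrows) {
            continue; // padding item: there is no output row to write
        }
        ++active;
    }
    return active;
}
```

With a vocabulary size of 32001, for example, the padded launch covers 32016 rows, and the bounds check keeps the last 15 work-items from touching memory. Naming the constant rather than writing 16 keeps the rounding and the launch configuration tied to the same value.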

Affected Symbols

reorder mul_mat_vec dispatchers (Q4_0, Q8_0, Q4_K, Q6_K)
reorder_mul_mat_vec launch helpers