
b8853

📦 llama-cpp
🐛 2 fixes · 🔧 2 symbols

Summary

This release fixes a critical assertion failure in the SYCL reorder MMVQ dispatchers, which was triggered by models whose vocabulary size is not divisible by 16. The assert is replaced with padding logic. The release also makes the subgroup size used in MMVQ launches explicit.

🐛 Bug Fixes

  • Fixed an assertion failure in SYCL reorder MMVQ dispatchers (Q4_0, Q8_0, Q4_K, Q6_K) that occurred when models had a vocabulary size not divisible by 16, by replacing the assert with padding logic.
  • Replaced the hardcoded subgroup size of 16 with WARP_SIZE in the four reorder_mul_mat_vec launch helpers for SYCL kernels (Q4_0, Q8_0, Q4_K, Q6_K) to make the dependency on subgroup size explicit.
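The padding approach described in the fixes above can be illustrated with a minimal sketch. This is not the code from the release: `pad_to_multiple` and `count_processed_rows` are hypothetical helpers, `WARP_SIZE` is defined locally as 16 as a stand-in for the backend's own constant, and the kernel is simulated on the CPU. The sketch assumes that padded work items beyond the real row count simply do nothing, which is one common way to make a rounded-up launch safe.

```cpp
#include <cassert>

// Stand-in for the subgroup width. The SYCL backend defines its own
// WARP_SIZE; the value 16 here is an assumption for illustration.
constexpr int WARP_SIZE = 16;

// Hypothetical helper: round n up to the next multiple of `multiple`.
constexpr int pad_to_multiple(int n, int multiple) {
    return ((n + multiple - 1) / multiple) * multiple;
}

// CPU simulation of a padded launch: iterate over the padded row range
// and skip indices past the real row count, as a guarded kernel would.
// Returns how many real rows were processed.
inline int count_processed_rows(int nrows) {
    const int padded_rows = pad_to_multiple(nrows, WARP_SIZE);
    int processed = 0;
    for (int row = 0; row < padded_rows; ++row) {
        if (row >= nrows) {
            continue; // padding work item: no output row to write
        }
        ++processed;
    }
    return processed;
}
```

With these definitions, a vocabulary of 32001 rows pads to 32016 (2001 subgroups of 16), and the guard ensures exactly 32001 rows are processed. Expressing the rounding in terms of `WARP_SIZE` rather than a literal 16 keeps the launch size and the subgroup width tied to a single constant.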

Affected Symbols