Changes

b8963

📦 llama-cpp
🐛 1 fix · 🔧 3 symbols

Summary

This release changes the Vulkan backend to coalesce Q4_K/Q5_K scale loads: instead of issuing conditional per-byte loads, it forces a single full 12-byte load and extracts the packed bits, resolving compilation and performance issues observed with certain SPIR-V compilers such as Mesa's.

🐛 Bug Fixes

  • Coalesced Q4_K/Q5_K scale loads in the Vulkan backend to improve performance and to fix issues with certain SPIR-V compilers (such as Mesa's) that do not optimize conditional loads correctly in matmul operations.

Affected Symbols