Changes

b8963

📦 llama-cpp
🐛 1 fix · 🔧 3 symbols

Summary

This release changes the Vulkan backend to coalesce Q4_K/Q5_K scale loads: instead of issuing conditional per-byte loads, it forces a single full 12-byte load and extracts the packed bits, resolving compilation and performance issues observed with certain SPIR-V compilers such as Mesa's.

🐛 Bug Fixes

  • Coalesced Q4_K/Q5_K scale loads in the Vulkan backend to improve performance and to fix issues with certain SPIR-V compilers (such as Mesa's) that do not optimize conditional loads correctly in matmul operations.

Affected Symbols