b8347
📦 llama-cpp
🐛 3 fixes · 🔧 2 symbols
Summary
This release fixes Q4_0 and MXFP4 repacking in the Hexagon backend: it resolves tail corruption for rows whose size is not a multiple of 256 and reworks the matrix multiplication kernels so that full 256-element blocks need no shuffles.
🐛 Bug Fixes
- Fixed tail corruption in the Hexagon backend when the row size is not a multiple of 256.
- Updated the hex-mm kernels to use even:odd packing only for the last block, so full 256-element blocks no longer need shuffles.
- Tightened supported MUL_MAT checks in hex-mm to prevent spurious failures.
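
The block-plus-tail idea behind the first two fixes can be illustrated with a minimal sketch. This is not the backend's code: `row_split`, `split_row`, and `pack_even_odd` are hypothetical names, and the nibble order (even element in the low nibble, odd element in the high nibble) is an assumption about what "even:odd packing" means for 4-bit values. The layout used for full blocks is not described in these notes, so it is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_ELEMS 256 /* block width named in the release notes */

/* Hypothetical helper type: full blocks and leftover elements in a row. */
typedef struct {
    size_t full_blocks;
    size_t tail_elems;
} row_split;

/* Split a row of n_elems into full 256-element blocks plus a tail. */
static row_split split_row(size_t n_elems) {
    row_split s;
    s.full_blocks = n_elems / BLOCK_ELEMS;
    s.tail_elems  = n_elems % BLOCK_ELEMS;
    return s;
}

/*
 * Assumed even:odd packing of 4-bit values: element 2*i goes to the low
 * nibble of byte i, element 2*i+1 to the high nibble. A missing odd
 * element is padded with zero. Writes exactly (n + 1) / 2 bytes, so a
 * short tail never touches memory past its own packed size.
 */
static size_t pack_even_odd(const uint8_t *src, size_t n, uint8_t *dst) {
    assert(src != NULL && dst != NULL);
    size_t n_bytes = (n + 1) / 2;
    for (size_t i = 0; i < n_bytes; i++) {
        uint8_t lo = src[2 * i] & 0x0F;
        uint8_t hi = (2 * i + 1 < n) ? (uint8_t)(src[2 * i + 1] & 0x0F) : 0;
        dst[i] = (uint8_t)(lo | (hi << 4));
    }
    return n_bytes;
}
```

Under these assumptions, a caller would handle the `full_blocks` blocks on a fast path and call `pack_even_odd` only on the final `tail_elems` elements, bounding every write by the tail's packed size rather than by a full block.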