b8493
📦 llama-cpp
✨ 4 features · 🐛 8 fixes · 🔧 1 symbol
Summary
This release extends the OpenCL backend with q6_K gemm and gemv kernels for Adreno devices, and includes a number of bug fixes and refactors within that backend.
✨ New Features
- Added q6_K gemm and gemv kernels for OpenCL Adreno support.
- Added q6_K noshuffle kernels and an initial q6_K gemv implementation for OpenCL.
- Added q6_K transpose support for OpenCL.
- Added gemm_noshuffle_q6_k_f32 kernel.
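For readers unfamiliar with the q6_K format these kernels operate on, the following is a minimal CPU-side sketch of dequantizing one q6_K super-block. It assumes the layout used by ggml's reference implementation (256 weights per block, 4 low bits in `ql`, 2 high bits in `qh`, one 8-bit scale per 16 weights). The names `block_q6_K_sketch` and `dequantize_q6_K_sketch` are hypothetical, and the super-block scale `d` is stored as a `float` here for portability, whereas ggml stores it as a 16-bit half. It is not the OpenCL kernel added in this release.

```c
#include <assert.h>
#include <stdint.h>

#define QK_K 256  /* weights per super-block (assumed, as in ggml) */

/* Hypothetical stand-in for ggml's block_q6_K; `d` is a float here,
 * while ggml stores it as a 16-bit half. */
typedef struct {
    uint8_t ql[QK_K / 2];      /* low 4 bits of each 6-bit quant */
    uint8_t qh[QK_K / 4];      /* high 2 bits, four quants per byte */
    int8_t  scales[QK_K / 16]; /* one 8-bit scale per 16 weights */
    float   d;                 /* super-block scale */
} block_q6_K_sketch;

/* Dequantize one super-block into QK_K floats.
 * Each weight is d * scale * (q - 32), where q is a 6-bit value. */
static void dequantize_q6_K_sketch(const block_q6_K_sketch *b, float *y) {
    assert(b && y);
    const uint8_t *ql = b->ql;
    const uint8_t *qh = b->qh;
    const int8_t  *sc = b->scales;

    /* The block is processed in two halves of 128 weights. */
    for (int n = 0; n < QK_K; n += 128) {
        for (int l = 0; l < 32; ++l) {
            const int is = l / 16; /* selects the scale within each group of 32 */
            /* Combine the 4 low bits from ql with 2 high bits from qh. */
            const int q1 = ((ql[l]      & 0xF) | (((qh[l] >> 0) & 3) << 4)) - 32;
            const int q2 = ((ql[l + 32] & 0xF) | (((qh[l] >> 2) & 3) << 4)) - 32;
            const int q3 = ((ql[l]      >> 4)  | (((qh[l] >> 4) & 3) << 4)) - 32;
            const int q4 = ((ql[l + 32] >> 4)  | (((qh[l] >> 6) & 3) << 4)) - 32;
            y[l]      = b->d * (float)sc[is + 0] * (float)q1;
            y[l + 32] = b->d * (float)sc[is + 2] * (float)q2;
            y[l + 64] = b->d * (float)sc[is + 4] * (float)q3;
            y[l + 96] = b->d * (float)sc[is + 6] * (float)q4;
        }
        y  += 128;
        ql += 64;
        qh += 32;
        sc += 8;
    }
}
```

A gemv or gemm kernel conceptually performs this unpacking on the fly while accumulating dot products. Transposed or "noshuffle" variants typically rearrange how `ql`, `qh`, and `scales` are laid out in memory, but the exact layout used by the Adreno kernels is not described in these notes.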
🐛 Bug Fixes
- Fixed cvt kernel name in OpenCL.
- Fixed q6_K scale transpose issue in OpenCL.
- Fixed loading for gemv q6_K and refactored related code.
- Fixed transpose_8_buf kernel assignment and refactored code.
- Fixed qh loading.
- Fixed q6_K dequant and scale selection.
- Worked around a compiler bug and fixed dump_tensor.
- Fixed OpenCL handling for non-uniform workgroups.
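As background for the non-uniform workgroup fix: when a device does not support non-uniform work-groups, `clEnqueueNDRangeKernel` requires the global work size to be a multiple of the local work size. A common approach, sketched below, is to pad the global size and have the kernel skip out-of-range work-items. The helper names and the `has_non_uniform_wg` flag are hypothetical, and this is not necessarily how the fix in this release is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of local (local must be > 0). */
static size_t round_up_global_size(size_t n, size_t local) {
    assert(local > 0);
    return (n + local - 1) / local * local;
}

/* Choose the global work size for a 1D launch over n work-items.
 * has_non_uniform_wg is a hypothetical flag the caller would derive
 * from a device capability query. */
static size_t pick_global_size(size_t n, size_t local, int has_non_uniform_wg) {
    /* With non-uniform work-group support the exact size can be used;
     * otherwise pad and rely on a bounds check inside the kernel. */
    return has_non_uniform_wg ? n : round_up_global_size(n, local);
}
```

For example, 100 work-items with a local size of 64 would launch 128 work-items on a device without non-uniform work-group support, so the kernel needs a guard such as `if (get_global_id(0) >= n) return;` to ignore the 28 padded items.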