Change8

b8493

📦 llama-cpp
4 features · 🐛 8 fixes · 🔧 1 symbol

Summary

This release extends the OpenCL backend with q6_K gemm and gemv kernels for Adreno devices. It also includes eight bug fixes and related refactoring in the same backend.

✨ New Features

  • Added q6_K gemm and gemv kernels for OpenCL Adreno support.
  • Added q6_K noshuffle kernels and initial q6_K gemv implementation for OpenCL.
  • Added q6_K transpose support for OpenCL.
  • Added gemm_noshuffle_q6_k_f32 kernel.
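
For readers unfamiliar with what a q6_K gemv computes, the following is a minimal CPU-side sketch in Python. It is not the OpenCL kernel added in this release. It assumes a block layout modeled on ggml's reference Q6_K format (256 weights per block, 128 bytes of low nibbles, 64 bytes of packed 2-bit high parts, 16 signed scales, and one float scale `d`); the exact bit order shown here should be treated as an assumption, and the function names `dequantize_q6_k_block` and `gemv_q6_k` are hypothetical.

```python
QK_K = 256  # weights per block (assumed, following ggml's k-quant convention)


def dequantize_q6_k_block(ql, qh, scales, d):
    """Expand one 6-bit block into 256 floats.

    ql:     128 bytes, low 4 bits of each weight (two weights per byte)
    qh:     64 bytes, high 2 bits of each weight (four weights per byte)
    scales: 16 signed integers, one per group of 16 weights
    d:      float scale shared by the whole block
    """
    if len(ql) != QK_K // 2 or len(qh) != QK_K // 4 or len(scales) != QK_K // 16:
        raise ValueError("unexpected block field sizes")
    y = [0.0] * QK_K
    for half in range(2):  # each half covers 128 weights
        ql_off, qh_off, sc_off, y_off = 64 * half, 32 * half, 8 * half, 128 * half
        for l in range(32):
            lo_a = ql[ql_off + l]
            lo_b = ql[ql_off + l + 32]
            hi = qh[qh_off + l]
            s = sc_off + l // 16
            # Rebuild four 6-bit values: 4 low bits from ql, 2 high bits from qh.
            q = (
                (lo_a & 0x0F) | (((hi >> 0) & 3) << 4),
                (lo_b & 0x0F) | (((hi >> 2) & 3) << 4),
                (lo_a >> 4) | (((hi >> 4) & 3) << 4),
                (lo_b >> 4) | (((hi >> 6) & 3) << 4),
            )
            for j in range(4):
                # Stored values are unsigned 0..63; subtracting 32 recenters them.
                y[y_off + l + 32 * j] = d * scales[s + 2 * j] * (q[j] - 32)
    return y


def gemv_q6_k(rows, x):
    """Multiply a quantized matrix by a float vector.

    rows: list of matrix rows, each a list of (ql, qh, scales, d) blocks
    x:    list of floats with one entry per matrix column
    """
    out = []
    for blocks in rows:
        acc = 0.0
        for b, block in enumerate(blocks):
            w = dequantize_q6_k_block(*block)
            base = b * QK_K
            acc += sum(wi * x[base + i] for i, wi in enumerate(w))
        out.append(acc)
    return out
```

A GPU kernel would typically fuse the dequantization into the dot product rather than materializing `w`. The sketch keeps the two steps separate so the roles of `ql`, `qh`, and the scales stay visible.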

🐛 Bug Fixes

  • Fixed cvt kernel name in OpenCL.
  • Fixed q6_K scale transpose issue in OpenCL.
  • Fixed loading for gemv q6_K and refactored related code.
  • Fixed transpose_8_buf kernel assignment and refactored code.
  • Fixed qh loading.
  • Fixed q6_K dequant and scale selection.
  • Worked around a compiler bug and fixed dump_tensor.
  • Fixed OpenCL handling for non-uniform workgroups.
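
The release notes do not say how the non-uniform workgroup fix was implemented. As general background, when an OpenCL device or program cannot use non-uniform work-groups, the global work size passed to `clEnqueueNDRangeKernel` must be a multiple of the local work size. A common approach is to round the global size up and have the kernel skip the padded work-items. The sketch below simulates that pattern in Python; `padded_global_size` and `simulate_launch` are hypothetical names, not functions from llama.cpp.

```python
def padded_global_size(n, local_size):
    """Round n up to the next multiple of local_size."""
    if n < 0 or local_size <= 0:
        raise ValueError("n must be >= 0 and local_size must be > 0")
    return ((n + local_size - 1) // local_size) * local_size


def simulate_launch(n, local_size):
    """Simulate a 1-D launch over n elements with uniform work-groups.

    Returns the padded global size and the element indices that were
    actually processed.
    """
    global_size = padded_global_size(n, local_size)
    processed = []
    for gid in range(global_size):
        if gid >= n:
            # Bounds guard: padded work-items must not touch memory.
            continue
        processed.append(gid)
    return global_size, processed
```

For example, `padded_global_size(1000, 64)` gives 1024, so 24 work-items in the last group fall through the guard and do nothing.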

Affected Symbols