b8935
📦 llama-cpp
✨ 3 features · 🔧 1 symbol
Summary
This release introduces support for the iq4_nl quantization format on the OpenCL backend, including specific optimizations for Adreno GPUs. It also provides a comprehensive set of pre-built binaries across multiple operating systems and hardware configurations.
✨ New Features
- Added general support for iq4_nl quantization format on OpenCL.
- Added iq4_nl GEMM/GEMV kernels for Adreno GPUs via OpenCL.
- The OpenCL implementation now packs two lookup-table (LUT) entries into a single uint.
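
To make the LUT packing concrete, here is a minimal Python sketch of the idea, not the actual OpenCL kernel code. It assumes the 16-entry iq4_nl codebook and the 32-weight block layout (one scale plus 16 bytes of 4-bit indices) as they appear in ggml's CPU reference; verify both against the source. The 16-bit-per-entry layout inside each 32-bit word, and the helper names `pack_lut`, `lookup`, and `dequantize_block`, are hypothetical and chosen only for illustration.

```python
# 16 non-linear levels of the iq4_nl codebook (assumed to match ggml's table).
KVALUES_IQ4NL = [-127, -104, -83, -65, -49, -35, -22, -10,
                 1, 13, 25, 38, 53, 69, 89, 113]


def pack_lut(values):
    """Pack consecutive pairs of signed entries into 32-bit words.

    Hypothetical layout: even entry in the low 16 bits, odd entry in the
    high 16 bits, both stored as two's-complement 16-bit values.
    """
    packed = []
    for i in range(0, len(values), 2):
        lo = values[i] & 0xFFFF
        hi = values[i + 1] & 0xFFFF
        packed.append((hi << 16) | lo)
    return packed


PACKED = pack_lut(KVALUES_IQ4NL)  # 16 entries become 8 words


def lookup(packed, index):
    """Fetch one entry from the packed table and sign-extend it."""
    word = packed[index >> 1]                    # which 32-bit word
    half = (word >> (16 * (index & 1))) & 0xFFFF  # which half of the word
    return half - 0x10000 if half & 0x8000 else half


def dequantize_block(d, qs):
    """Dequantize one 32-weight block.

    d  : per-block scale (float)
    qs : 16 bytes; low nibbles index weights 0..15, high nibbles 16..31
         (assumed layout, mirroring the ggml CPU reference).
    """
    assert len(qs) == 16
    out = [0.0] * 32
    for j, byte in enumerate(qs):
        out[j] = d * lookup(PACKED, byte & 0x0F)
        out[j + 16] = d * lookup(PACKED, byte >> 4)
    return out
```

Packing two entries per word halves the number of table words a kernel has to hold, at the cost of a shift and mask per lookup. Whether the real kernels use 16-bit halves or a different split is not stated in these notes.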