b9244
📦 llama-cpp
✨ 1 feature · 🔧 1 symbol
Summary
This release introduces OpenCL support for MoE models using q4_k, q5_k, and q6_k quantization on Adreno GPUs and provides updated binaries across multiple operating systems and architectures.
✨ New Features
- Added OpenCL support on Adreno GPUs for Mixture of Experts (MoE) models quantized with q4_k, q5_k, or q6_k.