b9244

📦 llama-cpp
1 feature 🔧 1 symbol

Summary

This release adds OpenCL support on Adreno GPUs for Mixture of Experts (MoE) models quantized with q4_k, q5_k, or q6_k, and provides updated binaries across multiple operating systems and architectures.

✨ New Features

  • Added OpenCL support on Adreno GPUs for Mixture of Experts (MoE) models quantized with q4_k, q5_k, and q6_k.

Affected Symbols