b9113

📅 May 11, 2026 📦 llama-cpp

✨ 1 feature 🐛 2 fixes 🔧 1 symbol

Summary

This release adds OpenCL support for Q4_1 MoE on Adreno GPUs and cleans up the OpenCL backend by removing unnecessary asserts and other unneeded code.

✨ New Features

Added OpenCL support for Q4_1-quantized MoE (Mixture of Experts) weights, specifically on Adreno GPUs.
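For readers unfamiliar with the format, here is a minimal sketch of how a single Q4_1 block can be packed and unpacked. It assumes the commonly documented ggml layout (32 weights per block, an fp16 scale `d`, an fp16 minimum `m`, and 16 bytes holding two 4-bit values each). The function names `quantize_q4_1` and `dequantize_q4_1` are hypothetical and chosen for illustration. This is plain Python, not the OpenCL kernel added in this release.

```python
import struct

QK4_1 = 32  # weights per Q4_1 block (assumed to match ggml's block size)


def quantize_q4_1(values):
    """Pack 32 floats into one 20-byte block: fp16 d, fp16 m, 16 nibble bytes."""
    assert len(values) == QK4_1
    vmin, vmax = min(values), max(values)
    d = (vmax - vmin) / 15.0          # 4 bits give 16 levels: 0..15
    inv_d = 1.0 / d if d else 0.0     # avoid dividing by zero for constant blocks
    half = QK4_1 // 2
    qs = bytearray(half)
    for j in range(half):
        # Element j goes in the low nibble, element j + 16 in the high nibble.
        q_lo = min(15, int((values[j] - vmin) * inv_d + 0.5))
        q_hi = min(15, int((values[j + half] - vmin) * inv_d + 0.5))
        qs[j] = q_lo | (q_hi << 4)
    return struct.pack("<ee16s", d, vmin, bytes(qs))


def dequantize_q4_1(block):
    """Recover 32 floats from a block using x = q * d + m."""
    d, m, qs = struct.unpack("<ee16s", block)
    half = QK4_1 // 2
    out = [0.0] * QK4_1
    for j, byte in enumerate(qs):
        out[j] = (byte & 0x0F) * d + m
        out[j + half] = (byte >> 4) * d + m
    return out
```

The stored minimum `m` is what distinguishes Q4_1 from the symmetric Q4_0 format: each weight is reconstructed as `q * d + m`, so a block whose values are all shifted away from zero still uses the full 4-bit range.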

🐛 Bug Fixes

Fixed the OpenCL supports_op check for Q4_1 MoE to correctly identify supported shapes on Adreno.
Removed unnecessary asserts and code within the OpenCL implementation.

Affected Symbols