b9006

📅 May 2, 2026 · 📦 llama-cpp

✨ 1 feature · 🐛 4 fixes · 🔧 2 symbols

Summary

This release introduces OpenCL optimizations for MoE Mxfp4 on Adreno GPUs, including a new CLC kernel and router reordering, alongside various cleanup and bug fixes.

Migration Steps

Removed 'putenv' call in llama-model.cpp.

✨ New Features

Added MoE Mxfp4 CLC kernel and router reorder on GPU for Adreno optimization in OpenCL.

🐛 Bug Fixes

Fixed precision issue in OpenCL path.
Removed unnecessary headers in OpenCL.
Removed unnecessary assert in OpenCL.
Stopped saving cl_program objects in OpenCL.

Affected Symbols

llama-model.cpp
opencl