b9006
📦 llama-cpp
✨ 1 feature · 🐛 4 fixes · 🔧 2 symbols
Summary
This release introduces OpenCL optimizations for MXFP4 MoE models on Adreno GPUs, including a new CLC kernel and router reordering on the GPU, alongside several cleanups and bug fixes in the OpenCL backend.
Migration Steps
- Removed 'putenv' call in llama-model.cpp.
✨ New Features
- Added an MXFP4 MoE CLC kernel and GPU-side router reordering to the OpenCL backend, optimized for Adreno.
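
The note above does not describe how the router reorder is implemented. As a rough illustration of the general idea only, the CPU-side sketch below groups token indices by the expert the router picked, so that each expert's tokens end up contiguous and can be processed as one batch. It assumes top-1 routing for simplicity; `ExpertGrouping` and `group_tokens_by_expert` are hypothetical names that do not come from llama.cpp, and the actual release performs this step on the GPU in OpenCL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper types, not part of llama.cpp.
struct ExpertGrouping {
    std::vector<int> order;    // token indices, grouped by expert
    std::vector<int> offsets;  // order[offsets[e] .. offsets[e + 1]) belongs to expert e
};

// Groups tokens by their selected expert with a stable counting sort.
// Assumption: top-1 routing, i.e. expert_of_token[t] is the single expert for token t.
ExpertGrouping group_tokens_by_expert(const std::vector<int>& expert_of_token, int n_experts) {
    ExpertGrouping g;
    g.offsets.assign(static_cast<std::size_t>(n_experts) + 1, 0);

    // Count how many tokens each expert receives.
    for (int e : expert_of_token) {
        assert(e >= 0 && e < n_experts);
        g.offsets[e + 1]++;
    }

    // Running sum turns the per-expert counts into start offsets.
    for (int e = 0; e < n_experts; ++e) {
        g.offsets[e + 1] += g.offsets[e];
    }

    // Scatter each token index into its expert's slice, preserving token order.
    std::vector<int> cursor(g.offsets.begin(), g.offsets.end() - 1);
    g.order.resize(expert_of_token.size());
    for (std::size_t t = 0; t < expert_of_token.size(); ++t) {
        g.order[cursor[expert_of_token[t]]++] = static_cast<int>(t);
    }
    return g;
}
```

With this layout, a per-expert kernel could read one contiguous slice of `order` instead of scanning every token, which is the usual motivation for reordering by router output.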
🐛 Bug Fixes
- Fixed a precision issue in the OpenCL path.
- Removed unnecessary headers in OpenCL.
- Removed unnecessary assert in OpenCL.
- Stopped saving cl_program objects in OpenCL.