b9006

📦 llama-cpp
1 feature · 🐛 4 fixes · 🔧 2 symbols

Summary

This release adds OpenCL optimizations for MoE Mxfp4 on Adreno GPUs: a new CLC kernel and router reordering on the GPU. It also includes OpenCL cleanup and bug fixes, among them a precision fix.

Migration Steps

  1. Removed 'putenv' call in llama-model.cpp.

✨ New Features

  • Added an MoE Mxfp4 CLC kernel and on-GPU router reordering in OpenCL, optimized for Adreno.

🐛 Bug Fixes

  • Fixed a precision issue in the OpenCL path.
  • Removed unnecessary headers in OpenCL.
  • Removed an unnecessary assert in OpenCL.
  • Stopped saving cl_program objects in OpenCL.

Affected Symbols