b9521

📦 llama-cpp
1 feature · 🐛 1 fix · 🔧 1 symbol

Summary

This release enrolls the mul_mat_vec_q_moe operation into PDL, which improves MTP performance on BW, and fixes LC so that it overlaps with the kernels that follow.

✨ New Features

  • Enrolled mul_mat_vec_q_moe into PDL for improved MTP performance on BW.

🐛 Bug Fixes

  • LC is now set to overlap with the kernels that follow.

Affected Symbols