b9521
📦 llama-cpp
✨ 1 feature · 🐛 1 fix · 🔧 1 symbol
Summary
This release enrolls the mul_mat_vec_q_moe operation into PDL, which improves MTP performance on BW, and sets LC to overlap with the kernels that follow.
✨ New Features
- Enrolled mul_mat_vec_q_moe into PDL for improved MTP performance on BW.
🐛 Bug Fixes
- LC is now set to overlap with the kernels that follow.
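
For readers unfamiliar with the mechanism, the two entries above can be illustrated with a minimal sketch. It assumes that PDL refers to CUDA Programmatic Dependent Launch and that LC refers to the launch-completion trigger; the release notes do not spell out either abbreviation. The kernels `producer` and `consumer` are hypothetical stand-ins and are not the llama.cpp `mul_mat_vec_q_moe` implementation. The sketch assumes a GPU of compute capability 9.0 or newer and a build such as `nvcc -arch=sm_90`.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define CHECK(call)                                                      \
    do {                                                                 \
        cudaError_t err_ = (call);                                       \
        if (err_ != cudaSuccess) {                                       \
            fprintf(stderr, "%s failed: %s\n", #call,                    \
                    cudaGetErrorString(err_));                           \
            exit(1);                                                     \
        }                                                                \
    } while (0)

// Hypothetical upstream kernel. Triggering launch completion early lets the
// next kernel in the stream begin launching while this one is still running.
__global__ void producer(float *buf, int n) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 900
    cudaTriggerProgrammaticLaunchCompletion();
#endif
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = 2.0f * (float)i;
}

// Hypothetical dependent kernel. Work that does not read the producer's
// output can run before the synchronization point.
__global__ void consumer(const float *buf, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 900
    // Waits until the producer grid has completed and its writes are visible.
    cudaGridDependencySynchronize();
#endif
    if (i < n) out[i] = buf[i] + 1.0f;
}

int main() {
    const int n = 1 << 20;
    float *buf = nullptr, *out = nullptr;
    CHECK(cudaMalloc(&buf, n * sizeof(float)));
    CHECK(cudaMalloc(&out, n * sizeof(float)));

    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));

    dim3 block(256), grid((n + 255) / 256);
    producer<<<grid, block, 0, stream>>>(buf, n);
    CHECK(cudaGetLastError());

    // Opt the dependent launch into programmatic stream serialization.
    cudaLaunchAttribute attr[1];
    attr[0].id = cudaLaunchAttributeProgrammaticStreamSerialization;
    attr[0].val.programmaticStreamSerializationAllowed = 1;

    cudaLaunchConfig_t cfg = {};
    cfg.gridDim = grid;
    cfg.blockDim = block;
    cfg.dynamicSmemBytes = 0;
    cfg.stream = stream;
    cfg.attrs = attr;
    cfg.numAttrs = 1;
    CHECK(cudaLaunchKernelEx(&cfg, consumer, (const float *)buf, out, n));
    CHECK(cudaStreamSynchronize(stream));

    // Verify on the host: every element should equal 2*i + 1.
    std::vector<float> host(n);
    CHECK(cudaMemcpy(host.data(), out, n * sizeof(float),
                     cudaMemcpyDeviceToHost));
    int bad = 0;
    for (int i = 0; i < n; ++i) {
        if (host[i] != 2.0f * (float)i + 1.0f) ++bad;
    }
    printf("mismatches: %d\n", bad);

    CHECK(cudaFree(buf));
    CHECK(cudaFree(out));
    CHECK(cudaStreamDestroy(stream));
    return bad == 0 ? 0 : 1;
}
```

In this sketch the trigger sits at the top of `producer`, which gives the largest window for overlap. Correctness does not depend on that placement, because `cudaGridDependencySynchronize()` in the dependent kernel waits for the producer grid to finish before any of its output is read.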