b8931

📅 Apr 25, 2026 📦 llama-cpp

🐛 2 fixes

Summary

This release improves CUDA performance in two ways: it reduces MMQ stream-k overhead and updates the integers used internally for kbc calculations.