b8089
📦 llama-cpp
✨ 1 feature · 🔧 1 symbol
Summary
The Vulkan backend now splits matrix multiplication into multiple dispatches when the batch dimensions are too large for a single dispatch, that is, when they would exceed the device's maximum workgroup count. This release also ships pre-compiled binaries for a range of operating systems and hardware configurations.
✨ New Features
- Vulkan backend now splits matrix multiplication (mul_mat) into multiple dispatches when batch dimensions exceed the max workgroup count limit; each dispatch receives its base index through push constants.
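
The splitting idea can be illustrated with a minimal host-side sketch. This is not the actual llama.cpp code: `DispatchChunk` and `split_dispatch` are hypothetical names, and the sketch only covers a single batch dimension. It assumes each chunk would be issued as its own dispatch, with `base` handed to the shader (for example through a push constant) so the shader can add it to its workgroup ID and recover the global batch index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one dispatch along a single batch dimension.
struct DispatchChunk {
    uint32_t base;   // first batch index covered; would be passed to the shader
    uint32_t count;  // workgroup count for this dispatch, never above the limit
};

// Split `total` batch entries into chunks of at most `max_count` workgroups.
// `max_count` stands in for the device limit (maxComputeWorkGroupCount).
std::vector<DispatchChunk> split_dispatch(uint32_t total, uint32_t max_count) {
    assert(max_count > 0 && "device limit must be positive");
    std::vector<DispatchChunk> chunks;
    uint32_t base = 0;
    while (base < total) {
        // Remaining work, clamped to what one dispatch may cover.
        uint32_t count = std::min(max_count, total - base);
        chunks.push_back({base, count});
        base += count;  // cannot overflow: count <= total - base
    }
    return chunks;
}
```

For example, with a limit of 65535 (the minimum per-dimension value the Vulkan specification guarantees for `maxComputeWorkGroupCount`), a batch dimension of 150000 would be covered by three dispatches starting at base indices 0, 65535, and 131070.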