b9558
📦 llama-cpp
✨ 2 features · 🐛 1 fix · 🔧 1 symbol
Summary
This release improves Vulkan performance by using cm2 decode_vector for mul_mat_id B matrix loads, which allows vec4 loads of B elements, and by increasing BK to 64 when that optimization is enabled. It also ensures that the B matrix alignment and stride are multiples of 4 in ggml-vulkan.cpp.
Migration Steps
- If compiling with Vulkan enabled, ensure that the B matrix alignment and stride are multiples of 4 in ggml-vulkan.cpp.
✨ New Features
- Enabled Vulkan optimization using cm2 decode_vector for mul_mat_id B matrix loads, which allows vec4 loads of B elements.
- Increased BK to 64 when the Vulkan optimization using cm2 decode_vector is enabled.
🐛 Bug Fixes
- Ensured B matrix alignment and stride are multiples of 4 in ggml-vulkan.cpp when using the new Vulkan optimization.
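
The multiple-of-4 requirement can be illustrated with a small C++ sketch. It is not code from ggml-vulkan.cpp: the helper names are hypothetical, and it assumes that alignment and stride are counted in elements and that the requirement exists because a vec4 load reads four consecutive B elements at once.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration; not taken from ggml-vulkan.cpp.
constexpr bool is_multiple_of_4(uint32_t n) {
    return n % 4 == 0;
}

// True when both the element offset of the first B value and the row
// stride (assumed here to be counted in elements) are multiples of 4,
// so every row starts on a four-element boundary.
constexpr bool b_allows_vec4_loads(uint32_t offset_elems, uint32_t stride_elems) {
    return is_multiple_of_4(offset_elems) && is_multiple_of_4(stride_elems);
}

// Compile-time examples of layouts that pass and fail the check.
static_assert(b_allows_vec4_loads(0, 64), "aligned offset and stride pass");
static_assert(!b_allows_vec4_loads(2, 64), "offset of 2 is rejected");
static_assert(!b_allows_vec4_loads(0, 66), "stride of 66 is rejected");

// Runtime variant: trips an assertion (when NDEBUG is not defined) if the
// layout would not allow four-element loads.
inline void require_vec4_layout(uint32_t offset_elems, uint32_t stride_elems) {
    assert(b_allows_vec4_loads(offset_elems, stride_elems));
}
```

A caller could run `require_vec4_layout(offset, stride)` before selecting a vec4 path and fall back to scalar loads otherwise. Whether the real implementation checks, pads, or selects a different shader is not stated in these notes.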