b9558

📅 Jun 8, 2026 · 📦 llama-cpp

✨ 2 features · 🐛 1 fix · 🔧 1 symbol

Summary

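This release improves Vulkan performance by using cm2 decode_vector for mul_mat_id B matrix loads, which allows vec4 loads of B elements, and by increasing BK to 64 when that optimization is enabled. It also ensures that the B matrix alignment and stride are multiples of 4 in ggml-vulkan.cpp, which the new load path relies on.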

Migration Steps

If compiling with Vulkan enabled, ensure that the B matrix alignment and stride are multiples of 4 in ggml-vulkan.cpp.

✨ New Features

Enabled Vulkan optimization using cm2 decode_vector for mul_mat_id B matrix loads, which allows vec4 loads of B elements.
Increased BK to 64 when the Vulkan optimization using cm2 decode_vector is enabled.

🐛 Bug Fixes

Ensured B matrix alignment and stride are multiples of 4 in ggml-vulkan.cpp when using the new Vulkan optimization.
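
The multiple-of-4 requirement can be illustrated with a small host-side sketch. It is not code from ggml-vulkan.cpp: the helper names, the choice to express alignment and stride in elements, and the default_bk parameter are all assumptions made for illustration. The only facts taken from this release are the multiple-of-4 constraint and BK being 64 when the optimization is enabled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from ggml-vulkan.cpp): a vec4 load reads four
// B elements at once, so both the alignment and the row stride, expressed
// here in elements (an assumption), must be multiples of 4.
constexpr bool b_layout_ok_for_vec4(uint32_t b_align, uint32_t b_stride) {
    return (b_align % 4u) == 0u && (b_stride % 4u) == 0u;
}

// Hypothetical tile-depth selection: BK is 64 when the decode_vector
// optimization is enabled, otherwise whatever default the caller passes in.
inline uint32_t choose_bk(bool use_decode_vector,
                          uint32_t b_align,
                          uint32_t b_stride,
                          uint32_t default_bk) {
    if (!use_decode_vector) {
        return default_bk;
    }
    // Precondition of the vec4 path in this sketch.
    assert(b_layout_ok_for_vec4(b_align, b_stride));
    return 64u;
}

// Compile-time checks of the layout predicate.
static_assert(b_layout_ok_for_vec4(4, 64), "stride 64 is a multiple of 4");
static_assert(!b_layout_ok_for_vec4(4, 66), "stride 66 is not a multiple of 4");
```

In this sketch a stride of 66 elements fails the layout check, because the fourth element of some vec4 groups would straddle two rows, while a stride of 64 passes.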

Affected Symbols

ggml-vulkan.cpp