b9279

📅 May 22, 2026 📦 llama-cpp

✨ 1 feature 🐛 5 fixes 🔧 4 symbols

Summary

This release adds a Vulkan backend optimization that fuses the snake activation sequence into a single kernel. The fusion logic was also refined with stricter type, contiguity, and dimension checks.

Migration Steps

If relying on Vulkan snake fusion, ensure the broadcast operands (a and inv_b) are GGML_TYPE_F32. The type check is now strict, so other types are no longer accepted for fusion.
If using snake activation patterns, be aware that fusion will now be rejected if dimensions ne[2] or ne[3] are greater than 1.

✨ New Features

Vulkan backend now fuses the 5-operation snake activation sequence (mul, sin, sqr, mul, add) into a single elementwise kernel for improved performance. The pattern appears in audio decoders such as BigVGAN and Vocos.
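
To illustrate what the fusion saves, here is a minimal CPU-side sketch in C++, not the actual Vulkan shader. It assumes the sequence computes dst = x + inv_b * sin(a * x)^2, that x is stored as ne1 rows of ne0 contiguous floats, and that a and inv_b hold one value per row. The function names `snake_unfused` and `snake_fused` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Unfused reference: five separate elementwise passes (mul, sin, sqr, mul, add).
// Assumed layout: x has ne1 rows of ne0 floats; a and inv_b have one value per row.
std::vector<float> snake_unfused(const std::vector<float>& x,
                                 const std::vector<float>& a,
                                 const std::vector<float>& inv_b,
                                 std::size_t ne0, std::size_t ne1) {
    assert(x.size() == ne0 * ne1 && a.size() == ne1 && inv_b.size() == ne1);
    std::vector<float> t(x.size());
    for (std::size_t i = 0; i < t.size(); ++i) t[i] = x[i] * a[i / ne0];      // mul
    for (float& v : t) v = std::sin(v);                                        // sin
    for (float& v : t) v = v * v;                                              // sqr
    for (std::size_t i = 0; i < t.size(); ++i) t[i] = t[i] * inv_b[i / ne0];  // mul
    for (std::size_t i = 0; i < t.size(); ++i) t[i] = t[i] + x[i];            // add
    return t;
}

// Fused version: one pass, no intermediate buffers between the five steps.
std::vector<float> snake_fused(const std::vector<float>& x,
                               const std::vector<float>& a,
                               const std::vector<float>& inv_b,
                               std::size_t ne0, std::size_t ne1) {
    assert(x.size() == ne0 * ne1 && a.size() == ne1 && inv_b.size() == ne1);
    std::vector<float> dst(x.size());
    for (std::size_t row = 0; row < ne1; ++row) {
        for (std::size_t col = 0; col < ne0; ++col) {
            const std::size_t i = row * ne0 + col;
            const float s = std::sin(x[i] * a[row]);
            dst[i] = inv_b[row] * (s * s) + x[i];
        }
    }
    return dst;
}
```

Both functions should agree up to floating-point rounding. The fused form reads x once and writes dst once, which is the kind of memory-traffic saving a single elementwise kernel is meant to provide.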

🐛 Bug Fixes

Tightened `ggml_vk_can_fuse_snake` requirements: now mandates contiguous x and dst tensors, and requires broadcast operands a / inv_b to be tightly packed on the broadcast dim.
Rejected snake fusion when dimension ne[2] or ne[3] > 1.
Updated Vulkan shader naming conventions (T/C renamed to ne0/ne1) and push constants to align with standard Vulkan backend naming.
Refactored C++ side: `ggml_vk_can_fuse_snake` now reuses the `snake_pattern` constant, and broadcast operands a and inv_b are strictly required to be GGML_TYPE_F32 to match the hardcoded float bindings.
Replaced silent f32 fallback in `ggml_vk_snake_dispatch_fused` with an explicit GGML_TYPE_F32 case and GGML_ABORT on default.
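
The checks listed above can be summarized in a small C++ sketch. This is not the code of `ggml_vk_can_fuse_snake` or `ggml_vk_snake_dispatch_fused`: `MockTensor`, `MockType`, `can_fuse_snake_sketch`, `pick_snake_pipeline_sketch`, and the pipeline name `"snake_f32"` are all hypothetical stand-ins, the ne[2] / ne[3] check is assumed to apply to x, and a thrown exception stands in for GGML_ABORT so the behavior can be exercised in a test.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for ggml types; the real backend uses ggml_tensor.
enum class MockType { F32, F16 };

struct MockTensor {
    MockType type;
    std::int64_t ne[4];          // element counts per dimension
    bool contiguous;             // whole tensor is contiguous in memory
    bool packed_on_broadcast;    // tightly packed along the broadcast dim
};

// Mirrors the eligibility rules described in the release notes (a sketch only).
bool can_fuse_snake_sketch(const MockTensor& x, const MockTensor& a,
                           const MockTensor& inv_b, const MockTensor& dst) {
    if (!x.contiguous || !dst.contiguous) return false;                      // contiguous x and dst
    if (!a.packed_on_broadcast || !inv_b.packed_on_broadcast) return false;  // packed broadcast operands
    if (a.type != MockType::F32 || inv_b.type != MockType::F32) return false; // strict F32 operands
    if (x.ne[2] > 1 || x.ne[3] > 1) return false;                            // assumed to be checked on x
    return true;
}

// Explicit F32 case with a hard failure on anything else, instead of a silent fallback.
std::string pick_snake_pipeline_sketch(MockType type) {
    switch (type) {
        case MockType::F32:
            return "snake_f32";  // hypothetical pipeline name
        default:
            throw std::logic_error("unsupported type for fused snake");  // stands in for GGML_ABORT
    }
}
```

Failing loudly on an unexpected type makes a mismatch between the eligibility check and the dispatch visible immediately, rather than running a float kernel on data of another type.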

Affected Symbols

ggml_vk_snake_dispatch_fused, ggml_vk_can_fuse_snake, snake.comp, sin_node