b8495

📅 Mar 23, 2026 📦 llama-cpp

✨ 5 features 🐛 5 fixes 🔧 4 symbols

Summary

This release focuses on DMA and binary-operation fixes for the Hexagon backend, improving the handling of large strides and resolving issues in ssm-conv. It also introduces platform-specific binary updates for broad compatibility.

Migration Steps

If you rely on specific VTCM allocation behavior for ssm-conv, note that single-page allocation is now the default.
If you use hex-dma with large strides, note that the 16-bit stride limitation no longer applies on v75 and up, where hex-dma now uses 2d-wide mode.
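
As a rough illustration of the second migration note, the sketch below checks whether a stride would still hit the 16-bit limit on a given target. The helper name, the integer encoding of the architecture version, and the assumption that targets below v75 keep the old limit are all hypothetical and are not taken from the llama.cpp source.

```python
# Hypothetical helper for illustration only; not part of llama.cpp.
# Assumption: targets below v75 are still bound by the 16-bit stride limit.

STRIDE_16BIT_MAX = (1 << 16) - 1  # largest value an unsigned 16-bit field can hold


def stride_exceeds_limit(arch_version: int, stride: int) -> bool:
    """Return True if `stride` does not fit the 16-bit limit on this target.

    `arch_version` is the Hexagon architecture number as a plain integer
    (e.g. 73 or 75); this encoding is an assumption made for the example.
    """
    if arch_version >= 75:
        # Per the release notes, 2d-wide mode removes the 16-bit stride limitation.
        return False
    return stride > STRIDE_16BIT_MAX


if __name__ == "__main__":
    for arch in (73, 75):
        print(arch, stride_exceeds_limit(arch, 70_000))
```

A check like this may help when auditing tensor layouts that previously had to be split or repacked to stay within the stride limit.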

✨ New Features

Chained DMA is now the default for hex-dma to support newer models.
Added a uint32 dump helper for the Hexagon backend.
Hexagon now uses single-page VTCM allocation to resolve issues with large gather operations in ssm-conv.
Hex-dma now uses 1d mode for reshaping, which supports sizes up to 24 bits (>16MB).
Hex-dma starts using 2d-wide mode on v75 and up, removing the 16-bit stride limitation.
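
The bit widths mentioned above translate into concrete limits. The following is a minimal sketch of that arithmetic only; it assumes the sizes are unsigned and counted in bytes, and the function names are made up for this example rather than taken from hex-dma.

```python
# Illustrative arithmetic only; the actual hex-dma descriptor layout is not
# described in these notes. Sizes are assumed to be unsigned byte counts.


def max_unsigned(bits: int) -> int:
    """Largest value representable in an unsigned field of `bits` bits."""
    return (1 << bits) - 1


MAX_16BIT = max_unsigned(16)  # 65_535
MAX_24BIT = max_unsigned(24)  # 16_777_215


def fits_24bit(size: int) -> bool:
    """True if `size` fits in a 24-bit unsigned field."""
    return 0 <= size <= MAX_24BIT


if __name__ == "__main__":
    print(MAX_16BIT, MAX_24BIT)
    print(fits_24bit(16_000_000))        # 16 MB (decimal) fits
    print(fits_24bit(16 * 1024 * 1024))  # 16 MiB is one byte too large
```

Under these assumptions, a 24-bit field tops out at 16,777,215, which is above 16 MB in decimal units but one short of a full 16 MiB.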

🐛 Bug Fixes

Fixed incorrect stride logic in hex-bin.
Ensured repack buffers are dumped for verbose level > 2.
Hex-bin now consistently uses dma_queue_push even for dummy destination transactions.
Cleaned up the kernel selection logic in hex-bin.
Cleaned up the hex-bin binary op core and fixed transposed tensor handling.

Affected Symbols

hex-dma, hexagon, ssm-conv, hex-bin