Change8

b8495

📦 llama-cpp
5 features · 🐛 5 fixes · 🔧 4 symbols

Summary

This release centers on DMA and binary-operation fixes for the Hexagon backend: it improves handling of large strides and resolves issues in ssm-conv. It also introduces platform-specific binary updates for broad compatibility.

Migration Steps

  1. If you rely on specific VTCM allocation behavior for ssm-conv, note that single-page allocation is now the default.
  2. If you use hex-dma with large strides, note that the 16-bit stride limitation no longer applies on v75 and up, where hex-dma now uses 2d-wide mode.

✨ New Features

  • Chained DMA is now the default for hex-dma to support newer models.
  • Added uint32 dump helper for hexagon backend.
  • Hexagon now uses single-page VTCM allocation to resolve issues with large gather operations in ssm-conv.
  • Hex-dma now uses 1d mode for reshaping, supporting sizes up to 24 bits (about 16 MB).
  • Hex-dma starts using 2d-wide mode on v75 and up, removing the 16-bit stride limitation.
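
To make the 16-bit and 24-bit limits mentioned above concrete, here is a minimal sketch of the arithmetic. It assumes that sizes and strides are byte counts stored in unsigned fields of the stated width; the helper names (`max_unsigned`, `fits`) and the example tensor are hypothetical and are not taken from the Hexagon backend.

```python
# Illustrative arithmetic only; not code from the Hexagon backend.

def max_unsigned(bits: int) -> int:
    """Largest value an unsigned field of `bits` bits can hold."""
    return (1 << bits) - 1

LIMIT_16BIT = max_unsigned(16)  # 65,535
LIMIT_24BIT = max_unsigned(24)  # 16,777,215

def fits(value: int, bits: int) -> bool:
    """True if a non-negative value fits in an unsigned field of `bits` bits."""
    return 0 <= value <= max_unsigned(bits)

# Assumed example: a row of 32,768 F32 elements has a stride of 131,072 bytes.
row_stride = 32768 * 4
stride_ok_16 = fits(row_stride, 16)  # too large for a 16-bit stride field
size_ok_24 = fits(row_stride, 24)    # fits comfortably in a 24-bit size field
```

Under these assumptions, any row stride above 65,535 bytes would not fit a 16-bit stride field, which is the kind of case the 2d-wide mode on v75 and up is described as addressing.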

🐛 Bug Fixes

  • Fixed incorrect stride logic in hex-bin.
  • Ensured repack buffers are dumped for verbose level > 2.
  • Hex-bin now consistently uses dma_queue_push even for dummy destination transactions.
  • Cleaned up the kernel selection logic in hex-bin.
  • Cleaned up the hex-bin binary op core and fixed transposed tensor handling.
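
For readers unfamiliar with why strides and transposed tensors are easy to get wrong, the sketch below shows the general idea. It assumes a ggml-style layout in which each dimension carries a byte stride and dimension 0 varies fastest; the function names are hypothetical and this is not the hex-bin implementation.

```python
# Illustrative only: models byte strides of a 2D tensor and its transposed view.

def contiguous_strides(shape, elem_size):
    """Byte strides of a densely packed tensor, fastest-varying dimension first."""
    strides = []
    step = elem_size
    for dim in shape:
        strides.append(step)
        step *= dim
    return strides

def transpose_2d(shape, strides):
    """Swap the two dimensions without moving any data."""
    return [shape[1], shape[0]], [strides[1], strides[0]]

def byte_offset(index, strides):
    """Byte offset of an element given its per-dimension index."""
    return sum(i * s for i, s in zip(index, strides))

def is_contiguous(shape, strides, elem_size):
    """True if the strides match a densely packed layout for this shape."""
    return strides == contiguous_strides(shape, elem_size)

shape = [4, 3]                                      # 4 elements per row, 3 rows
strides = contiguous_strides(shape, 4)              # [4, 16] for 4-byte elements
t_shape, t_strides = transpose_2d(shape, strides)   # [3, 4] and [16, 4]
```

In the transposed view the fastest-varying dimension no longer has a stride equal to the element size, so a kernel that assumes packed rows would read the wrong bytes. Code that walks such a tensor has to use the per-dimension strides instead.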

Affected Symbols