b8392

📅 Mar 17, 2026 · 📦 llama-cpp

🐛 2 fixes · 🔧 3 symbols

Summary

This release fixes a critical bug in the KLEIDIAI backend: the supports_op() check incorrectly rejected MUL_MAT operations with batched 3D inputs, which led to crashes during graph scheduling. It also relaxes the buffer check in supports_op() so that the function can be called during weight loading.

🐛 Bug Fixes

Fixed an issue where the supports_op() check incorrectly rejected MUL_MAT operations with 3D inputs (batched inputs), causing crashes during graph scheduling for models with Q4_0/Q8_0 weights when n_seq_max > 1.
Relaxed the buffer check in supports_op() to allow it to be called during weight loading when the source buffer pointer is NULL.
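
The two fixes above can be pictured with a minimal C++ sketch. It is not the actual ggml or KleidiAI code: `Tensor`, `Buffer`, `OpKind`, `WeightType`, and `supports_mul_mat` are hypothetical stand-ins, and the "weights are 2D" and "no 4D inputs" conditions are assumptions made only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; these are NOT the real ggml types.
enum class OpKind { MulMat, Other };
enum class WeightType { Q4_0, Q8_0, F32 };

struct Buffer {
    bool is_kleidiai;  // true if the buffer belongs to this backend
};

struct Tensor {
    std::array<int64_t, 4> ne;  // dimensions; ne[2] is treated as the batch dimension
    WeightType type;
    const Buffer * buffer;      // may be null while weights are still being loaded
};

// Sketch of a support check with both fixes applied (assumed logic).
bool supports_mul_mat(OpKind op, const Tensor & weights, const Tensor & input) {
    if (op != OpKind::MulMat) {
        return false;
    }
    if (weights.type != WeightType::Q4_0 && weights.type != WeightType::Q8_0) {
        return false;
    }
    // Relaxed buffer check: a null buffer is tolerated (weight loading);
    // only a non-null buffer owned by another backend is rejected.
    if (weights.buffer != nullptr && !weights.buffer->is_kleidiai) {
        return false;
    }
    // Assumption: the weight tensor itself is a plain 2D matrix.
    if (weights.ne[2] != 1 || weights.ne[3] != 1) {
        return false;
    }
    // Batched 3D inputs (input.ne[2] > 1) are accepted.
    // Assumption: 4D inputs are still rejected in this sketch.
    return input.ne[3] == 1;
}

// Usage example: a batched input with a not-yet-allocated weight buffer.
bool example_batched_during_load() {
    Tensor weights{{64, 32, 1, 1}, WeightType::Q4_0, nullptr};
    Tensor input{{64, 8, 4, 1}, WeightType::F32, nullptr};  // batch of 4
    return supports_mul_mat(OpKind::MulMat, weights, input);
}
```

In this sketch the check returns true for a batch of four sequences even though the weight buffer pointer is still null, which corresponds to the two cases the release notes describe as previously rejected.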

Affected Symbols

kleidiai
supports_op()
compute_forward_qx()