b8392
📦 llama-cpp
🐛 2 fixes · 🔧 3 symbols
Summary
This release fixes a critical bug in the KLEIDIAI backend: supports_op() incorrectly rejected MUL_MAT operations with batched (3D) inputs, which caused crashes during graph scheduling. The buffer check in supports_op() was also relaxed so that the function can be called during weight loading.
🐛 Bug Fixes
- Fixed an issue where the supports_op() check incorrectly rejected MUL_MAT operations with 3D inputs (batched inputs), causing crashes during graph scheduling for models with Q4_0/Q8_0 weights when n_seq_max > 1.
- Relaxed the buffer check in supports_op() to allow it to be called during weight loading when the source buffer pointer is NULL.
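The two fixes above can be illustrated with a minimal sketch of a supports_op()-style check. This is not the actual KLEIDIAI or ggml code: `SketchTensor`, `SketchType`, and `sketch_supports_mul_mat` are hypothetical stand-ins, and the F32 input requirement, the 2D weight requirement, and the rejection of 4D inputs are assumptions made only to keep the example small.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only; these are NOT the real ggml types.
enum class SketchType { F32, Q4_0, Q8_0 };

struct SketchTensor {
    SketchType  type;
    int64_t     ne[4];   // extents per dimension; ne[2] > 1 means a batched (3D) tensor
    const void* buffer;  // owning buffer; may be null while weights are still loading
};

// Returns true if this sketch backend would accept MUL_MAT(weights, input).
bool sketch_supports_mul_mat(const SketchTensor& weights,
                             const SketchTensor& input,
                             const void*         backend_buffer) {
    // Only the quantized weight types named in the release notes are considered.
    if (weights.type != SketchType::Q4_0 && weights.type != SketchType::Q8_0) {
        return false;
    }
    // Assumption: weights are plain 2D matrices.
    if (weights.ne[2] != 1 || weights.ne[3] != 1) {
        return false;
    }
    // Relaxed buffer check: a null buffer (weight loading) is tolerated;
    // a non-null buffer must belong to this backend.
    if (weights.buffer != nullptr && weights.buffer != backend_buffer) {
        return false;
    }
    // Assumption: activations are F32 and share the inner dimension with the weights.
    if (input.type != SketchType::F32 || input.ne[0] != weights.ne[0]) {
        return false;
    }
    // Batched 3D inputs (ne[2] >= 1) are accepted.
    // Assumption: 4D inputs are still rejected in this sketch.
    return input.ne[3] == 1;
}
```

A stricter check that required `input.ne[2] == 1`, or that rejected a null `weights.buffer`, would reproduce the two failure modes described in the bullets: batched inputs rejected at scheduling time, and the check failing when called before the weight buffer is assigned.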