b9873
📦 llama-cpp
🐛 1 fix 🔧 5 symbols
Summary
This release fixes a critical bug in which setting the K/V rotation inputs could trigger an abort when their tensor buffer was unallocated during graph processing. It also ships pre-compiled binaries for a range of operating systems and hardware configurations.
🐛 Bug Fixes
- Added a guard check for the K/V rotation inputs in llm_graph_input_attn_kv::set_input and llm_graph_input_attn_kv_iswa::set_input. The guard prevents an abort when the tensor buffer is unallocated (NULL) during graph operations such as the KV-injection pass of DFlash speculative decoding.
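
The guard pattern described in this fix can be illustrated with a minimal, self-contained sketch. This is not the llama.cpp code: mock_tensor and set_rot_input are hypothetical names invented for illustration, and the real set_input methods work on ggml tensors and backend buffers. The sketch only shows the idea of returning early when the input tensor has no allocated buffer instead of writing to it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a graph input tensor (NOT the real ggml_tensor).
struct mock_tensor {
    void * buffer = nullptr; // backing storage; nullptr means "never allocated"
    size_t nbytes = 0;       // capacity of the backing storage in bytes
};

// Hypothetical setter for a rotation input (an assumption for illustration).
// Returns true if data was written, false if the write was skipped.
bool set_rot_input(mock_tensor * t, const std::vector<int32_t> & src) {
    // Guard: the tensor may be missing from the graph, or present but left
    // unallocated by the scheduler. Skip the write instead of aborting.
    if (t == nullptr || t->buffer == nullptr) {
        return false;
    }

    const size_t n = src.size() * sizeof(int32_t);
    if (n > t->nbytes) {
        return false; // refuse to overflow the destination
    }

    std::memcpy(t->buffer, src.data(), n);
    return true;
}
```

With this shape, a caller that passes a default-constructed mock_tensor (buffer left as nullptr) gets false back and nothing is written, while a tensor with allocated storage is filled as usual. Whether the upstream guard skips silently or reports the condition is not stated in these notes.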