b8964
📦 llama-cpp
🐛 1 fix
Summary
This release fixes a critical bug in which the reasoning budget was not re-armed after reaching the DONE state, so later reasoning blocks consumed tokens without any budget applied. It also ships pre-compiled binaries for a broad range of platforms.
🐛 Bug Fixes
- Re-armed the reasoning budget after the DONE state. Previously, every <think> block after the first one ran unbudgeted; this was observed in particular with models such as unsloth/Qwen3.6-27B-GGUF.
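
To illustrate the kind of failure described in this fix, here is a minimal sketch of a budget tracker modeled as a small state machine. It is not llama.cpp's implementation: the class `ReasoningBudget`, the `rearm` flag, the `IDLE` and `COUNTING` states, and the helper `count_forced_closes` are all hypothetical names chosen for the example, and only the DONE state and the <think> marker come from the release note. The sketch assumes a per-block token budget and treats each list element as one token.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()      # no reasoning block seen yet, budget armed
    COUNTING = auto()  # inside a <think> block, tokens are counted
    DONE = auto()      # block closed or budget exhausted


class ReasoningBudget:
    """Hypothetical tracker for illustration; not llama.cpp's actual code."""

    def __init__(self, budget: int, rearm: bool = True):
        self.budget = budget
        self.rearm = rearm  # False reproduces the behavior described as the bug
        self.remaining = budget
        self.state = State.IDLE

    def feed(self, token: str) -> bool:
        """Consume one token; return True if the block should be force-closed."""
        if token == "<think>":
            armed = self.state is State.IDLE
            rearmed = self.state is State.DONE and self.rearm
            if armed or rearmed:
                # Reset the counter so this block gets a full budget.
                self.remaining = self.budget
                self.state = State.COUNTING
            return False
        if self.state is not State.COUNTING:
            return False  # tokens outside a counted block are never limited
        if token == "</think>":
            self.state = State.DONE
            return False
        self.remaining -= 1
        if self.remaining <= 0:
            self.state = State.DONE
            return True
        return False


def count_forced_closes(tokens, budget, rearm):
    """Count how many times the tracker asks for a block to be closed."""
    tracker = ReasoningBudget(budget, rearm)
    return sum(tracker.feed(t) for t in tokens)


# Two reasoning blocks, each longer than a budget of 2 tokens.
stream = ["<think>", "a", "b", "c", "</think>", "x",
          "<think>", "d", "e", "f", "g"]
```

With `rearm=True` both blocks in `stream` are cut off once they reach the budget, while `rearm=False` limits only the first block and lets the second run uncounted, which is the symptom this release describes.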