b8964

📅 Apr 28, 2026 · 📦 llama-cpp

🐛 1 fix

Summary

This release fixes a critical bug in which the reasoning budget was not re-armed after reaching the DONE state, so subsequent reasoning steps consumed tokens without any budget applied. Pre-compiled binaries are also provided for a broad range of platforms.

🐛 Bug Fixes

Re-armed the reasoning budget after a DONE state. Previously, every <think> block after the first one ran unbudgeted; this was observed in particular with models such as unsloth/Qwen3.6-27B-GGUF.
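
To illustrate the kind of state-machine behavior this fix describes, here is a minimal sketch. It is not the llama.cpp implementation: the `BudgetState` enum, the `ReasoningBudget` struct, the `rearm_after_done` switch, and the `count_budgeted` helper are all hypothetical names invented for this example, and tokens are modeled as plain strings. The only idea taken from the release note is that a tracker stuck in DONE ignores later <think> blocks unless it is re-armed.

```cpp
#include <string>
#include <vector>

// Hypothetical states for illustration; not the actual llama.cpp enum.
enum class BudgetState { Idle, Counting, Done };

// Hypothetical budget tracker (assumption, not the real sampler code).
struct ReasoningBudget {
    int         limit;             // max reasoning tokens per <think> block
    int         remaining;         // tokens left in the current block
    bool        rearm_after_done;  // false models the old behavior, true the fix
    BudgetState state = BudgetState::Idle;

    ReasoningBudget(int max_tokens, bool rearm)
        : limit(max_tokens), remaining(max_tokens), rearm_after_done(rearm) {}

    // Feed one token; returns true if it was counted against the budget.
    bool accept(const std::string & tok) {
        if (tok == "<think>") {
            const bool can_arm = state == BudgetState::Idle ||
                                 (state == BudgetState::Done && rearm_after_done);
            if (can_arm) {
                remaining = limit;  // start a fresh budget for this block
                state     = BudgetState::Counting;
            }
            return false;
        }
        if (state != BudgetState::Counting) {
            return false;           // not tracking: token passes unbudgeted
        }
        if (tok == "</think>") {
            state = BudgetState::Done;
            return false;
        }
        if (--remaining == 0) {
            state = BudgetState::Done;  // a real sampler would force the block to end here
        }
        return true;
    }
};

// Count how many tokens in a stream were actually charged to the budget.
int count_budgeted(ReasoningBudget & budget, const std::vector<std::string> & tokens) {
    int counted = 0;
    for (const auto & tok : tokens) {
        if (budget.accept(tok)) {
            counted++;
        }
    }
    return counted;
}
```

In this sketch, a stream with two <think> blocks (two reasoning tokens in the first, three in the second) and a limit of 8 has all five reasoning tokens charged when `rearm_after_done` is true, but only the first two when it is false, because the second block never leaves the DONE state.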