
b8964

📦 llama-cpp
🐛 1 fix

Summary

This release fixes a critical bug in which the reasoning budget was not re-armed after reaching the DONE state, so subsequent reasoning steps consumed tokens with no budget applied. It also ships pre-compiled binaries for a broad range of platforms.

🐛 Bug Fixes

  • Re-armed the reasoning budget after a DONE state. Previously, every <think> block after the first one ran unbudgeted; this was observed in particular with models such as unsloth/Qwen3.6-27B-GGUF.
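
To illustrate the class of bug described above, here is a minimal sketch of a budget tracker modeled as a small state machine. It is not the llama.cpp implementation: the names `State`, `ReasoningBudget`, `feed`, and `count_forced`, the `rearm` switch, and the one-token-per-string simplification are all assumptions made for illustration. The only point it demonstrates is that a tracker left in DONE ignores later <think> blocks unless it is reset.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()      # waiting for a <think> opener
    COUNTING = auto()  # inside a reasoning block, spending budget
    DONE = auto()      # the block ended, either naturally or by force


class ReasoningBudget:
    """Hypothetical tracker; rearm=False reproduces the pre-fix behaviour."""

    def __init__(self, budget, rearm=True):
        self.budget = budget
        self.rearm = rearm
        self.state = State.IDLE
        self.remaining = budget

    def feed(self, token):
        """Return True when the caller should force the block closed."""
        # The fix, in sketch form: leaving DONE restores the full budget.
        if self.state is State.DONE and self.rearm:
            self.state = State.IDLE
            self.remaining = self.budget

        if self.state is State.IDLE:
            if token == "<think>":
                self.state = State.COUNTING
            return False

        if self.state is State.COUNTING:
            if token == "</think>":
                self.state = State.DONE
                return False
            self.remaining -= 1
            if self.remaining <= 0:
                self.state = State.DONE
                return True

        # Stuck in DONE without re-arming: nothing is budgeted any more.
        return False


def count_forced(tokens, budget, rearm):
    """Count how many times the budget forced a reasoning block to close."""
    tracker = ReasoningBudget(budget, rearm)
    return sum(tracker.feed(t) for t in tokens)


# Two reasoning blocks, each longer than a budget of 2 tokens allows.
tokens = ["<think>", "a", "b", "</think>", "x",
          "<think>", "d", "e", "</think>"]
```

With this toy stream, `count_forced(tokens, 2, rearm=True)` returns 2 because both blocks hit the limit, while `count_forced(tokens, 2, rearm=False)` returns 1 because the second block is never counted.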