b9271

📅 May 21, 2026 · 📦 llama-cpp

🐛 1 fix · 🔧 1 symbol

Summary

This release improves performance by skipping a redundant logit computation during the draft model's follow-up decode. Pre-compiled binaries are available for a range of operating systems and hardware configurations.

🐛 Bug Fixes

Fixed the draft model computing logits unnecessarily during its follow-up decode: inp_out_ids is now used to skip that computation.
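As a rough illustration of the general idea, and not llama.cpp's actual code, the sketch below shows how a list of output row indices can limit the final projection to the tokens whose logits are needed. The function name `project_selected_rows`, its parameters, and the toy values are hypothetical, and plain Python lists stand in for tensors. It assumes that an empty index list means no logits are requested.

```python
def project_selected_rows(hidden, out_weight, out_ids):
    """Compute logits only for the rows of `hidden` listed in `out_ids`.

    hidden:     list of n_tokens rows, each a list of n_embd floats
    out_weight: list of n_vocab rows, each a list of n_embd floats
    out_ids:    indices of the tokens whose logits are actually needed
                (hypothetical stand-in for an output-index tensor)
    """
    # Gather first, so the expensive projection touches only requested rows.
    selected = [hidden[i] for i in out_ids]
    # With an empty out_ids, `selected` is empty and no projection work is done.
    return [
        [sum(h * w for h, w in zip(row, w_row)) for w_row in out_weight]
        for row in selected
    ]


hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]       # 3 tokens, n_embd = 2
out_weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # n_vocab = 3

last_only = project_selected_rows(hidden, out_weight, [2])   # logits for one token
none_needed = project_selected_rows(hidden, out_weight, [])  # nothing projected
```

Here `last_only` holds a single row of logits for the last token, while `none_needed` is empty because no output rows were requested. The cost of the projection scales with the number of selected rows rather than with the number of tokens decoded.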

Affected Symbols

mtp