b9550

📅 Jun 7, 2026 📦 llama-cpp

🐛 1 fix 🔧 1 symbol

Summary

This release fixes a K/V cache bug in which assistant views sized for the draft default context could exceed the shared K/V tensors when the fitted target context was smaller. View sizes now stay consistent with the shared tensors. Pre-compiled binaries are also provided for a range of operating systems and hardware configurations.

🐛 Bug Fixes

Fixed an issue where oversized assistant views could overflow shared K/V tensors and trigger a size assert during graph reserve when a fitted target context was smaller than the draft default.
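
The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the llama.cpp implementation: `SharedKV`, `KVView`, `make_view_unclamped`, `make_view_clamped`, and `reserve_view` are hypothetical names invented for this example, and clamping the requested size to the shared capacity is only one plausible way to keep the view within bounds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a K/V buffer shared between a target and an
// assistant (draft) context. Not a llama.cpp type.
struct SharedKV {
    uint32_t n_cells;        // cells actually allocated (the fitted target context)
    size_t   bytes_per_cell; // assumed per-cell storage size
    size_t size_bytes() const { return static_cast<size_t>(n_cells) * bytes_per_cell; }
};

// Hypothetical view over the shared buffer.
struct KVView {
    uint32_t n_cells;
    size_t   bytes_per_cell;
    size_t size_bytes() const { return static_cast<size_t>(n_cells) * bytes_per_cell; }
};

// True when the view does not extend past the end of the shared buffer.
bool view_fits(const SharedKV & kv, const KVView & v) {
    return v.size_bytes() <= kv.size_bytes();
}

// Buggy pattern: size the view from the requested (draft default) context,
// ignoring how many cells the shared buffer really has.
KVView make_view_unclamped(const SharedKV & kv, uint32_t n_ctx_requested) {
    return KVView{n_ctx_requested, kv.bytes_per_cell};
}

// Safer pattern (an assumption for this sketch): never let the view cover
// more cells than the shared buffer holds.
KVView make_view_clamped(const SharedKV & kv, uint32_t n_ctx_requested) {
    return KVView{std::min(n_ctx_requested, kv.n_cells), kv.bytes_per_cell};
}

// Mirrors the kind of size check that would fire while reserving a graph.
void reserve_view(const SharedKV & kv, const KVView & v) {
    assert(view_fits(kv, v) && "view exceeds shared K/V tensor");
}
```

With a shared buffer of 4096 cells and a requested context of 8192, the unclamped view is twice the size of the buffer and `reserve_view` would abort, whereas the clamped view is limited to 4096 cells and passes the check. The numbers are illustrative only.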

Affected Symbols

kv-cache