b9550
📦 llama-cpp
🐛 1 fix · 🔧 1 symbol
Summary
This release fixes a critical K/V cache bug that appears when the fitted target context is smaller than the draft default, so that tensor view sizes stay within the shared K/V tensors. It also ships pre-compiled binaries for a range of operating systems and hardware configurations.
🐛 Bug Fixes
- Fixed an issue where oversized assistant views could overflow shared K/V tensors and trigger a size assert during graph reserve when a fitted target context was smaller than the draft default.
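
The invariant behind this fix can be illustrated with a minimal sketch: a view into a shared K/V buffer must never cover more cells than the buffer actually holds. Every name below (`kv_buffer`, `view_fits`, `fit_view_cells`) is hypothetical and chosen for illustration only. This is not the llama.cpp API, and the actual patch may resolve the mismatch differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a shared K/V tensor; not a llama.cpp type.
struct kv_buffer {
    uint32_t n_cells; // cells allocated for the (fitted) target context
};

// Returns true when a view of n_cells stays inside the shared buffer.
inline bool view_fits(const kv_buffer & shared, uint32_t n_cells) {
    return n_cells <= shared.n_cells;
}

// Clamps a requested view length (for example, a larger draft default)
// to the size of the shared buffer, so a later size check cannot fail.
inline uint32_t fit_view_cells(const kv_buffer & shared, uint32_t requested_cells) {
    const uint32_t fitted = std::min(requested_cells, shared.n_cells);
    assert(view_fits(shared, fitted)); // invariant: view never exceeds the buffer
    return fitted;
}
```

With a shared buffer of 2048 cells and a requested view of 4096 cells (both numbers are made up), `view_fits` reports the mismatch and `fit_view_cells` returns 2048, which is the kind of consistency the fix is described as restoring.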