b9574
📦 llama-cpp
✨ 1 feature · 🐛 1 fix · 🔧 1 symbol
Summary
This release focuses on server stability by ensuring VRAM caches are correctly exported to RAM for idle slots, preventing redundant processing. It also provides numerous updated pre-built binaries across supported platforms.
✨ New Features
- Server component now ensures idle slots always export their VRAM cache to RAM to prevent needless preprocessing in subsequent slots.
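As a rough illustration of why exporting an idle slot's cache helps, the toy model below tracks how many prompt tokens have to be processed when a second slot receives a prompt that shares a prefix with one already handled. It is a minimal sketch under stated assumptions, not the llama.cpp implementation: `Slot`, `Server`, `release`, `run`, and the dictionary used as a RAM cache are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """Toy stand-in for a server slot (hypothetical, not a llama.cpp type)."""
    slot_id: int
    vram_cache: tuple = ()   # tokens whose cached state is resident in "VRAM"
    busy: bool = False


@dataclass
class Server:
    """Toy server holding slots and a shared RAM-side cache (hypothetical)."""
    slots: list
    ram_cache: dict = field(default_factory=dict)  # token prefix -> saved state
    tokens_processed: int = 0

    def release(self, slot):
        """Mark a slot idle and always export its VRAM cache to RAM."""
        slot.busy = False
        if slot.vram_cache:
            self.ram_cache[slot.vram_cache] = f"state-from-slot-{slot.slot_id}"

    def run(self, slot, prompt):
        """Process a prompt, reusing the longest matching prefix found in RAM."""
        prompt = tuple(prompt)
        slot.busy = True
        # Longest cached entry that is a prefix of the new prompt, if any.
        best = max(
            (k for k in self.ram_cache if prompt[:len(k)] == k),
            key=len,
            default=(),
        )
        # Only the tokens after the reused prefix need to be processed.
        self.tokens_processed += len(prompt) - len(best)
        slot.vram_cache = prompt
        self.release(slot)


server = Server(slots=[Slot(0), Slot(1)])
server.run(server.slots[0], [1, 2, 3, 4])      # cold start: 4 tokens processed
server.run(server.slots[1], [1, 2, 3, 4, 5])   # prefix reused: 1 new token
print(server.tokens_processed)                 # prints 5
```

Without the export in `release`, the second slot would find nothing in RAM and would process all five tokens, for a total of nine instead of five.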
🐛 Bug Fixes
- Fixed an issue where slots without a unified KV cache were being cleared unnecessarily.