b9574

📦 llama-cpp
✨ 1 feature · 🐛 1 fix · 🔧 1 symbol

Summary

This release focuses on server stability: idle slots now always export their VRAM cache to RAM, which prevents redundant preprocessing in subsequent slots. It also ships updated pre-built binaries for the supported platforms.

✨ New Features

  • The server now ensures that idle slots always export their VRAM cache to RAM, preventing needless preprocessing in subsequent slots.
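
To illustrate why exporting an idle slot's cache helps, here is a minimal toy model in Python. It is only a sketch under stated assumptions, not llama.cpp's actual implementation: `Slot`, `RamCache`, `export_idle_slots`, and `tokens_to_process` are hypothetical names invented for this example, and the "cache" is reduced to a list of token ids rather than real KV tensors.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """Hypothetical stand-in for a server slot (not llama.cpp's real struct)."""
    slot_id: int
    busy: bool = False
    # Tokens whose cached state currently lives in VRAM (toy representation).
    vram_tokens: list[int] = field(default_factory=list)


class RamCache:
    """Hypothetical host-side store of exported caches, kept as token sequences."""

    def __init__(self) -> None:
        self.entries: set[tuple[int, ...]] = set()

    def export(self, slot: Slot) -> None:
        # Copy the slot's cached token sequence to RAM; empty caches are skipped.
        if slot.vram_tokens:
            self.entries.add(tuple(slot.vram_tokens))

    def longest_prefix(self, prompt: list[int]) -> int:
        """Length of the longest saved entry that is a prefix of `prompt`."""
        best = 0
        for entry in self.entries:
            n = len(entry)
            if n <= len(prompt) and tuple(prompt[:n]) == entry:
                best = max(best, n)
        return best


def export_idle_slots(slots: list[Slot], ram: RamCache) -> None:
    """Export the cache of every idle slot so other slots can reuse it."""
    for slot in slots:
        if not slot.busy:
            ram.export(slot)


def tokens_to_process(prompt: list[int], ram: RamCache) -> int:
    """Tokens a fresh slot must still process after restoring from RAM."""
    return len(prompt) - ram.longest_prefix(prompt)


if __name__ == "__main__":
    slots = [Slot(0, vram_tokens=[1, 2, 3, 4]), Slot(1)]
    ram = RamCache()
    prompt = [1, 2, 3, 4, 5, 6]

    print(tokens_to_process(prompt, ram))  # before export: whole prompt
    export_idle_slots(slots, ram)
    print(tokens_to_process(prompt, ram))  # after export: only the new suffix
```

In this toy model, a new request that shares a prefix with an idle slot's cache only has to process the tokens past that prefix once the cache has been exported; without the export, the full prompt is processed again.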

🐛 Bug Fixes

  • Fixed an issue where slots without a unified KV cache were being cleared unnecessarily.

Affected Symbols