b9557
📦 llama-cpp
🐛 3 fixes · 🔧 3 symbols
Summary
This release improves the stability of CUDA context management: the device is now reset after its memory size has been read, and the buffer counting logic has been refined. Several platform builds have been disabled.
Migration Steps
- The backend_free function has been moved. If your code depends on its previous placement, review it against the new location.
🐛 Bug Fixes
- Reset the CUDA device in the get_memory function when no backend is active.
- Count device and host buffers in memory size calculation.
- Exclude HIP and MUSA from buffer counting and device reset operations in CUDA context management.
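
The three fixes above can be read together as one rule: after querying memory, release the device only if nothing is using it, and only on CUDA proper. The following is a minimal sketch of that reading, not the actual llama.cpp code. FakeDevice, DeviceUsage, Platform, and query_memory are hypothetical names introduced for illustration, and FakeDevice stands in for the GPU runtime so the example runs without a GPU. It also assumes that a memory query creates a device context as a side effect, which is why a reset afterwards is useful.

```cpp
#include <cstddef>

// Hypothetical stand-in for the GPU runtime (no real CUDA calls are made).
struct FakeDevice {
    bool        context_alive = false;  // set once a query touches the device
    int         resets        = 0;      // how many times reset() was called
    std::size_t free_bytes    = 6ull * 1024 * 1024 * 1024;
    std::size_t total_bytes   = 8ull * 1024 * 1024 * 1024;

    // Assumption: querying memory implicitly creates a device context.
    void mem_get_info(std::size_t * free_out, std::size_t * total_out) {
        context_alive = true;
        *free_out  = free_bytes;
        *total_out = total_bytes;
    }

    void reset() {
        context_alive = false;
        ++resets;
    }
};

// Hypothetical bookkeeping: what currently depends on the device context.
struct DeviceUsage {
    int active_backends = 0;
    int device_buffers  = 0;
    int host_buffers    = 0;

    // Both device and host buffers count as "in use".
    bool in_use() const {
        return active_backends + device_buffers + host_buffers > 0;
    }
};

enum class Platform { CUDA, HIP, MUSA };

// Read the memory size, then reset the device if nothing needs the context.
// HIP and MUSA skip the counting and the reset entirely.
void query_memory(FakeDevice & dev, const DeviceUsage & usage, Platform platform,
                  std::size_t * free_out, std::size_t * total_out) {
    dev.mem_get_info(free_out, total_out);

    if (platform != Platform::CUDA) {
        return;  // excluded platforms: leave the context alone
    }
    if (!usage.in_use()) {
        dev.reset();  // no backend or buffer is alive, so release the context
    }
}
```

In this sketch, calling query_memory with a default-constructed DeviceUsage on Platform::CUDA resets the device once, while the same call with a live buffer, or on HIP or MUSA, leaves the context in place. Counting host buffers alongside device buffers matters because resetting while either kind is still allocated would invalidate it.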