b9557

📅 Jun 8, 2026 · 📦 llama-cpp

🐛 3 fixes · 🔧 3 symbols

Summary

This release improves the stability of CUDA context management: the device is now reset after the memory size is read when no backend is active, and the buffer counting logic has been refined. Several platform builds have been disabled.

Migration Steps

If your code depends on where the backend_free function was previously located, note that the function has been moved.

🐛 Bug Fixes

Reset the CUDA device in the get_memory function if no backend is active.
Count both device and host buffers in the memory size calculation.
Exclude HIP and MUSA from the buffer counting and device reset operations in CUDA context management.
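
The three fixes above can be read together as one guard around the memory query. The following is a minimal sketch of that reading, not the actual llama.cpp code: `DeviceRuntime`, `ContextUsage`, `kResetSupported`, and `get_memory_sketch` are hypothetical names, the runtime calls are passed in as callbacks so the example runs without a GPU, and both the use of `GGML_USE_HIP` / `GGML_USE_MUSA` as build switches and the idea that live buffers block the reset are assumptions.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical stand-in for the GPU runtime. In a real CUDA build the two
// callbacks would wrap cudaMemGetInfo and cudaDeviceReset.
struct DeviceRuntime {
    std::function<void(std::size_t & free_bytes, std::size_t & total_bytes)> mem_get_info;
    std::function<void()> device_reset;
};

// Hypothetical bookkeeping of what currently depends on the device context.
struct ContextUsage {
    int active_backends = 0;
    int device_buffers  = 0;
    int host_buffers    = 0;
};

// Assumption: HIP and MUSA builds opt out of the counting and reset path.
#if defined(GGML_USE_HIP) || defined(GGML_USE_MUSA)
constexpr bool kResetSupported = false;
#else
constexpr bool kResetSupported = true;
#endif

// Query free and total memory, then release the context that the query
// created, but only when nothing else is still using the device.
void get_memory_sketch(const DeviceRuntime & rt, const ContextUsage & usage,
                       std::size_t & free_bytes, std::size_t & total_bytes) {
    rt.mem_get_info(free_bytes, total_bytes);

    const bool in_use = usage.active_backends > 0 ||
                        usage.device_buffers  > 0 ||
                        usage.host_buffers    > 0;

    if (kResetSupported && !in_use) {
        rt.device_reset();
    }
}
```

Passing the runtime calls as callbacks is only a convenience for this sketch: it keeps the guard logic testable on a machine without a GPU.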

Affected Symbols

CUDA context management
get_memory function
backend_free function