b8646
📦 llama-cpp
🐛 1 fix 🔧 1 symbol
Summary
This release reuses compute graph buffers in the RPC system, which partially mitigates a memory leak caused by the CUDA backend using buffer addresses as cache keys. Pre-built binaries are provided for multiple platforms and hardware configurations.
🐛 Bug Fixes
- Reused compute graph buffers in RPC to partially address a memory leak caused by the CUDA backend using buffer addresses as cache keys.
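To illustrate why reusing a buffer can help here, the following is a minimal sketch, not llama.cpp code. It assumes a cache that keys its entries on the address of the buffer it is handed and never evicts them. `AddressKeyedCache` and both helper functions are hypothetical names introduced only for this example. The fresh buffers are deliberately kept alive so that their addresses are guaranteed to be distinct; in a real process a freed address may or may not be handed out again.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a backend cache that uses the buffer address as its key.
struct AddressKeyedCache {
    std::unordered_map<std::uintptr_t, int> entries;

    // Record a use of `buf`; an address not seen before creates a new entry
    // that is never evicted.
    void touch(const void * buf) {
        ++entries[reinterpret_cast<std::uintptr_t>(buf)];
    }

    std::size_t size() const { return entries.size(); }
};

// Every graph arrives in a newly allocated buffer. The buffers are kept alive
// in `live` so each one has a distinct address, making the result deterministic.
std::size_t cache_entries_with_fresh_buffers(int n_graphs, std::size_t bytes) {
    AddressKeyedCache cache;
    std::vector<std::unique_ptr<std::uint8_t[]>> live;
    for (int i = 0; i < n_graphs; ++i) {
        live.push_back(std::make_unique<std::uint8_t[]>(bytes));
        cache.touch(live.back().get());
    }
    return cache.size();
}

// Every graph is written into the same long-lived buffer, so the key repeats
// and the cache holds a single entry.
std::size_t cache_entries_with_reused_buffer(int n_graphs, std::size_t bytes) {
    AddressKeyedCache cache;
    std::vector<std::uint8_t> buffer(bytes);
    for (int i = 0; i < n_graphs; ++i) {
        cache.touch(buffer.data());
    }
    return cache.size();
}
```

Under these assumptions, 100 graphs in fresh buffers leave 100 cache entries, while 100 graphs in one reused buffer leave a single entry. The sketch does not model why the release notes describe the fix as only partial.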