b8832
📦 llama-cpp
✨ 3 features · 🔧 1 symbol
Summary
This release improves CUDA graph management by adding LRU-based eviction, raising the CUDA graph limit to 128, and introducing a periodic clean-up mechanism. It also ships pre-built binaries for multiple operating systems and hardware targets.
✨ New Features
- Implemented an LRU-based eviction strategy for CUDA graphs.
- Increased the limit for CUDA graphs to 128.
- Introduced a periodic clean-up mechanism for CUDA graphs.
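
To illustrate how the three changes above could fit together, here is a minimal C++ sketch of a bounded LRU cache with an idle-entry sweep. It is not the llama.cpp implementation: `GraphEntry`, `GraphLruCache`, the 64-bit key, and the logical-clock clean-up rule are all hypothetical names and assumptions made for illustration. A real entry would own CUDA graph handles and release them when evicted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>

// Hypothetical stand-in for a captured CUDA graph. A real entry would hold
// graph handles and destroy them on eviction.
struct GraphEntry {
    int id;
};

// Bounded LRU cache (illustrative only, not the llama.cpp data structure).
class GraphLruCache {
public:
    explicit GraphLruCache(std::size_t capacity) : capacity_(capacity), clock_(0) {
        assert(capacity_ > 0);
    }

    // Returns the entry and marks it most recently used, or nullptr on a miss.
    GraphEntry * find(std::uint64_t key) {
        auto it = index_.find(key);
        if (it == index_.end()) {
            return nullptr;
        }
        it->second->last_used = ++clock_;
        order_.splice(order_.begin(), order_, it->second); // move to front
        return &it->second->entry;
    }

    // Inserts or replaces an entry; evicts the least recently used one when full.
    void insert(std::uint64_t key, GraphEntry entry) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            it->second->entry     = entry;
            it->second->last_used = ++clock_;
            order_.splice(order_.begin(), order_, it->second);
            return;
        }
        if (order_.size() >= capacity_) {
            index_.erase(order_.back().key);
            order_.pop_back();
        }
        order_.push_front(Item{key, entry, ++clock_});
        index_[key] = order_.begin();
    }

    // Clean-up pass: drops entries idle for more than max_idle logical ticks.
    // The list is ordered by recency, so scanning from the back is sufficient.
    std::size_t cleanup(std::uint64_t max_idle) {
        std::size_t removed = 0;
        while (!order_.empty() && clock_ - order_.back().last_used > max_idle) {
            index_.erase(order_.back().key);
            order_.pop_back();
            ++removed;
        }
        return removed;
    }

    std::size_t size() const { return order_.size(); }

private:
    struct Item {
        std::uint64_t key;
        GraphEntry    entry;
        std::uint64_t last_used; // logical tick of the last hit or insert
    };

    std::size_t     capacity_;
    std::uint64_t   clock_;
    std::list<Item> order_; // front = most recently used
    std::unordered_map<std::uint64_t, std::list<Item>::iterator> index_;
};
```

Under these assumptions, a cache built as `GraphLruCache cache(128)` would mirror the new limit, with `cleanup` called at whatever interval the host code chooses. The list-plus-hash-map layout keeps lookup, insertion, and eviction at O(1) on average.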