Change8

b8832

📦 llama-cpp
3 features 🔧 1 symbol

Summary

This release improves CUDA graph management: cached graphs are now evicted in least-recently-used (LRU) order, the graph limit is raised to 128, and a periodic clean-up pass has been added. Pre-built binaries are provided for multiple operating systems and hardware targets.

✨ New Features

  • Implemented LRU-based eviction strategy for CUDA graphs.
  • Increased the limit for CUDA graphs to 128.
  • Introduced periodic clean-up mechanism for CUDA graphs.
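
The three changes above can be pictured with a small, generic LRU cache. This is only an illustrative sketch and not the llama.cpp implementation: `GraphEntry`, `GraphLruCache`, the integer key, and the tick-based idle threshold are all hypothetical names and assumptions. Only the default capacity of 128 comes from the release notes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a cached CUDA graph. A real entry would own a
// graph handle and release it when the entry is evicted.
struct GraphEntry {
    int           payload;   // placeholder for the captured graph
    std::uint64_t last_used; // logical tick of the most recent access
};

class GraphLruCache {
public:
    // 128 mirrors the limit stated in the release notes.
    explicit GraphLruCache(std::size_t capacity = 128) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Returns the entry for `key`, or nullptr on a miss.
    // A hit moves the entry to the front (most recently used).
    GraphEntry * get(std::uint64_t key) {
        auto it = index_.find(key);
        if (it == index_.end()) {
            return nullptr;
        }
        order_.splice(order_.begin(), order_, it->second);
        it->second->second.last_used = ++tick_;
        return &it->second->second;
    }

    // Inserts or updates an entry; when full, evicts the least recently used one.
    void put(std::uint64_t key, int payload) {
        if (GraphEntry * e = get(key)) {
            e->payload = payload;
            return;
        }
        if (order_.size() >= capacity_) {
            index_.erase(order_.back().first);
            order_.pop_back();
        }
        order_.emplace_front(key, GraphEntry{payload, ++tick_});
        index_[key] = order_.begin();
    }

    // Periodic clean-up (assumed policy): drop entries idle for more than
    // `max_idle` ticks. The list is ordered by recency, so scan from the back.
    std::size_t cleanup(std::uint64_t max_idle) {
        std::size_t removed = 0;
        while (!order_.empty() && tick_ - order_.back().second.last_used > max_idle) {
            index_.erase(order_.back().first);
            order_.pop_back();
            ++removed;
        }
        return removed;
    }

    std::size_t size() const { return order_.size(); }

private:
    using Item = std::pair<std::uint64_t, GraphEntry>;

    std::size_t   capacity_;
    std::uint64_t tick_ = 0;
    std::list<Item> order_; // front = most recently used
    std::unordered_map<std::uint64_t, std::list<Item>::iterator> index_;
};
```

Pairing a linked list with a hash map keeps lookup, insertion, and eviction at constant time on average. A caller would typically invoke `get` before each graph launch, `put` after capturing a new graph, and `cleanup` at some fixed interval of its own choosing.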

Affected Symbols