b8082
📦 llama-cpp
✨ 1 feature · 🔧 2 symbols
Summary
This release enables CUDA graphs for MMID operations when the batch size is between 1 and 4 (inclusive). It also ships pre-compiled binaries for multiple operating systems and hardware configurations.
✨ New Features
- Enabled CUDA graphs for MMID when batch size (BS) is between 1 and 4 (inclusive).
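
As a rough illustration of the condition described above, the batch-size gate can be sketched as a simple range check. This is a minimal sketch and not the actual llama.cpp code: the helper name `mmid_batch_allows_cuda_graph` and the constants `kMinGraphBatch` and `kMaxGraphBatch` are hypothetical, and the real implementation may check further conditions that this release note does not mention.

```cpp
#include <cstdint>

// Bounds taken from the release note: CUDA graphs are enabled for MMID
// when 1 <= batch size <= 4. The constant names are hypothetical.
constexpr std::int64_t kMinGraphBatch = 1;
constexpr std::int64_t kMaxGraphBatch = 4;

// Hypothetical helper (not a llama.cpp function): returns true when the
// batch size falls inside the inclusive range named in the release note.
constexpr bool mmid_batch_allows_cuda_graph(std::int64_t batch_size) {
    return batch_size >= kMinGraphBatch && batch_size <= kMaxGraphBatch;
}

// Compile-time checks of the boundaries of the inclusive range.
static_assert(mmid_batch_allows_cuda_graph(1), "lower bound is included");
static_assert(mmid_batch_allows_cuda_graph(4), "upper bound is included");
static_assert(!mmid_batch_allows_cuda_graph(0), "below the range");
static_assert(!mmid_batch_allows_cuda_graph(5), "above the range");
```

Under these assumptions, a caller would consult the helper once per MMID node and fall back to regular kernel launches whenever it returns false.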