b9089
📦 llama-cpp
Summary
This release reduces allocation overhead in the SYCL flash attention path and refactors the related SYCL implementation files.
✨ New Features
- Reduced allocation overhead during SYCL flash attention operations.