b9536
📦 llama-cpp
✨ 5 features · 🔧 4 symbols
Summary
This release focuses on performance optimizations in the OpenCL backend, covering the get_rows, cpy, concat, and flat q6_K gemv operations. Several platform-specific builds were disabled.
✨ New Features
- OpenCL: Improved performance for the get_rows, cpy, concat, and flat q6_K gemv operations.
- OpenCL: Added support for using multiple workgroups on large rows.
- OpenCL: Improved performance for small copy operations (cpy).
- OpenCL: Implemented packed concatenation for small inputs.
- OpenCL: Tweaked flat q6_K gemv by increasing N_DST and remapping threads.
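
To make the "multiple workgroups for large rows" item more concrete, here is a minimal CPU-side sketch of the general idea, not the actual OpenCL kernel. It assumes a get_rows-style gather in which each row is split into fixed-size chunks and each chunk is handled by its own workgroup. The function name `get_rows_chunked` and the `chunk` parameter are hypothetical and do not come from llama.cpp.

```python
# Hypothetical illustration only: none of these names come from llama.cpp.
def get_rows_chunked(src, indices, chunk):
    """Gather rows of `src` by `indices`, splitting each row into chunks.

    Each (row, chunk) pair stands in for one GPU workgroup, so a long row
    is covered by several workgroups instead of a single one.
    """
    n_cols = len(src[0])
    groups_per_row = (n_cols + chunk - 1) // chunk  # ceiling division
    dst = [[0] * n_cols for _ in indices]
    for out_row, src_row in enumerate(indices):
        for g in range(groups_per_row):           # one "workgroup" per chunk
            start = g * chunk
            end = min(start + chunk, n_cols)      # the last chunk may be short
            dst[out_row][start:end] = src[src_row][start:end]
    return dst, groups_per_row * len(indices)


# A 4x10 source matrix; gather rows 2 and 0 using chunks of 4 columns.
src = [[r * 10 + c for c in range(10)] for r in range(4)]
rows, n_groups = get_rows_chunked(src, [2, 0], chunk=4)
```

With one workgroup per row, a very wide row is processed serially by a single group. Splitting it into chunks lets more of the device work on the same row, which is presumably the motivation behind the change, although the release notes do not spell out the details.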