b9484

📦 llama-cpp
1 feature

Summary

This release optimizes the OpenCL backend by using the flat variants of the q4_K and q6_K gemv kernels when M is very large. Pre-compiled binaries are also provided for multiple platforms.

✨ New Features

  • OpenCL: Use flat variants of q4_K and q6_K gemv for very large M.
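
The size-based selection described above can be pictured as a simple dispatch on M. The following is a minimal sketch only, assuming M denotes the row count of the weight matrix. The enum, the function name, and the threshold value are all hypothetical and are not taken from the llama.cpp source; the release note does not state the actual cutoff.

```c
#include <stddef.h>

/* Hypothetical variant labels; illustrative only, not llama.cpp identifiers. */
typedef enum {
    GEMV_VARIANT_DEFAULT, /* the kernel used for ordinary sizes */
    GEMV_VARIANT_FLAT     /* the flat kernel used for very large M */
} gemv_variant;

/* Assumed cutoff for "very large M"; the real value is not given in the notes. */
#define HYPOTHETICAL_LARGE_M_THRESHOLD ((size_t)4096)

/* Choose a gemv variant from M (assumed: number of rows in the weight matrix). */
static gemv_variant pick_gemv_variant(size_t m) {
    return (m >= HYPOTHETICAL_LARGE_M_THRESHOLD) ? GEMV_VARIANT_FLAT
                                                 : GEMV_VARIANT_DEFAULT;
}
```

Under these assumptions, a call such as `pick_gemv_variant(8192)` selects the flat variant, while smaller matrices keep the default path.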