
b8183

📦 llama-cpp
🐛 1 fix · 🔧 1 symbol

Summary

This release fixes the CUDA non-contiguous dequantize/convert kernels by capping grid.y at 65535, the maximum size CUDA allows for the y dimension of a launch grid. It also provides updated pre-built binaries for numerous platforms.

🐛 Bug Fixes

  • Capped grid.y at 65535 in non-contiguous dequantize/convert kernels on CUDA.
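
The release notes do not say how the kernels reach rows beyond the cap, so the following is only a minimal sketch of one common approach: clamp grid.y to 65535 and let each y-block stride over the remaining rows. It is a host-side C++ simulation rather than the actual llama.cpp kernel code, and the names `kMaxGridY`, `capped_grid_y`, and `simulate_row_coverage` are hypothetical, introduced here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CUDA limits gridDim.y (and gridDim.z) to 65535 blocks.
constexpr int64_t kMaxGridY = 65535;

// Hypothetical helper: pick a launch-legal grid.y for `rows` rows.
inline int64_t capped_grid_y(int64_t rows) {
    assert(rows > 0);
    return std::min(rows, kMaxGridY);
}

// Host-side simulation of the loop a kernel could run per y-block
// (an assumed pattern, not taken from the release):
//   for (row = blockIdx.y; row < rows; row += gridDim.y) { ... }
// Returns how many times each row was visited.
inline std::vector<int> simulate_row_coverage(int64_t rows) {
    const int64_t grid_y = capped_grid_y(rows);
    std::vector<int> visits(static_cast<std::size_t>(rows), 0);
    for (int64_t block_y = 0; block_y < grid_y; ++block_y) {
        // Each simulated block handles rows block_y, block_y + grid_y, ...
        for (int64_t row = block_y; row < rows; row += grid_y) {
            ++visits[static_cast<std::size_t>(row)];
        }
    }
    return visits;
}
```

With this pattern, a tensor with more than 65535 rows still launches with a valid grid, and every row is processed exactly once because the blocks partition the rows by their index modulo grid.y.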

Affected Symbols