b7618
📦 llama-cpp
🐛 2 fixes · 🔧 4 symbols
Summary
This release focuses on critical bug fixes for the CUDA backend: it corrects the memory-pool allocation logic used by fattn-common and adds integer casts in argsort to guard against overflow.
Migration Steps
- Update llama.cpp to version b7618
- If using Windows with CUDA, make sure the matching CUDA 12.4 or 13.1 runtime DLLs are also updated when swapping in the new binaries
🐛 Bug Fixes
- Fixed CUDA pool allocations where the object byte size was passed instead of the object count for the dst_tmp_meta buffer in fattn-common (see the first sketch below)
- Fixed a potential integer overflow in argsort by explicitly casting allocation counts before they are multiplied (see the second sketch below)
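The first fix concerns count-based pool allocators. Below is a minimal C++ sketch of the bug pattern; the `pool_alloc` helper and `dst_meta` struct are hypothetical stand-ins for the actual ggml-cuda pool API, which similarly takes an element count and applies `sizeof` internally.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical count-based pool allocator standing in for the real
// ggml-cuda pool helper: it computes the byte size internally as
// n_elements * sizeof(T), so callers must pass an element COUNT.
template <typename T>
T * pool_alloc(std::vector<std::byte> & pool, size_t n_elements) {
    pool.resize(n_elements * sizeof(T)); // sizeof applied exactly once, here
    return reinterpret_cast<T *>(pool.data());
}

struct dst_meta { uint32_t vals[4]; }; // hypothetical stand-in for dst_tmp_meta entries

int main() {
    std::vector<std::byte> pool;
    const size_t n_blocks = 1024;

    // Bug pattern: passing a byte size where a count is expected, which
    // over-allocates by a factor of sizeof(dst_meta):
    //   dst_meta * meta = pool_alloc<dst_meta>(pool, n_blocks * sizeof(dst_meta));

    // Fixed pattern: pass the element count and let the allocator scale it.
    dst_meta * meta = pool_alloc<dst_meta>(pool, n_blocks);
    (void) meta;
    return 0;
}
```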
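The second fix is the classic widen-before-multiply rule: when two 32-bit counts are multiplied to size an allocation, the product is computed in 32 bits and can overflow before it ever reaches the 64-bit size parameter. A minimal sketch, with illustrative values rather than the actual argsort dimensions:

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    // Counts that each fit comfortably in a 32-bit int...
    const int nrows        = 70000;
    const int ncols_padded = 70000;

    // Bug pattern: the product is evaluated in 32-bit int and overflows
    // before being widened to the allocator's 64-bit size type:
    //   alloc(nrows * ncols_padded); // signed overflow, undefined behavior

    // Fixed pattern: cast one operand first so the multiplication itself
    // happens in 64 bits.
    const int64_t n_elements = (int64_t) nrows * ncols_padded;
    std::printf("n_elements = %lld\n", (long long) n_elements);
    return 0;
}
```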
🔧 Affected Symbols
- ggml-cuda
- fattn-common
- dst_tmp_meta
- argsort