b7618

📦 llama-cpp
🐛 2 fixes · 🔧 4 symbols

Summary

This release focuses on critical bug fixes for the CUDA backend, correcting memory pool allocation sizing and an integer overflow in argsort to prevent computation errors.

Migration Steps

  1. Update llama.cpp to version b7618
  2. If using Windows with CUDA, also update the corresponding CUDA 12.4 or 13.1 DLLs where necessary

🐛 Bug Fixes

  • Fixed CUDA pool allocations for fattn-common and dst_tmp_meta, where an object byte size was passed to an allocator that expects an object count (sketched after this list)
  • Fixed a potential integer overflow in argsort by explicitly casting allocation counts to a wider type before multiplication (see the second sketch below)
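
The first fix is easiest to see with a typed pool allocator whose alloc() takes an element count and scales by sizeof(T) internally. The sketch below is illustrative only; pool_alloc, raw_pool_alloc, meta_t, and reserve_meta are hypothetical stand-ins, not the actual ggml-cuda API.

```cpp
#include <cstdlib>
#include <cstddef>

// Hypothetical stand-in for a device memory pool (not the real ggml-cuda pool).
static void * raw_pool_alloc(size_t nbytes) { return std::malloc(nbytes); }

struct meta_t { int begin; int end; };  // placeholder payload type

// A typed allocator: alloc() expects an *element count*, not a byte size,
// because it multiplies by sizeof(T) itself.
template <typename T>
struct pool_alloc {
    T * ptr = nullptr;
    T * alloc(size_t nelems) {
        ptr = static_cast<T *>(raw_pool_alloc(nelems * sizeof(T)));
        return ptr;
    }
};

void reserve_meta(pool_alloc<meta_t> & dst_tmp_meta, size_t n_objects) {
    // Buggy pattern of the kind fixed here: passing a byte size where an
    // element count is expected, over-allocating by a factor of sizeof(T).
    // dst_tmp_meta.alloc(n_objects * sizeof(meta_t));

    // Correct: pass the object count; the allocator applies sizeof(T).
    dst_tmp_meta.alloc(n_objects);
}
```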
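The argsort fix addresses a classic sizing overflow: multiplying two 32-bit ints overflows before the result is widened. A minimal sketch of the pattern, with hypothetical names:

```cpp
#include <cstddef>

// Illustrative only: computing an allocation count from two int dimensions.
size_t alloc_count(int ncols_pad, int nrows) {
    // Buggy: the multiply happens in 32-bit int arithmetic and can overflow
    // before the implicit conversion to size_t.
    // return ncols_pad * nrows;

    // Fixed pattern: cast the operands first so the multiplication is
    // carried out in the wider type.
    return (size_t) ncols_pad * (size_t) nrows;
}
```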

🔧 Affected Symbols

  • ggml-cuda
  • fattn-common
  • dst_tmp_meta
  • argsort