b7609
📦 llama-cpp
🐛 4 fixes · 🔧 3 symbols
Summary
This release fixes a critical bug in the CUDA backend that caused assertion failures when copying large tensors. It improves stability for large models by moving internal dimensions and byte counts to 64-bit integers.
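To illustrate the class of bug being fixed, here is a minimal sketch (not the actual ggml code): once a tensor's byte count exceeds INT_MAX, a 32-bit integer can no longer represent it, while a 64-bit count stays exact. The shape below is hypothetical, chosen only to exceed the 32-bit limit.

```cpp
#include <climits>
#include <cstdint>
#include <cstdio>

int main() {
    // Hypothetical tensor: 3e9 one-byte elements, more than INT_MAX bytes.
    const int64_t ne0 = 50000, ne1 = 60000, type_size = 1;

    // 64-bit arithmetic keeps the byte count exact.
    const int64_t nbytes = ne0 * ne1 * type_size;  // 3000000000

    // Storing the same count in a 32-bit int truncates it
    // (implementation-defined), yielding a bogus negative value here.
    const int nbytes32 = (int)nbytes;

    printf("64-bit byte count: %lld\n", (long long)nbytes);
    printf("32-bit byte count: %d (INT_MAX = %d)\n", nbytes32, INT_MAX);
    return 0;
}
```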
Migration Steps
- Update to version b7609 or later to resolve the CUDA memory-copy assertion failures with large models
🐛 Bug Fixes
- Fixed an assertion failure in ggml_cuda_cpy when copying large tensors exceeding INT_MAX bytes
- Widened internal dimension and byte-count types in ggml-cuda to int64_t to prevent integer overflow
- Added safety asserts on computed CUDA block counts (see the sketch after this list)
- Refined the conditions on the y and z grid dimensions in CUDA kernels
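A rough sketch of what such a block-count assert can look like (the helper name check_grid_dims is hypothetical; the actual ggml-cuda code differs). CUDA caps the y and z grid dimensions at 65535 blocks, while the x dimension can be far larger, so a pre-launch check guards exactly those two axes:

```cpp
#include <cassert>
#include <cstdint>

// CUDA limits gridDim.y and gridDim.z to 65535 blocks; gridDim.x may
// be much larger (up to 2^31 - 1 on modern devices).
constexpr int64_t MAX_GRID_DIM_YZ = 65535;

// Hypothetical pre-launch check, not the actual ggml-cuda helper.
inline void check_grid_dims(int64_t bx, int64_t by, int64_t bz) {
    assert(bx > 0 && by > 0 && bz > 0);
    assert(by <= MAX_GRID_DIM_YZ && "grid y dimension exceeds CUDA limit");
    assert(bz <= MAX_GRID_DIM_YZ && "grid z dimension exceeds CUDA limit");
}

int main() {
    check_grid_dims(1 << 20, 4096, 1);  // fine: large x, small y and z
    // check_grid_dims(1, 1 << 20, 1);  // would trip the y-dimension assert
    return 0;
}
```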
🔧 Affected Symbols
- ggml_cuda_cpy
- ggml_nbytes
- ggml-cuda