b9319

📅 May 25, 2026 · 📦 llama-cpp

✨ 3 features · 🐛 7 fixes · 🔧 5 symbols

Summary

This release introduces new GGUF initialization functions (`gguf_init_from_callback`, `gguf_init_from_buffer`) and resolves several memory management and offset calculation bugs within the GGUF reader implementation.

Migration Steps

If you implement a GGUF reader callback, update it to match the new `gguf_reader_callback_t` signature: `output` is now `void *`, and `max_expected_size` and the offsets are now `uint64_t`.

✨ New Features

Added ggml function `gguf_init_from_callback` for initializing GGUF from a callback.
Added ggml function `gguf_init_from_buffer` for initializing GGUF from a buffer.
Setting `max_chunk_read` to `0` in GGUF loading is now treated as `SIZE_MAX`.
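The "zero means `SIZE_MAX`" convention above can be illustrated with a small sketch. This is not the ggml implementation: `effective_chunk_limit` and `next_chunk_size` are hypothetical helpers, and only the parameter name `max_chunk_read` comes from the release notes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of ggml): map the sentinel value 0 to
 * SIZE_MAX, so that "0" behaves as "no practical per-read limit". */
static size_t effective_chunk_limit(size_t max_chunk_read) {
    return max_chunk_read == 0 ? SIZE_MAX : max_chunk_read;
}

/* Hypothetical helper: size of the next read, i.e. the number of bytes
 * still remaining, clipped to the effective chunk limit. */
static size_t next_chunk_size(size_t remaining, size_t max_chunk_read) {
    const size_t limit = effective_chunk_limit(max_chunk_read);
    return remaining < limit ? remaining : limit;
}
```

With this convention, a caller that passes `0` gets each remaining range read in a single chunk, while any nonzero value caps every read at that many bytes.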

🐛 Bug Fixes

Made the memory breakdown for a model loaded from a file with `no_alloc` consistent with that of a model loaded from a buffer.
Removed `total_size` from `gguf_reader`.
Renamed `offset` to `data_offset` in file offset calculation.
Corrected the `output` type of `gguf_reader_callback_t` to `void *`, and changed `max_expected_size` and the offsets to `uint64_t`.
Hardened against offset overflow during buffer read operations.
Removed seek behavior from the callback interface.
Fixed seeking issue when loading a GGUF file with no tensors.
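The overflow hardening for buffer reads mentioned above follows a common pattern, sketched below under stated assumptions. `struct mem_source` and `mem_source_read` are hypothetical names for illustration and are not the ggml API; the sketch only borrows the `void *` output and `uint64_t` offset types named in these notes, and assumes the buffer fits in the process address space.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory data source; not a ggml type. */
struct mem_source {
    const uint8_t *data; /* start of the buffer */
    uint64_t       size; /* buffer length in bytes (assumed <= SIZE_MAX) */
};

/* Copy `len` bytes starting at `offset` into `output`.
 * Returns false if the requested range is not fully inside the buffer.
 * The bounds check uses a subtraction, so `offset + len` is never
 * computed and cannot wrap around for very large inputs. */
static bool mem_source_read(const struct mem_source *src, uint64_t offset,
                            void *output, uint64_t len) {
    if (offset > src->size || len > src->size - offset) {
        return false;
    }
    memcpy(output, src->data + offset, (size_t) len);
    return true;
}
```

A naive check of the form `offset + len <= size` would accept a request such as `offset = 2`, `len = UINT64_MAX`, because the sum wraps around to `1`. The subtraction form rejects it.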

Affected Symbols

`gguf_init_from_callback`, `gguf_init_from_buffer`, `gguf_reader`, `gguf_reader_callback_t`, `GGML_UNUSED`