
b9319

📦 llama-cpp
3 features, 🐛 7 fixes, 🔧 5 symbols

Summary

This release introduces new GGUF initialization functions (`gguf_init_from_callback`, `gguf_init_from_buffer`) and resolves several memory management and offset calculation bugs within the GGUF reader implementation.

Migration Steps

  1. If you implement a `gguf_reader_callback_t`, update it to match the new interface: the `output` type is now `void *`, and `max_expected_size` and the offsets are now `uint64_t`.

✨ New Features

  • Added ggml function `gguf_init_from_callback` for initializing GGUF from a callback.
  • Added ggml function `gguf_init_from_buffer` for initializing GGUF from a buffer.
  • Setting `max_chunk_read` to `0` is now treated as `SIZE_MAX` in GGUF loading.
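
The `max_chunk_read` rule above can be pictured with a small sketch. This is not ggml code: the helper names `effective_chunk_limit` and `next_chunk_size` are hypothetical and exist only to illustrate how a zero value might be normalized to `SIZE_MAX` before a loader decides how many bytes to read next.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of ggml: maps the caller-facing
 * max_chunk_read value to the limit applied to each read.
 * A value of 0 is interpreted as SIZE_MAX. */
size_t effective_chunk_limit(size_t max_chunk_read) {
    return max_chunk_read == 0 ? SIZE_MAX : max_chunk_read;
}

/* Hypothetical helper: size of the next read, given how many bytes
 * are still outstanding and the configured chunk limit. */
size_t next_chunk_size(size_t remaining, size_t max_chunk_read) {
    const size_t limit = effective_chunk_limit(max_chunk_read);
    return remaining < limit ? remaining : limit;
}
```

Under this reading, passing `0` means every read is bounded only by the number of bytes remaining, while any non-zero value caps each read at that size.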

🐛 Bug Fixes

  • Ensured memory breakdown for a model loaded with `no_alloc` from a file is consistent with being loaded from a buffer.
  • Removed `total_size` from `gguf_reader`.
  • Renamed `offset` to `data_offset` in file offset calculation.
  • Fixed the incorrect `output` type of `gguf_reader_callback_t`: it is now `void *`, and `max_expected_size` and the offsets are now `uint64_t`.
  • Hardened against offset overflow during buffer read operations.
  • Removed seek behavior from the callback interface.
  • Fixed seeking issue when loading a GGUF file with no tensors.
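
To illustrate what hardening a buffer read against offset overflow can look like, here is a minimal sketch. It is not the ggml implementation: `struct mem_source` and `mem_source_read` are hypothetical names, and the signature only borrows the `void *` output and `uint64_t` offset and length types mentioned in the fixes above. The key point is that the bounds check is written as a subtraction, so `offset + len` is never computed and cannot wrap around.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory data source; not a ggml type. */
struct mem_source {
    const uint8_t *data;
    uint64_t       size;
};

/* Copies `len` bytes starting at `offset` into `output`.
 * Returns false instead of reading out of bounds. */
bool mem_source_read(const struct mem_source *src, uint64_t offset,
                     void *output, uint64_t len) {
    /* Reject offsets past the end first, so the subtraction below
     * cannot underflow. */
    if (offset > src->size) {
        return false;
    }
    /* Compare against the remaining bytes rather than testing
     * offset + len <= size, which could wrap for a huge len. */
    if (len > src->size - offset) {
        return false;
    }
    memcpy(output, src->data + offset, (size_t) len);
    return true;
}
```

With a 4-byte buffer, a request for `UINT64_MAX` bytes at offset 2 is rejected here, whereas a naive `offset + len <= size` test would wrap to 1 and wrongly accept it.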

Affected Symbols