b9319
📦 llama-cpp
✨ 3 features · 🐛 7 fixes · 🔧 5 symbols
Summary
This release introduces new GGUF initialization functions (`gguf_init_from_callback`, `gguf_init_from_buffer`) and resolves several memory management and offset calculation bugs within the GGUF reader implementation.
Migration Steps
- If you implement a GGUF reader callback, update it to the new `gguf_reader_callback_t` types: `output` is now `void *`, and `max_expected_size` and the offsets are now `uint64_t`.
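As a rough illustration of the type changes, here is a minimal sketch of a callback that serves bytes from an in-memory source. The typedef `example_reader_callback_t`, the struct `example_mem_source`, the parameter order, and the return type are all assumptions made for this example. The real `gguf_reader_callback_t` signature is not given in these notes and may differ, so check `gguf.h` in your ggml checkout before adapting it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* HYPOTHETICAL callback shape, for illustration only. It mirrors the types
 * named in the release notes (void * output, uint64_t sizes and offsets),
 * but the real gguf_reader_callback_t may differ in every other respect. */
typedef uint64_t (*example_reader_callback_t)(
    void * userdata, uint64_t offset, void * output, uint64_t max_expected_size);

/* Hypothetical in-memory data source passed through userdata. */
struct example_mem_source {
    const uint8_t * data;
    uint64_t        size;
};

/* Copies up to max_expected_size bytes starting at offset into output.
 * Returns the number of bytes copied, or 0 if offset is at or past the end. */
static uint64_t example_mem_read(void * userdata, uint64_t offset,
                                 void * output, uint64_t max_expected_size) {
    const struct example_mem_source * src = userdata;
    if (offset >= src->size) {
        return 0;
    }
    const uint64_t avail = src->size - offset;
    const uint64_t n     = max_expected_size < avail ? max_expected_size : avail;
    memcpy(output, src->data + offset, (size_t) n);
    return n;
}
```

In this sketch the offset is passed in explicitly on every call, so the callback keeps no read position of its own. Whether the real interface works this way is an assumption.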
✨ New Features
- Added ggml function `gguf_init_from_callback` for initializing GGUF from a callback.
- Added ggml function `gguf_init_from_buffer` for initializing GGUF from a buffer.
- Setting `max_chunk_read` to `0` now means `SIZE_MAX` in GGUF loading.
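The `max_chunk_read` convention can be sketched as a small normalization step. The helper names `example_effective_chunk` and `example_num_chunks` are hypothetical and are not part of ggml. The sketch only shows how a "zero means `SIZE_MAX`" setting might be interpreted, not how the loader actually implements it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: maps the "0 means SIZE_MAX" convention to an
 * effective per-read limit. */
static size_t example_effective_chunk(size_t max_chunk_read) {
    return max_chunk_read == 0 ? SIZE_MAX : max_chunk_read;
}

/* Hypothetical helper: number of reads needed to consume `total` bytes
 * under the given max_chunk_read setting. */
static size_t example_num_chunks(size_t total, size_t max_chunk_read) {
    const size_t chunk = example_effective_chunk(max_chunk_read);
    if (total == 0) {
        return 0;
    }
    /* Ceiling division written as (total - 1) / chunk + 1 so that the
     * intermediate value cannot overflow. */
    return (total - 1) / chunk + 1;
}
```

Under these assumptions, a setting of `0` makes any non-empty payload fit in a single read, while a non-zero value splits it into fixed-size chunks.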
🐛 Bug Fixes
- Made the memory breakdown for a model loaded with `no_alloc` from a file consistent with the breakdown when the model is loaded from a buffer.
- Removed `total_size` from `gguf_reader`.
- Renamed `offset` to `data_offset` in file offset calculation.
- Fixed the incorrect `output` type of `gguf_reader_callback_t`: it is now `void *`, and `max_expected_size` and the offsets are now `uint64_t`.
- Hardened against offset overflow during buffer read operations.
- Removed seek behavior from the callback interface.
- Fixed a seeking issue when loading a GGUF file that contains no tensors.
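The offset-overflow hardening mentioned above concerns a common pitfall with 64-bit offsets: a naive `offset + len <= size` check can wrap around and wrongly pass. The following is a minimal sketch of an overflow-safe bounds check. The helper `example_range_in_bounds` is hypothetical and is not taken from the ggml source, which may guard against overflow differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when the byte range
 * [offset, offset + len) lies entirely within a buffer of buf_size bytes.
 * The sum offset + len is never computed, because it can wrap in uint64_t. */
static bool example_range_in_bounds(uint64_t buf_size, uint64_t offset, uint64_t len) {
    if (offset > buf_size) {
        return false;
    }
    /* Safe: buf_size - offset cannot underflow after the check above. */
    return len <= buf_size - offset;
}
```

For example, with `buf_size = 100`, `offset = UINT64_MAX`, and `len = 2`, the naive sum wraps to `1` and would pass a `<= 100` test, whereas the subtraction-based form rejects the range.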