b8116
📦 llama-cpp
✨ 3 features · 🐛 4 fixes · 🔧 4 symbols
Summary
This release adds a `--dry-run` option to `llama-quantize` and refines internal tensor-dimension handling and quantization logic, including new checks related to imatrix usage.
Migration Steps
- If you parse tensor-dimension output (e.g. in scripts that scrape quantization logs), note that dimensions are now formatted using 6 characters.
- Review quantization workflows to account for the new `tensor_requires_imatrix` function and the warnings it can emit.
✨ New Features
- Added a `--dry-run` option to `llama-quantize` for previewing a quantization run without writing output files.
- Tensor dimensions are now formatted using 6 characters in quantization output.
- Added a courtesy warning about imatrix usage for tensors that the new `tensor_requires_imatrix` function flags as needing one.
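A typical use of the new flag is to validate a quantization plan before committing disk space. The invocation below is a sketch: the model filenames are placeholders, and the exact per-tensor output format depends on your build.

```sh
# Preview quantizing an F16 GGUF model to Q4_K_M without writing the output file.
# (Filenames here are hypothetical examples.)
./llama-quantize --dry-run model-f16.gguf model-q4_k_m.gguf Q4_K_M
```

This lets you inspect the planned per-tensor quantization types, and any imatrix warnings, before running the real conversion.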
🐛 Bug Fixes
- Fixed unnecessary recalculation of `ggml_nbytes` for tensors.
- Fixed an indentation issue.
- Corrected a logic error.
- Fixed an issue related to `tensor_requires_imatrix`.