b8116

📦 llama-cpp
3 features, 🐛 4 fixes, 🔧 4 symbols

Summary

This release introduces a `--dry-run` option for `llama-quantize` and refines internal tensor dimension handling and quantization logic, including new checks related to imatrix usage.

Migration Steps

  1. If you parse or otherwise depend on the previous tensor dimension representation, update that code: dimensions are now represented using 6 characters.
  2. Review any code that depends on the quantization logic to account for the new `tensor_requires_imatrix` function and its associated warnings.

✨ New Features

  • Added a `--dry-run` option to `llama-quantize` for testing quantization without writing files.
  • Tensor dimensions are now represented using 6 characters.
  • Added a courtesy warning about imatrix usage when the new function `tensor_requires_imatrix` is relevant.
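
The 6-character tensor dimension representation listed above can be pictured with a short sketch. It assumes the change refers to the field width used when printing each dimension, which these notes do not confirm. `format_dims` is a hypothetical helper written for illustration and is not the llama.cpp implementation.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (assumption, not llama.cpp code): right-aligns each
// dimension in a field of `width` characters and joins the fields with ", ".
// Values wider than `width` are printed in full rather than truncated.
// `width` is expected to be small (well under the 64-byte scratch buffer).
std::string format_dims(const std::vector<int64_t> & dims, int width) {
    std::string out;
    char buf[64];
    for (std::size_t i = 0; i < dims.size(); ++i) {
        // "%*" takes the field width from the argument list.
        std::snprintf(buf, sizeof(buf), "%*" PRId64, width, dims[i]);
        if (i > 0) {
            out += ", ";
        }
        out += buf;
    }
    return out;
}
```

With a width of 6, a six-digit dimension such as 128256 fills its field exactly, so columns stay aligned across tensors. Scripts that parse such output by fixed column offsets are the kind of consumer the migration note is aimed at, under the same assumption.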

🐛 Bug Fixes

  • Fixed an issue where `ggml_nbytes` was unnecessarily recalculated for tensors.
  • Fixed an indentation issue.
  • Corrected a logic error.
  • Fixed an issue related to `tensor_requires_imatrix`.

Affected Symbols