b8116
📦 llama-cpp
✨ 3 features · 🐛 4 fixes · 🔧 4 symbols
Summary
This release adds a `--dry-run` option to `llama-quantize` and refines internal tensor-dimension handling and quantization logic, including new checks related to imatrix usage.
Migration Steps
- If you parse tensor-dimension output (e.g. in scripts that scrape quantization logs), note that dimensions are now formatted using 6 characters.
- Review quantization workflows to account for the new `tensor_requires_imatrix` function and the warnings it can emit.
✨ New Features
- Added a `--dry-run` option to `llama-quantize` for previewing a quantization run without writing output files.
- Tensor dimensions are now formatted using 6 characters in quantization output.
- Added a courtesy warning about imatrix usage for tensors that the new `tensor_requires_imatrix` function flags as needing one.
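A typical use of the new flag is to validate a quantization plan before committing disk space. The invocation below is a sketch: the model filenames are placeholders, and the exact per-tensor output format depends on your build.

```sh
# Preview quantizing an F16 GGUF model to Q4_K_M without writing the output file.
# (Filenames here are hypothetical examples.)
./llama-quantize --dry-run model-f16.gguf model-q4_k_m.gguf Q4_K_M
```

This lets you inspect the planned per-tensor quantization types, and any imatrix warnings, before running the real conversion.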
🐛 Bug Fixes
- Fixed unnecessary recalculation of `ggml_nbytes` for tensors.
- Fixed an indentation issue.
- Corrected a logic error.
- Fixed an issue related to `tensor_requires_imatrix`.