Change8

b8929

📦 llama-cpp
🔧 1 symbol

Summary

The default quantization type in llama-quant has changed from Q5_1 to Q8_0, a higher-precision 8-bit format, for improved stability when default parameters are used. Pre-compiled binaries for multiple operating systems and hardware configurations are also provided.

Migration Steps

  1. If you rely on the default quantization parameters in `llama_model_quantize_params`, note that the default `ftype` has changed from `LLAMA_FTYPE_MOSTLY_Q5_1` to `LLAMA_FTYPE_MOSTLY_Q8_0`. Set `ftype` explicitly if you need to keep producing Q5_1 output.
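The migration step above can be sketched as follows: start from the default parameters, then pin `ftype` so the output no longer depends on which default the library ships. This is a minimal illustration, not the project's recommended code. The helpers `quantize_params_with_ftype` and `legacy_q5_1_params` are hypothetical names introduced here. When `llama.h` is not on the include path, the sketch falls back to reduced stand-in declarations (placeholder enum values, a two-field struct) that only mirror the default described in this release.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__has_include) && __has_include("llama.h")
#include "llama.h"  // real llama.cpp declarations, if available
#else
// Stand-ins so the sketch compiles without llama.cpp (assumption: reduced
// to the fields used here; enum values are placeholders for illustration).
enum llama_ftype {
    LLAMA_FTYPE_MOSTLY_Q8_0 = 7,
    LLAMA_FTYPE_MOSTLY_Q5_1 = 9,
};
struct llama_model_quantize_params {
    int32_t          nthread;
    enum llama_ftype ftype;
};
// Mirrors the new default described in this release (Q8_0).
inline llama_model_quantize_params llama_model_quantize_default_params() {
    return {0, LLAMA_FTYPE_MOSTLY_Q8_0};
}
#endif

// Hypothetical helper: take the library defaults and pin the quantization
// type, so a future change of the default cannot alter the output format.
inline llama_model_quantize_params quantize_params_with_ftype(llama_ftype ftype) {
    llama_model_quantize_params params = llama_model_quantize_default_params();
    params.ftype = ftype;
    return params;
}

// Hypothetical call site: keep the pre-change behaviour (Q5_1 output).
inline llama_model_quantize_params legacy_q5_1_params() {
    llama_model_quantize_params params =
        quantize_params_with_ftype(LLAMA_FTYPE_MOSTLY_Q5_1);
    assert(params.ftype == LLAMA_FTYPE_MOSTLY_Q5_1);  // pin is preserved
    return params;
}
```

With the real library, the resulting struct would typically be passed by pointer to `llama_model_quantize` along with the input and output file paths. Callers that are happy with the new default need no change.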

Affected Symbols