b9498

📅 Jun 4, 2026 · 📦 llama-cpp

✨ 4 features · 🐛 1 fix · 🔧 1 symbol

Summary

This release extends RVV (RISC-V Vector extension) quantization support in ggml-cpu, adding vector dot implementations for higher VLENs (512-bit and 1024-bit) across several quantization schemes. It also provides updated pre-built binaries for numerous platforms.

✨ New Features

Extended the RVV quantized vector dot implementations in ggml-cpu to support higher VLENs.
Added RVV 512-bit and 1024-bit implementations for iq4_xs quantization in ggml-cpu.
Added RVV 512-bit and 1024-bit implementations for q6_K and i-quants in ggml-cpu.
Added RVV 512-bit and 1024-bit implementations for tq3_s, iq3_xxs, iq2_s, iq2_xs, and iq2_xxs in ggml-cpu.
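
For readers unfamiliar with VLEN-specific kernels, the following is a minimal sketch of how code might pick an implementation tier from the hardware vector length. It is not taken from ggml-cpu: the helper names `rvv_vlen_bits` and `select_kernel_tier` are hypothetical, and the tier boundaries (128, 256, 512, 1024) are assumed from the bit widths named in these notes. The only real API used is the RVV intrinsic `__riscv_vlenb()`, which returns VLEN in bytes; on non-RVV targets the sketch falls back to 0.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__riscv_v_intrinsic)
#include <riscv_vector.h>
#endif

/* Hypothetical helper: hardware VLEN in bits, or 0 when RVV
 * intrinsics are not available on the build target. */
static inline unsigned rvv_vlen_bits(void) {
#if defined(__riscv_v_intrinsic)
    return (unsigned)(__riscv_vlenb() * 8); /* vlenb is VLEN in bytes */
#else
    return 0;
#endif
}

/* Hypothetical helper: map a VLEN to the widest kernel tier it can run.
 * Returns 1024, 512, 256, or 128, or 0 to request a scalar fallback.
 * The tiers are an assumption based on the widths listed above. */
static inline unsigned select_kernel_tier(unsigned vlen_bits) {
    if (vlen_bits >= 1024) return 1024;
    if (vlen_bits >= 512)  return 512;
    if (vlen_bits >= 256)  return 256;
    if (vlen_bits >= 128)  return 128;
    return 0;
}
```

A caller would evaluate `select_kernel_tier(rvv_vlen_bits())` once and branch to the matching kernel. Rounding down to the nearest supported tier means a VLEN above 1024 bits would still use the 1024-bit path in this sketch.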

🐛 Bug Fixes

Improved the RVV 256-bit implementation of iq2_xs.

Affected Symbols

ggml-cpu