b8814

📅 Apr 16, 2026 · 📦 llama-cpp

✨ 3 features · 🔧 1 symbol

Summary

This release adds 128-bit RVV (RISC-V Vector extension) implementations of the quantized vector dot product for several quantization types in ggml-cpu, aimed at faster CPU inference on RISC-V hardware. It also ships pre-built binaries for multiple platforms.

Migration Steps

No user-facing migration steps are listed for this release. Internally, the ggml-cpu code was refactored and RVV checks were added.

✨ New Features

Added a 128-bit RVV implementation of the quantized vector dot product in ggml-cpu.
Added 128-bit implementations for i-quants and ternary quants in ggml-cpu.
Added 128-bit implementations for iq2_xs, iq3_s, iq3_xxs, and tq2_0 quant types in ggml-cpu.
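For readers unfamiliar with the operation being vectorized, the following is a minimal scalar sketch of a block-quantized dot product. It is not ggml's code: the `block_q8_sketch` layout, the `QK` block size, and the function name `vec_dot_q8_sketch` are hypothetical and chosen only for illustration. The real i-quant and ternary formats listed above use more compact encodings than plain 8-bit values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QK 32 /* values per block; hypothetical size chosen for illustration */

/* Hypothetical simplified block: one float scale plus QK signed 8-bit values.
   This is NOT ggml's actual memory layout. */
typedef struct {
    float  d;      /* per-block scale */
    int8_t qs[QK]; /* quantized values */
} block_q8_sketch;

/* Scalar reference: dot product of two quantized rows holding n values each.
   n must be a multiple of QK. */
static float vec_dot_q8_sketch(size_t n,
                               const block_q8_sketch *x,
                               const block_q8_sketch *y) {
    assert(n % QK == 0);
    float sum = 0.0f;
    for (size_t b = 0; b < n / QK; ++b) {
        /* Integer multiply-accumulate within the block. */
        int32_t isum = 0;
        for (int i = 0; i < QK; ++i) {
            isum += (int32_t)x[b].qs[i] * (int32_t)y[b].qs[i];
        }
        /* Apply both block scales once per block. */
        sum += x[b].d * y[b].d * (float)isum;
    }
    return sum;
}
```

A vector implementation would typically replace the inner loop with vector loads and widening multiply-accumulate instructions. With 128-bit registers, one register holds 16 of the 8-bit values, so a 32-value block like the one assumed here would span two registers.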

Affected Symbols

ggml-cpu