b8814

📦 llama-cpp
3 features · 🔧 1 symbol

Summary

This release improves ggml-cpu performance on RISC-V by adding 128-bit RVV (RISC-V Vector extension) implementations of the quantized vector dot product for several quantization types. It also ships pre-built binaries for numerous platforms.

Migration Steps

  1. Refactored ggml-cpu code and added RVV checks.

✨ New Features

  • Added a 128-bit RVV implementation of the quantized vector dot product in ggml-cpu.
  • Added 128-bit implementations for i-quants and ternary quants in ggml-cpu.
  • Added 128-bit implementations for the iq2_xs, iq3_s, iq3_xxs, and tq2_0 quant types in ggml-cpu.
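
For readers unfamiliar with the operation being vectorized, here is a minimal scalar sketch of a block-quantized vector dot product. It is not the ggml-cpu code: the struct `block_q8_sketch`, the function `vec_dot_q8_sketch`, and the block length of 32 are hypothetical names and values chosen for illustration, and the real i-quant and ternary formats pack their values differently. The point is only the shape of the computation: an integer dot product per block, scaled by the two block scales.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block length, chosen for illustration only. */
#define BLOCK_SIZE 32

/* Hypothetical simplified block: one float scale plus 32 signed 8-bit quants.
 * This is an assumption for the sketch, not a ggml data layout. */
typedef struct {
    float  scale;
    int8_t qs[BLOCK_SIZE];
} block_q8_sketch;

/* Scalar reference: for each block, accumulate the integer dot product of the
 * quantized values, then multiply by both block scales and add to the total. */
static float vec_dot_q8_sketch(size_t nblocks,
                               const block_q8_sketch *x,
                               const block_q8_sketch *y) {
    float sum = 0.0f;
    for (size_t b = 0; b < nblocks; ++b) {
        int32_t acc = 0;
        for (int i = 0; i < BLOCK_SIZE; ++i) {
            acc += (int32_t)x[b].qs[i] * (int32_t)y[b].qs[i];
        }
        sum += x[b].scale * y[b].scale * (float)acc;
    }
    return sum;
}
```

A 128-bit vector register holds 16 such 8-bit values, so a vectorized version of the inner loop would presumably process each 32-element block in two 16-lane steps with widening multiply-accumulate, rather than one element at a time.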

Affected Symbols