b8814
📦 llama-cpp
✨ 3 features · 🔧 1 symbol
Summary
This release adds 128-bit RVV (RISC-V Vector extension) implementations of the quantized vector dot product in ggml-cpu for several quantization types, aimed at improving CPU performance on RISC-V hardware. It also ships pre-built binaries for a wide range of platforms.
Migration Steps
- Refactored ggml-cpu code and added RVV checks.
✨ New Features
- Added a 128-bit RVV implementation of the quantized vector dot product in ggml-cpu.
- Added 128-bit implementations for i-quants and ternary quants in ggml-cpu.
- Added 128-bit implementations for iq2_xs, iq3_s, iq3_xxs, and tq2_0 quant types in ggml-cpu.
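For readers unfamiliar with what a quantized vector dot product computes, the following is a minimal scalar sketch of the general idea, not ggml's actual kernels or data layout. It assumes a simplified ternary scheme (weights in {-1, 0, 1} with one scale per block) and int8 activations; the block size, the function names, and the rounding rule are all hypothetical choices made for illustration. The RVV kernels in this release vectorize this kind of per-block integer accumulation, but their bit packing and block formats differ.

```python
BLOCK = 4  # hypothetical block size for illustration; real ggml blocks are larger


def quantize_ternary(xs):
    """Quantize one block of floats to values in {-1, 0, 1} plus a float scale."""
    scale = max(abs(x) for x in xs)
    if scale == 0.0:
        return 0.0, [0] * len(xs)
    return scale, [round(x / scale) for x in xs]


def quantize_int8(xs):
    """Quantize one block of floats to integers in [-127, 127] plus a float scale."""
    amax = max(abs(x) for x in xs)
    if amax == 0.0:
        return 0.0, [0] * len(xs)
    scale = amax / 127.0
    return scale, [round(x / scale) for x in xs]


def to_blocks(xs, quantize):
    """Split a vector into fixed-size blocks and quantize each block."""
    return [quantize(xs[i:i + BLOCK]) for i in range(0, len(xs), BLOCK)]


def quantized_dot(weights, activations):
    """Approximate dot(weights, activations) using per-block integer sums."""
    total = 0.0
    w_blocks = to_blocks(weights, quantize_ternary)
    a_blocks = to_blocks(activations, quantize_int8)
    for (w_scale, w_q), (a_scale, a_q) in zip(w_blocks, a_blocks):
        # Integer multiply-accumulate inside the block: the part SIMD kernels speed up.
        int_sum = sum(w * a for w, a in zip(w_q, a_q))
        # One floating-point rescale per block.
        total += w_scale * a_scale * int_sum
    return total


weights = [2.0, -2.0, 0.0, 2.0, 0.5, 0.5, -0.5, 0.0]
activations = [1.0, -1.0, 0.5, 0.25, 4.0, 2.0, -2.0, 1.0]
approx = quantized_dot(weights, activations)
exact = sum(w * a for w, a in zip(weights, activations))
print(approx, exact)
```

The design point this illustrates is that the inner loop works purely on small integers, with only one floating-point multiply per block, which is why wider vector units can process many elements per instruction.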