b9735

📦 llama-cpp
1 feature 🔧 1 symbol

Summary

This release improves ggml performance by optimizing its AMX code path, which speeds up quantization benchmarks on Intel Xeon CPUs. It also ships updated pre-built binaries for numerous platforms.

✨ New Features

  • ggml: Optimized AMX performance by flattening the partition over n_batch * M to ensure every thread participates in quantization.
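
The idea of flattening a partition can be sketched roughly as follows. This is an illustrative sketch only, not the actual ggml code: the helpers `thread_range`, `busy_threads`, and `unflatten` are hypothetical names, and the "before" scheme (splitting only the M rows of one batch across threads) is an assumption made for comparison. The point is that when M is smaller than the thread count, splitting over M alone leaves threads idle, while splitting over the flat range n_batch * M can give every thread some rows.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper (not a ggml function): the half-open range [begin, end)
// of work items assigned to thread `ith` out of `nth`, split as evenly as possible.
inline std::pair<int64_t, int64_t> thread_range(int64_t total, int ith, int nth) {
    assert(total >= 0 && nth > 0 && ith >= 0 && ith < nth);
    const int64_t begin = total * ith / nth;
    const int64_t end   = total * (ith + 1) / nth;
    return {begin, end};
}

// Counts how many of the `nth` threads receive at least one work item.
inline int busy_threads(int64_t total, int nth) {
    int busy = 0;
    for (int ith = 0; ith < nth; ++ith) {
        const std::pair<int64_t, int64_t> r = thread_range(total, ith, nth);
        if (r.first < r.second) {
            ++busy;
        }
    }
    return busy;
}

// Maps an index in the flattened range [0, n_batch * M) back to (batch, row).
inline std::pair<int64_t, int64_t> unflatten(int64_t idx, int64_t M) {
    assert(idx >= 0 && M > 0);
    return {idx / M, idx % M};
}
```

Under these assumptions, with n_batch = 8, M = 4, and 16 threads, `busy_threads(4, 16)` reports 4 busy threads for a per-batch split over M, while `busy_threads(8 * 4, 16)` reports all 16 for the flattened split. Each thread would then walk its `thread_range` and call `unflatten` to recover which batch and row to quantize.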

Affected Symbols