b9735

📅 Jun 20, 2026 📦 llama-cpp

✨ 1 feature 🔧 1 symbol

Summary

This release improves ggml performance by optimizing AMX (Intel Advanced Matrix Extensions) operations, which speeds up quantization benchmarks on Intel Xeon CPUs. It also provides updated pre-built binaries for numerous platforms.

✨ New Features

ggml: Improved AMX performance by flattening the work partition over n_batch * M, so that every thread participates in quantization.
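
To illustrate why flattening the partition helps, here is a minimal C++ sketch. It is not the ggml implementation. It assumes the earlier scheme split only the M rows of each batch across threads, and all names (`thread_range`, `rows_per_batch_partition`, `rows_flat_partition`, `unflatten`) are hypothetical helpers introduced for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper (not a ggml function): half-open range [begin, end)
// of work items owned by thread `ith` of `nth`, using ceil-sized chunks.
std::pair<int64_t, int64_t> thread_range(int64_t total, int ith, int nth) {
    assert(nth > 0 && ith >= 0 && ith < nth);
    const int64_t chunk = (total + nth - 1) / nth;            // ceil(total / nth)
    const int64_t begin = std::min<int64_t>(total, chunk * ith);
    const int64_t end   = std::min<int64_t>(total, begin + chunk);
    return {begin, end};
}

// Assumed "before" scheme: each batch is split separately over its M rows.
// When M < nth, threads with ith >= M receive no rows in any batch.
int64_t rows_per_batch_partition(int64_t n_batch, int64_t M, int ith, int nth) {
    const std::pair<int64_t, int64_t> r = thread_range(M, ith, nth);
    return n_batch * (r.second - r.first);
}

// Flattened scheme: treat all n_batch * M rows as one index space and
// split that, so work is spread over every thread whenever
// n_batch * M >= nth.
int64_t rows_flat_partition(int64_t n_batch, int64_t M, int ith, int nth) {
    const std::pair<int64_t, int64_t> r = thread_range(n_batch * M, ith, nth);
    return r.second - r.first;
}

// Map a flat row index back to its (batch, row) coordinates.
struct RowId {
    int64_t batch;
    int64_t row;
};

RowId unflatten(int64_t idx, int64_t M) {
    return RowId{idx / M, idx % M};
}
```

With n_batch = 8, M = 2, and 4 threads, the per-batch split in this sketch leaves threads 2 and 3 with no rows, while the flattened split assigns 4 rows to each thread.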

Affected Symbols

ggml