
b8057

📦 llama-cpp
✨ 2 features · 🐛 7 fixes · 🔧 2 symbols

Summary

This release introduces significant performance enhancements to ggml-cpu via a new GEMM microkernel and fixes several low-level implementation issues and compiler warnings. It also adds extensive new pre-compiled binaries for various platforms and accelerators.

✨ New Features

  • Added a new GEMM microkernel implementation for ggml-cpu acceleration.
  • Introduced new pre-compiled binary distributions, including CUDA (12.4, 13.1), Vulkan, SYCL, and HIP builds for Windows, and specialized builds for openEuler.
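The register-blocking idea behind a GEMM microkernel can be sketched as follows. This is an illustrative scalar version, not the actual ggml-cpu kernel: real microkernels keep the MR×NR accumulator tile in SIMD registers and are written with intrinsics, and the function name and tile sizes here are assumptions.

```c
#include <stddef.h>

/* Illustrative 4x4 GEMM microkernel: computes C += A * B for one tile.
 * A is MR x K (row-major, leading dimension lda), B is K x NR (leading
 * dimension ldb), C is MR x NR (leading dimension ldc). The accumulator
 * tile is filled across the whole K loop and written back once at the
 * end, which is what lets a real kernel keep it in registers. */
enum { MR = 4, NR = 4 };

static void gemm_microkernel_4x4(size_t K,
                                 const float *A, size_t lda,
                                 const float *B, size_t ldb,
                                 float *C, size_t ldc) {
    float acc[MR][NR] = {{0}};  /* accumulators: live "in registers" */
    for (size_t k = 0; k < K; k++) {
        for (int i = 0; i < MR; i++)
            for (int j = 0; j < NR; j++)
                acc[i][j] += A[i * lda + k] * B[k * ldb + j];
    }
    for (int i = 0; i < MR; i++)    /* single write-back of the tile */
        for (int j = 0; j < NR; j++)
            C[i * ldc + j] += acc[i][j];
}
```

An outer driver would tile the full matrices into MR×NR blocks and call this kernel per tile; the tile sizes are chosen to match the register file of the target ISA.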

🐛 Bug Fixes

  • Added a guard for sizeless vector types.
  • Fixed handling of the case where DV is not a multiple of GGML_F32_EPR.
  • Moved memset operations out of loops for potential performance improvement.
  • Used RM=4 for the Arm architecture.
  • Converted elements in simd_gemm to int.
  • Converted types to size_t to resolve compiler warnings.
  • Added pragma to ignore aggressive loop optimizations.
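The memset fix follows a common hoisting pattern: a buffer that is reused across iterations only needs to be cleared once, outside the loop. A minimal sketch of the pattern, using a hypothetical accumulation helper rather than the actual ggml code:

```c
#include <string.h>

/* Sketch of "move memset out of loops": the accumulator only needs to
 * be zeroed once before the reduction, not on every row iteration.
 * accumulate_rows is a hypothetical example, not a ggml function. */
static void accumulate_rows(const float *src, float *acc,
                            int rows, int cols) {
    /* hoisted: clear the accumulator once, outside the row loop */
    memset(acc, 0, (size_t)cols * sizeof(float));
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            acc[c] += src[r * cols + c];
}
```

Placing the memset inside the row loop would redo the clear on every iteration (and, worse, discard the partial sums); hoisting it preserves correctness while removing redundant work.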

Affected Symbols