Change 8

b8953

📦 llama-cpp
✨ 2 features · 🐛 1 fix · 🔧 1 symbol

Summary

This release introduces Q1_0 support for ggml-webgpu, including a fast matmul kernel, and streamlines Q1_0 shared-memory initialization by removing redundant zero-fills. It also provides extensive pre-compiled binaries for macOS, Linux, Android, Windows, and openEuler.

✨ New Features

  • Added Q1_0 support to ggml-webgpu.
  • Implemented a fast matmul/matvec kernel for q1_0.

🐛 Bug Fixes

  • Removed redundant zero-fills in the ggml-webgpu Q1_0 shared memory initialization.

Affected Symbols