b8953
📦 llama-cpp
✨ 2 features · 🐛 1 fix · 🔧 1 symbol
Summary
This release adds Q1_0 support to ggml-webgpu, including a fast matmul/matvec kernel, and speeds up Q1_0 shared-memory initialization by removing redundant zero-fills. It also ships pre-compiled binaries for macOS, Linux, Android, Windows, and openEuler.
✨ New Features
- Added Q1_0 support to ggml-webgpu.
- Implemented a fast matmul/matvec kernel for Q1_0.
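
For context, block-quantized matvec kernels like the one above generally follow one common pattern: weights are stored as fixed-size blocks of low-bit integers plus a per-block scale, and each block is dequantized once into the accumulation. The sketch below is an illustrative CPU version of that pattern, not the actual ggml-webgpu shader; the block layout, `QBLOCK` size, and names are assumptions.

```c
#include <stdint.h>
#include <stddef.h>

#define QBLOCK 32  /* values per quantized block (hypothetical size) */

/* Hypothetical block-quantized layout: one float scale per
   block of QBLOCK signed 8-bit weights. */
typedef struct {
    float scale;
    int8_t q[QBLOCK];
} qblock_t;

/* y = W * x, where each of the `rows` rows of W is stored as
   cols/QBLOCK consecutive quantized blocks. */
static void quant_matvec(const qblock_t *W, const float *x,
                         float *y, size_t rows, size_t cols)
{
    size_t blocks_per_row = cols / QBLOCK;
    for (size_t r = 0; r < rows; r++) {
        float acc = 0.0f;
        for (size_t b = 0; b < blocks_per_row; b++) {
            const qblock_t *blk = &W[r * blocks_per_row + b];
            float partial = 0.0f;
            /* accumulate raw int8 products, then apply the
               block scale once instead of per element */
            for (size_t i = 0; i < QBLOCK; i++)
                partial += (float)blk->q[i] * x[b * QBLOCK + i];
            acc += blk->scale * partial;
        }
        y[r] = acc;
    }
}
```

A GPU kernel applies the same per-block dequantization, but parallelizes rows across workgroups and stages `x` in shared memory.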
🐛 Bug Fixes
- Removed redundant zero-fills in the ggml-webgpu Q1_0 shared memory initialization.
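
The zero-fill removal follows a general rule: if every element of a scratch buffer is unconditionally written before it is read, an up-front zero-fill is pure overhead. The sketch below illustrates that pattern on the CPU with hypothetical names; the actual fix lives in the ggml-webgpu shader's shared-memory setup.

```c
#include <stddef.h>

/* Fills a scratch buffer from `src`. Every slot is written
   unconditionally, so a prior zero-fill of `scratch`, e.g.
   memset(scratch, 0, n * sizeof *scratch), would be redundant
   and can be dropped. */
static void fill_scratch(float *scratch, const float *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        scratch[i] = src[i] * 2.0f;  /* overwrites any stale value */
}
```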