b9459
📦 llama-cpp
✨ 2 features 🔧 1 symbol
Summary
The Metal backend's GLU kernels are now templated to support both f16 and f32 tensors, which optimizes memory bandwidth by loading and storing in the native tensor type. Pre-built binaries are provided for various platforms and accelerators.
✨ New Features
- The Metal backend now uses a template for GLU kernels that supports f16 and f32. Loads and stores use the native tensor type to optimize memory bandwidth, while ALU compute stays in float.
- The dispatch gate was opened to allow f16 inputs for Metal backend operations.
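The pattern described above (load and store in the tensor's own type, compute in float) can be sketched in plain C++. This is only an illustration of the idea, not the actual Metal kernel: the names `narrow16`, `to_float`, `from_float`, and `glu_swiglu` are hypothetical, the SwiGLU-style gate is chosen purely as an example of a GLU variant, and `narrow16` is a bf16-style stand-in because portable C++ has no standard f16 type.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical 16-bit storage type standing in for a half-width tensor element.
// It keeps the top 16 bits of an IEEE-754 float (bf16-style truncation), which
// is NOT the f16 format; it only illustrates a narrow storage type.
struct narrow16 {
    std::uint16_t bits;
};

// Widen a stored element to the float compute type.
inline float to_float(float v) { return v; }
inline float to_float(narrow16 v) {
    std::uint32_t u = static_cast<std::uint32_t>(v.bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Narrow a float result back to the storage type T.
template <typename T> T from_float(float f);
template <> inline float from_float<float>(float f) { return f; }
template <> inline narrow16 from_float<narrow16>(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return narrow16{static_cast<std::uint16_t>(u >> 16)};
}

// One templated body serves both element types: loads and stores use T,
// while the arithmetic in between is always done in float.
template <typename T>
void glu_swiglu(const std::vector<T>& gate, const std::vector<T>& up,
                std::vector<T>& dst) {
    assert(gate.size() == up.size());
    dst.resize(gate.size());
    for (std::size_t i = 0; i < gate.size(); ++i) {
        const float g = to_float(gate[i]);             // load in native type, widen
        const float u = to_float(up[i]);
        const float silu = g / (1.0f + std::exp(-g));  // compute in float
        dst[i] = from_float<T>(silu * u);              // narrow, store in native type
    }
}
```

Instantiating `glu_swiglu<narrow16>` moves half as many bytes per element as `glu_swiglu<float>`, while both share the same float arithmetic. That is the bandwidth argument the release note makes for f16 tensors.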