Change8

b9459

📦 llama-cpp
2 features, 🔧 1 symbol

Summary

The Metal backend's GLU kernels are now templated over the tensor type, so they support both f16 and f32. Data is loaded and stored in the tensor's native type, which reduces memory bandwidth, while arithmetic is still done in float. Pre-built binaries for various platforms and accelerators are provided.

✨ New Features

  • The Metal backend now uses a template for its GLU kernels, supporting both f16 and f32. Loads and stores use the tensor's native type to save memory bandwidth, while ALU compute stays in float.
  • The Metal backend's dispatch gate now accepts f16 inputs for these operations.
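
The load/compute/store split described above can be illustrated outside of Metal. The following is a minimal sketch in Python, not the actual kernel: the function names `silu` and `swiglu_f16` are hypothetical, SwiGLU is picked only as one example of a GLU variant, and Python's 64-bit floats stand in for the kernel's float accumulator. The buffers hold half-precision values, so each element costs 2 bytes to read and write, while the arithmetic itself runs at higher precision.

```python
import math
import struct


def silu(x: float) -> float:
    """SiLU activation, x * sigmoid(x), written to avoid overflow in exp()."""
    if x >= 0.0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)  # underflows to 0.0 for very negative x, never overflows
    return x * e / (1.0 + e)


def swiglu_f16(gate_bytes: bytes, up_bytes: bytes) -> bytes:
    """Hypothetical SwiGLU over little-endian f16 buffers.

    Load:    decode each 2-byte half-precision value to a wider float.
    Compute: do all arithmetic in the wider type.
    Store:   round the result back to half precision.
    """
    if len(gate_bytes) != len(up_bytes) or len(gate_bytes) % 2 != 0:
        raise ValueError("buffers must hold the same number of f16 values")
    n = len(gate_bytes) // 2

    gate = struct.unpack(f"<{n}e", gate_bytes)  # 'e' = IEEE 754 binary16
    up = struct.unpack(f"<{n}e", up_bytes)

    out = [silu(g) * u for g, u in zip(gate, up)]

    return struct.pack(f"<{n}e", *out)


if __name__ == "__main__":
    gate = struct.pack("<2e", 0.0, 1.0)
    up = struct.pack("<2e", 2.0, 2.0)
    result = struct.unpack("<2e", swiglu_f16(gate, up))
    print(result)
```

The input and output buffers are half the size of their f32 equivalents, which is where the bandwidth saving comes from; only the final rounding back to f16 loses precision. Note that `struct.pack` raises `OverflowError` for results outside the f16 range, which is a property of this Python sketch rather than of a GPU kernel.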

Affected Symbols