b8751
📦 llama-cpp
✨ 1 feature · 🔧 1 symbol
Summary
This release makes the shared-KV tail attn_k tensors optional when loading Gemma 4 models. It also ships pre-compiled binaries for a range of platforms and hardware configurations.
✨ New Features
- Made Gemma 4 shared-KV tail attn_k tensors optional on load.
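The idea of an "optional on load" tensor can be illustrated with a minimal Python sketch. This is not the llama.cpp implementation: the helper names (get_tensor, load_attn_k, MissingTensorError), the n_shared_tail parameter, and the assumption that the last few layers are the ones reusing K/V from earlier layers are all hypothetical and chosen only for illustration. The tensor names follow the common GGUF blk.N.attn_k.weight pattern.

```python
from typing import Dict, List, Optional


class MissingTensorError(KeyError):
    """Raised when a required tensor is absent from the model file (hypothetical)."""


def get_tensor(tensors: Dict[str, list], name: str, required: bool = True) -> Optional[list]:
    # Return the tensor if present. A missing tensor is an error only
    # when the caller marked it as required.
    tensor = tensors.get(name)
    if tensor is None and required:
        raise MissingTensorError(name)
    return tensor


def load_attn_k(tensors: Dict[str, list], n_layer: int, n_shared_tail: int) -> List[Optional[list]]:
    # Assumption: the last n_shared_tail layers share K/V with earlier
    # layers, so their own attn_k weights may be missing from the file.
    first_shared = n_layer - n_shared_tail
    loaded = []
    for il in range(n_layer):
        required = il < first_shared
        loaded.append(get_tensor(tensors, f"blk.{il}.attn_k.weight", required=required))
    return loaded


if __name__ == "__main__":
    # A 4-layer toy model whose last two layers carry no attn_k weights.
    toy = {
        "blk.0.attn_k.weight": [1.0],
        "blk.1.attn_k.weight": [2.0],
    }
    print(load_attn_k(toy, n_layer=4, n_shared_tail=2))
```

In this sketch an optional tensor that is present is still loaded; it is only its absence that stops being an error. A missing attn_k in a non-tail layer still raises.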