b8522
📦 llama-cpp
✨ 1 feature · 🔧 1 symbol
Summary
This release updates llama-bench to print the "-n-cpu-moe" setting when more than one layer is offloaded, and provides pre-built binaries for a range of platforms and accelerators.
✨ New Features
- llama-bench now prints "-n-cpu-moe" when the number of offloaded layers exceeds 1.