b9110
📦 llama-cpp
🐛 2 fixes 🔧 6 symbols
Summary
This release fixes the type of the newly added n_busy_slots_per_decode server metric and updates the server README to describe the model query parameter required in router mode. Pre-compiled binaries are provided for multiple operating systems and hardware configurations.
🐛 Bug Fixes
- Fixed the type of the server's n_busy_slots_per_decode metric.
- Updated server README to correctly describe the required model query parameter for router mode.