b9789

📅 Jun 25, 2026 📦 llama-cpp

🐛 1 fix

Summary

This release fixes a bug in quantizing MoE models that use MTP. It also provides updated pre-built binaries for macOS, Linux, Android, Windows, and UI components.

🐛 Bug Fixes

Fixed a quantization issue affecting Mixture of Experts (MoE) models that use MTP (Multi-Token Prediction).