b9434

📅 May 30, 2026 · 📦 llama-cpp

🐛 2 fixes

Summary

This release fixes granularity issues affecting Qwen 3.5/3.6 models in certain tensor parallelism (TP) configurations, particularly those with 3 GPUs, and resolves an afmoe TP bug.

🐛 Bug Fixes

Fixed granularity issues for Qwen 3.5/3.6 when using 3 GPUs.
Fixed a tensor parallelism (TP) issue related to afmoe.
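
For readers unfamiliar with the term, the sketch below illustrates in general terms how a granularity constraint can interact with a 3-way split. It is not llama.cpp code and does not describe the actual fix. The function name `split_with_granularity` and the sizes 4096 and 128 are hypothetical, chosen only to show that a dimension which halves cleanly across 2 devices may need uneven, granularity-aligned shares across 3.

```python
def split_with_granularity(total, n_devices, granularity):
    """Split `total` units across `n_devices` so that every share is a
    multiple of `granularity` (hypothetical helper, for illustration only)."""
    if total % granularity != 0:
        raise ValueError("total must be a multiple of granularity")
    blocks = total // granularity          # number of indivisible blocks
    base, extra = divmod(blocks, n_devices)
    # The first `extra` devices each take one additional block.
    return [(base + (1 if i < extra else 0)) * granularity
            for i in range(n_devices)]


# Hypothetical sizes: 4096 rows, blocks of 128 rows.
print(split_with_granularity(4096, 2, 128))  # [2048, 2048]
print(split_with_granularity(4096, 3, 128))  # [1408, 1408, 1280]
```

With 3 devices the shares cannot be equal, since 4096 is not divisible by 3, so any code that assumes an even split or ignores the block size can end up with misaligned slices. Whether this matches the specific bug fixed in this release is an assumption.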