b7600

📅 Jan 1, 2026 · 📦 llama-cpp

✨ 4 features · 🐛 3 fixes · 🔧 5 symbols

Summary

This release enhances Vulkan backend support for Nemotron and DeepSeek-V2 models by extending topk_moe functionality and improving operator fusion testing.

✨ New Features

Extended Vulkan topk_moe to support sigmoid with exp_probs_b for Nemotron models.
Added support for GGML_OP_SCALE at the end of topk_moe for Nemotron and DeepSeek-V2.
Optimized Vulkan backend by reducing pipeline variants and specification constants in favor of push constants.
Enhanced test-backend-ops and ggml-backend to allow verification of multiple outputs in fusion tests.
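
For readers unfamiliar with this routing pattern, the following is a minimal reference sketch of sigmoid-gated top-k expert selection with a selection bias and a trailing scale. It is not the Vulkan shader or the ggml graph. The function name `topk_moe_route`, its parameters, and the optional normalization step are assumptions made for illustration. The sketch also assumes that the bias (`exp_probs_b`) affects only which experts are selected, while the returned weights come from the unbiased sigmoid probabilities.

```python
import math


def topk_moe_route(logits, exp_probs_b, k, normalize=True, scale=1.0):
    """Hypothetical reference for sigmoid top-k routing (illustration only)."""
    # Sigmoid gating: each expert gets an independent probability.
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    # Assumption: the 1D per-expert bias is added only for selection.
    biased = [p + b for p, b in zip(probs, exp_probs_b)]
    # Pick the k experts with the highest biased score.
    ids = sorted(range(len(logits)), key=lambda i: biased[i], reverse=True)[:k]
    # Weights are taken from the unbiased probabilities.
    weights = [probs[i] for i in ids]
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    # Trailing scale, the step that corresponds to a final GGML_OP_SCALE.
    weights = [w * scale for w in weights]
    return ids, weights


ids, weights = topk_moe_route(
    logits=[0.0, 2.0, -1.0, 1.0],
    exp_probs_b=[0.0, 0.0, 1.0, 0.0],
    k=2,
    scale=2.5,
)
```

In this example the bias lifts expert 2 into the selection even though its raw probability is the lowest, and the normalized weights sum to the scale factor.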

🐛 Bug Fixes

Disabled sigmoid fusion for MoltenVK to prevent compatibility issues.
Updated test_topk_moe to allow results in arbitrary order, improving test reliability.
Fixed test_topk_moe exp_probs_b dimension to be 1D to match real network architectures.
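
One way to accept results in arbitrary order is to compare the selected (expert id, weight) pairs after sorting them by expert id. The sketch below illustrates that idea only. The helper `same_topk_result` and its tolerance are hypothetical and are not taken from test-backend-ops.

```python
def same_topk_result(ids_a, w_a, ids_b, w_b, tol=1e-6):
    """Return True if two top-k results match regardless of output order."""
    # Malformed or differently sized results can never match.
    if len(ids_a) != len(w_a) or len(ids_b) != len(w_b) or len(ids_a) != len(ids_b):
        return False
    # Pair each expert id with its weight, then sort by id so that
    # position in the output no longer matters.
    a = sorted(zip(ids_a, w_a))
    b = sorted(zip(ids_b, w_b))
    return all(
        ia == ib and abs(wa - wb) <= tol
        for (ia, wa), (ib, wb) in zip(a, b)
    )
```

A backend that emits experts `[2, 1]` and a reference that emits `[1, 2]` then compare as equal, provided each expert carries the same weight in both.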

🔧 Affected Symbols

topk_moe, GGML_OP_SCALE, test-backend-ops, ggml-backend, vulkan