b7600

📦 llama-cpp
4 features · 🐛 3 fixes · 🔧 5 symbols

Summary

This release enhances Vulkan backend support for Nemotron and DeepSeek-V2 models by extending topk_moe functionality and improving operator fusion testing.

✨ New Features

  • Extended Vulkan topk_moe to support sigmoid with exp_probs_b for Nemotron models.
  • Added support for GGML_OP_SCALE at the end of topk_moe for Nemotron and DeepSeek-V2.
  • Optimized the Vulkan backend by reducing pipeline variants and specialization constants in favor of push constants.
  • Enhanced test-backend-ops and ggml-backend to allow verification of multiple outputs in fusion tests.
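
The routing pattern behind the first two items can be sketched on the CPU as follows. This is a minimal reference under stated assumptions, not the Vulkan shader or the ggml graph: the function name `topk_moe_sigmoid` and its parameters are hypothetical, and the sketch assumes the bias (`exp_probs_b`) affects only which experts are selected while the routing weights come from the unbiased sigmoid outputs, followed by an optional normalization and a trailing scale.

```python
import math


def topk_moe_sigmoid(logits, exp_probs_b, k, normalize=True, scale=1.0):
    """Illustrative top-k expert routing for one token (hypothetical helper)."""
    # Gate each expert independently with a sigmoid instead of a softmax.
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]

    # Assumption: the per-expert bias only influences expert selection.
    biased = [p + b for p, b in zip(probs, exp_probs_b)]
    ids = sorted(range(len(logits)), key=lambda i: biased[i], reverse=True)[:k]

    # Routing weights are taken from the unbiased probabilities.
    weights = [probs[i] for i in ids]
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]

    # Trailing scale: the step a GGML_OP_SCALE node would apply at the end.
    weights = [w * scale for w in weights]
    return ids, weights


# Expert 2 has the lowest logit but is selected first because of its bias.
ids, weights = topk_moe_sigmoid(
    logits=[0.0, 2.0, -1.0, 1.0],
    exp_probs_b=[0.0, 0.0, 2.0, 0.0],  # 1D: one bias value per expert
    k=2,
    scale=2.5,
)
print(ids, weights)
```

Fusing these steps means the backend can run the sigmoid, bias add, top-k selection, normalization and scale as one kernel rather than as separate graph nodes.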

🐛 Bug Fixes

  • Disabled sigmoid fusion for MoltenVK to prevent compatibility issues.
  • Updated test_topk_moe to allow results in arbitrary order, improving test reliability.
  • Changed exp_probs_b in test_topk_moe to a 1D tensor to match real network architectures.
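
To illustrate what accepting results in arbitrary order can mean, here is a minimal sketch of an order-insensitive comparison. It is not the code in test-backend-ops; the helper name `same_topk` and the tolerance are assumptions made for illustration. The idea is to pair each selected expert id with its weight and sort by id before comparing, so two backends that emit the same experts in a different order still match.

```python
def same_topk(ids_a, w_a, ids_b, w_b, tol=1e-5):
    """Hypothetical check: same experts and weights, regardless of order."""
    if len(ids_a) != len(ids_b):
        return False
    # Sorting (expert id, weight) pairs by id removes ordering differences.
    a = sorted(zip(ids_a, w_a))
    b = sorted(zip(ids_b, w_b))
    return all(
        ia == ib and abs(wa - wb) <= tol
        for (ia, wa), (ib, wb) in zip(a, b)
    )


# Same experts and weights, emitted in a different order by two backends.
print(same_topk([2, 1], [0.25, 0.75], [1, 2], [0.75, 0.25]))
```

A comparison like this still fails when the selected experts or their weights differ, so it relaxes only the ordering requirement.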

🔧 Affected Symbols

topk_moe, GGML_OP_SCALE, test-backend-ops, ggml-backend, vulkan