b7600
📦 llama-cpp
✨ 4 features · 🐛 3 fixes · 🔧 5 symbols
Summary
This release enhances Vulkan backend support for Nemotron and DeepSeek-V2 models by extending topk_moe functionality and improving operator fusion testing.
✨ New Features
- Extended Vulkan topk_moe to support sigmoid with exp_probs_b for Nemotron models.
- Added support for GGML_OP_SCALE at the end of topk_moe for Nemotron and DeepSeek-V2.
- Reduced the number of Vulkan pipeline variants and specialization constants by passing the corresponding parameters as push constants instead.
- Enhanced test-backend-ops and ggml-backend to allow verification of multiple outputs in fusion tests.
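The routing pattern that the extended topk_moe fusion targets can be sketched in plain Python. This is an illustrative reference only, not the Vulkan shader or the ggml graph: it assumes the DeepSeek-style scheme in which the per-expert bias exp_probs_b influences only which experts are selected, while the returned weights come from the unbiased sigmoid outputs and are optionally normalized before the final scale. The function name and the `normalize` and `scale` parameters are hypothetical.

```python
import math

def topk_moe_sigmoid(logits, exp_probs_b, k, scale=1.0, normalize=True):
    """Reference sketch of sigmoid-gated top-k expert routing.

    Assumption: the bias affects expert selection only; the weights are
    taken from the unbiased sigmoid probabilities.
    """
    # Sigmoid gating: each expert gets an independent probability.
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    # 1D per-expert bias, added only for the purpose of selection.
    biased = [p + b for p, b in zip(probs, exp_probs_b)]
    # Indices of the k experts with the largest biased score.
    selected = sorted(range(len(logits)), key=lambda i: biased[i], reverse=True)[:k]
    # Gather the unbiased probabilities of the selected experts.
    weights = [probs[i] for i in selected]
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    # Trailing scale, corresponding to a GGML_OP_SCALE at the end of the chain.
    weights = [w * scale for w in weights]
    return selected, weights
```

With `normalize=True` the returned weights sum to `scale`, which gives a quick sanity check when comparing a fused result against an unfused one.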
🐛 Bug Fixes
- Disabled sigmoid fusion for MoltenVK to prevent compatibility issues.
- Updated test_topk_moe to allow results in arbitrary order, improving test reliability.
- Fixed test_topk_moe exp_probs_b dimension to be 1D to match real network architectures.
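Accepting results in arbitrary order amounts to comparing the selected experts as a set and matching each weight by expert id instead of by position. The helper below is a minimal sketch of that idea, not the comparison code used in test-backend-ops; the name `same_topk` and the tolerance are hypothetical, and expert ids are assumed to be unique within each result.

```python
def same_topk(ids_a, weights_a, ids_b, weights_b, tol=1e-6):
    """Return True if two top-k results agree regardless of ordering."""
    # The same experts must be selected, in any order.
    if sorted(ids_a) != sorted(ids_b):
        return False
    # Look up each weight by expert id rather than by position.
    by_id = dict(zip(ids_b, weights_b))
    return all(abs(w - by_id[i]) <= tol for i, w in zip(ids_a, weights_a))
```

For example, `same_topk([2, 1], [0.3, 0.7], [1, 2], [0.7, 0.3])` holds because both results pair expert 1 with 0.7 and expert 2 with 0.3.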
🔧 Affected Symbols
topk_moe, GGML_OP_SCALE, test-backend-ops, ggml-backend, vulkan