b9330

📦 llama-cpp
🐛 1 fix 🔧 6 symbols

Summary

This release corrects the operation tag assigned to the ffn_latent tensor in Nemotron models. The wrong tag caused the weight to be placed on the wrong device (GPU vs CPU) at load time, which degraded performance. Pre-compiled binaries for several platforms are also provided.

🐛 Bug Fixes

  • Fixed ffn_latent being tagged as GGML_OP_MUL_MAT instead of GGML_OP_MUL for Nemotron models. The loader uses this tag when deciding where a weight lives, so the wrong tag led to incorrect weight placement (GPU vs CPU) during loading and to degraded performance.
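
To illustrate why a single op tag can change where a weight ends up, here is a minimal sketch of tag-driven placement. It is not llama.cpp's loader: `Op`, `Backend`, `place_weight`, and `example_backends` are hypothetical names, and the capability table is an arbitrary assumption chosen only to make the two tags diverge. It does not claim which device either tag selects for Nemotron.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; NOT the real ggml or llama.cpp types.
enum class Op { Mul, MulMat };

struct Backend {
    std::string name;
    bool supports_mul;
    bool supports_mul_mat;

    // Whether this backend can run the given op on the weight in question.
    bool supports(Op op) const {
        return op == Op::Mul ? supports_mul : supports_mul_mat;
    }
};

// Return the name of the first backend (in priority order) that supports
// the op the weight is tagged with; fall back to "CPU" if none does.
std::string place_weight(Op tagged_op, const std::vector<Backend>& backends) {
    for (const Backend& b : backends) {
        if (b.supports(tagged_op)) {
            return b.name;
        }
    }
    return "CPU";
}

// Assumed capability table: the accelerator accepts one op but not the
// other for this weight. The values are arbitrary, for demonstration only.
std::vector<Backend> example_backends() {
    return {
        {"GPU", /*supports_mul=*/true, /*supports_mul_mat=*/false},
        {"CPU", /*supports_mul=*/true, /*supports_mul_mat=*/true},
    };
}
```

Calling `place_weight(Op::Mul, example_backends())` and `place_weight(Op::MulMat, example_backends())` returns different device names under this assumed table, which is the kind of divergence a mis-tagged tensor can trigger.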

Affected Symbols