b9055

📦 llama-cpp
1 feature · 🐛 9 fixes · 🔧 8 symbols

Summary

This release introduces support for the Mimo v2.5 model, accompanied by numerous fixes related to tensor manipulation, scaling, and GGUF conversion for this new architecture.

✨ New Features

  • Added support for the Mimo v2.5 model.

🐛 Bug Fixes

  • Fixed modify_tensors row split issue for mimo-v2.5.
  • Added missing add_attn_value_scale plumbing for mimo-v2.5.
  • Fixed TP dequant to correctly detect TP rows for mimo-v2.5.
  • Fixed TP iteration order to be descending for mimo-v2.5.
  • Retained fused QKV for mimo-v2.5.
  • Fixed missed attn_value scale during merge for mimo-v2.5.
  • Ensured fused QKV is contiguous for scaling attention value in mimo-v2.5.
  • Moved speech_embeddings. to TextModel filter_tensors for mimo-v2.5.
  • Included MTP weights in gguf conversion for mimo-v2.5.
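
Several of the fixes above concern applying an attention value scale to a fused QKV tensor. As a rough illustration of that idea only, the sketch below multiplies just the V rows of a fused matrix by a scale factor and leaves the Q and K rows untouched. It is not llama.cpp's converter code: the function name scale_value_rows, the row counts, the Q/K/V row ordering, and the scale value are all hypothetical, and plain Python lists stand in for real tensors.

```python
# Illustrative sketch only. The function name, row layout (Q rows, then K rows,
# then V rows), and the scale value are assumptions, not llama.cpp's actual code.

def scale_value_rows(fused_qkv, n_q_rows, n_k_rows, n_v_rows, value_scale):
    """Return a copy of a fused QKV matrix with only its V rows scaled."""
    expected = n_q_rows + n_k_rows + n_v_rows
    if len(fused_qkv) != expected:
        raise ValueError(f"expected {expected} rows, got {len(fused_qkv)}")

    v_start = n_q_rows + n_k_rows  # index of the first V row
    out = []
    for i, row in enumerate(fused_qkv):
        if i >= v_start:
            # V rows: apply the attention value scale
            out.append([x * value_scale for x in row])
        else:
            # Q and K rows: copy unchanged
            out.append(list(row))
    return out


fused = [
    [1.0, 2.0],  # Q row
    [3.0, 4.0],  # Q row
    [5.0, 6.0],  # K row
    [7.0, 8.0],  # V row
]
scaled = scale_value_rows(fused, n_q_rows=2, n_k_rows=1, n_v_rows=1, value_scale=0.5)
```

The function returns a new matrix instead of modifying its input, so the original fused rows stay available if a later step needs the unscaled values.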

Affected Symbols