
b9626

📦 llama-cpp
3 features · 🐛 4 fixes · 🔧 16 symbols

Summary

This release adds architecture support for cohere2-MoE and includes several internal cleanups and fixes related to model loading, MTP, and sliding-window pattern handling. Several names were also updated, including the cohere2-moe tokenizer type.

Migration Steps

  1. If you use the old tokenizer type for cohere2-moe, note that it has been removed and replaced by tiny_aya. North-Mini-Code-1.0 has also been renamed.
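
One way to spot models affected by the tokenizer change is to inspect their metadata before loading. The sketch below is illustrative only: it assumes the metadata has already been read into a plain dict, that the tokenizer type is stored under the GGUF key `tokenizer.ggml.pre`, and that the architecture string is `cohere2moe`. The helper name `needs_reconversion` is hypothetical and not part of llama.cpp.

```python
# Hypothetical helper, not part of llama.cpp.
# Assumptions: metadata is a plain dict, the tokenizer type lives under
# "tokenizer.ggml.pre", and the architecture string is "cohere2moe".
ARCH_KEY = "general.architecture"
PRE_TOKENIZER_KEY = "tokenizer.ggml.pre"


def needs_reconversion(metadata: dict) -> bool:
    """Return True if a cohere2moe model does not declare the tiny_aya tokenizer type."""
    if metadata.get(ARCH_KEY) != "cohere2moe":
        # Other architectures are not affected by this rename.
        return False
    return metadata.get(PRE_TOKENIZER_KEY) != "tiny_aya"


if __name__ == "__main__":
    # "old-type" is a placeholder; the release notes do not name the removed type.
    sample = {ARCH_KEY: "cohere2moe", PRE_TOKENIZER_KEY: "old-type"}
    print(needs_reconversion(sample))
```

A model flagged this way would most likely need to be re-converted with an up-to-date conversion script, though the release notes do not spell out the exact procedure.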

✨ New Features

  • Added architecture support for cohere2-MoE.
  • Added support for Command models to use LayerNorm by checking for zero-bias tensors.
  • Added cohere2moe to the Llama Model Saver supported list.

🐛 Bug Fixes

  • Fixed an issue with sliding_window_pattern and pattern handling.
  • Fixed a transformers crash caused by a 'first_k_dense_replace' error.
  • Fixed an MTP failure by switching to iSWA.
  • Resolved the remaining TODOs related to the cohere2moe renaming and SWA parsing.
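
For readers unfamiliar with sliding-window patterns, the sketch below shows what an interleaved layout generally looks like. It does not reproduce the fix in this release. It assumes the common convention that, with a pattern of n, every n-th layer uses full attention and the others use sliding-window attention (SWA). The function name `swa_layer_mask` is hypothetical.

```python
def swa_layer_mask(n_layers: int, pattern: int) -> list[bool]:
    """Return one flag per layer: True for sliding-window attention, False for full attention.

    Assumed convention (not confirmed by the release notes): every
    `pattern`-th layer, counting from 1, uses full attention.
    """
    if pattern < 1:
        raise ValueError("pattern must be >= 1")
    return [(il + 1) % pattern != 0 for il in range(n_layers)]


if __name__ == "__main__":
    # With 8 layers and a pattern of 4, layers 4 and 8 use full attention.
    for il, is_swa in enumerate(swa_layer_mask(8, 4), start=1):
        print(il, "SWA" if is_swa else "full")
```

Under this convention a pattern of 1 means no layer uses SWA, which is one edge case a parser for this setting has to handle.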

Affected Symbols