b9626
📦 llama-cpp
✨ 3 features · 🐛 4 fixes · 🔧 16 symbols
Summary
This release adds architecture support for cohere2-MoE and includes internal cleanups and fixes related to model loading, MTP, and sliding-window pattern handling. Some names were also updated, including the cohere2-moe tokenizer type.
Migration Steps
- The old tokenizer type for cohere2-moe has been removed and replaced by tiny_aya. If you still reference the old type, switch to tiny_aya.
- North-Mini-Code-1.0 has been renamed.
✨ New Features
- Added architecture support for cohere2-MoE.
- Added support for Command models that use LayerNorm, detected by checking for zero-bias tensors.
- Added cohere2moe to Llama Model Saver supported list.
🐛 Bug Fixes
- Fixed an issue with sliding_window_pattern and pattern handling.
- Fixed transformers crash related to 'first_k_dense_replace' error.
- Fixed an MTP failure by switching to iSWA.
- Resolved the remaining TODOs related to the cohere2moe renaming and SWA parsing.
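
For readers unfamiliar with the `sliding_window_pattern` setting mentioned in the fixes above, here is a minimal Python sketch of one common convention: with a pattern of N, every N-th layer uses full attention and the other layers use sliding-window attention (SWA). The function name `swa_layer_mask` and the example values are hypothetical. This is an illustration under that assumption, not code from llama.cpp, and the actual cohere2-MoE handling may differ.

```python
def swa_layer_mask(n_layers: int, pattern: int) -> list[bool]:
    """Return True for each layer that uses sliding-window attention.

    Assumed convention (not confirmed by these release notes): within each
    group of `pattern` consecutive layers, the last layer uses full
    attention and the others use sliding-window attention. A pattern of
    0 or 1 is treated as "no sliding-window layers".
    """
    if n_layers < 0 or pattern < 0:
        raise ValueError("n_layers and pattern must be non-negative")
    if pattern <= 1:
        return [False] * n_layers
    return [(i % pattern) < (pattern - 1) for i in range(n_layers)]


if __name__ == "__main__":
    # Hypothetical model with 8 layers and a pattern of 4.
    mask = swa_layer_mask(n_layers=8, pattern=4)
    print(["swa" if uses_swa else "full" for uses_swa in mask])
```

Under this assumed convention, a pattern of 4 yields three sliding-window layers followed by one full-attention layer, repeated across the model.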