b9804
📦 llama-cpp
✨ 2 features · 🐛 2 fixes · 🔧 3 symbols
Summary
This release focuses on improving Mamba2 model conversion by removing the hardcoded 2x expansion factor and fixing an erroneous parameter check. It also provides updated pre-built binaries for broad platform compatibility.
Migration Steps
- If using Mamba2 conversion scripts, be aware that the expansion factor is now configurable and defaults to 2; ensure any custom logic relying on the hardcoded 2x factor is updated.
- If custom logic depends on the expansion setting, check whether `mamba_expand` is present and fall back to the default of 2 when it is absent.
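For custom tooling that previously assumed a fixed 2x expansion, the steps above can be sketched roughly as follows. This is an illustrative helper, not code from llama.cpp: the function name `resolve_d_inner` and the `hidden_size` key are assumptions, and only `mamba_expand` and the default of 2 come from these notes. It relies on the standard Mamba relation that the inner width equals the expansion factor times the model width.

```python
# Hypothetical helper (not part of llama.cpp): derive the Mamba2 inner width
# from a config dictionary without assuming a fixed 2x expansion.
def resolve_d_inner(hparams: dict, default_expand: float = 2) -> int:
    # Assumption: the model width is stored under "hidden_size".
    d_model = hparams["hidden_size"]

    # The expansion factor is optional; fall back to 2 when it is absent.
    expand = hparams.get("mamba_expand")
    if expand is None:
        expand = default_expand

    # Standard Mamba relation: d_inner = expand * d_model.
    return int(expand * d_model)


if __name__ == "__main__":
    print(resolve_d_inner({"hidden_size": 1024}))                     # key absent, default used
    print(resolve_d_inner({"hidden_size": 1024, "mamba_expand": 3}))  # explicit factor
```

Keeping the fallback in a single place means a config without `mamba_expand` behaves as it did under the old hardcoded factor, while a config that sets it is honored.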
✨ New Features
- Mamba2 model conversion now supports arbitrary expansion factors instead of a hardcoded 2x expansion.
- The `convert_hf_to_gguf.py` script now makes the expansion factor optional, defaulting to 2.
🐛 Bug Fixes
- Removed an invalid Mamba2 check on `d_inner % d_state`, which enforced a constraint unrelated to the model's parameters.
- Applied the expansion factor fix to the refactored conversion logic in `mamba.py`.