b9804

📦 llama-cpp
2 features · 🐛 2 fixes · 🔧 3 symbols

Summary

This release improves Mamba2 model conversion by removing the hardcoded 2x expansion factor and removing an invalid `d_inner % d_state` check. It also provides updated pre-built binaries for broad platform compatibility.

Migration Steps

  1. If you use the Mamba2 conversion scripts, note that the expansion factor is now configurable and defaults to 2; update any custom logic that relies on the hardcoded 2x factor.
  2. If your custom logic depends on the expansion setting, check whether `mamba_expand` is present before reading it, and fall back to 2 when it is absent.
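
For custom tooling that previously assumed a 2x expansion, the following is a minimal sketch of reading the factor defensively. It is not code from `convert_hf_to_gguf.py`: the helper name `resolve_d_inner`, the `hidden_size` key, and the idea that `mamba_expand` sits in a plain config dictionary are assumptions made for illustration. The only behavior taken from this release is that the factor is optional and defaults to 2.

```python
# Illustrative only: `resolve_d_inner` and the `hidden_size` key are
# hypothetical, not taken from convert_hf_to_gguf.py.

def resolve_d_inner(hparams: dict, default_expand: int = 2) -> int:
    """Return the Mamba2 inner dimension for a config dictionary."""
    d_model = hparams["hidden_size"]
    # Treat `mamba_expand` as optional; fall back to the default of 2.
    expand = hparams.get("mamba_expand", default_expand)
    d_inner = expand * d_model
    # Reject factors that would produce a fractional inner dimension.
    if d_inner != int(d_inner):
        raise ValueError(
            f"expand={expand} gives a non-integer d_inner for d_model={d_model}"
        )
    return int(d_inner)


legacy = resolve_d_inner({"hidden_size": 1024})                     # no key: 2x
custom = resolve_d_inner({"hidden_size": 1024, "mamba_expand": 3})  # explicit 3x
print(legacy, custom)
```

Keeping the default in one helper means configurations without the key behave as they did under the old hardcoded factor, while configurations that set it are no longer forced to 2x.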

✨ New Features

  • Mamba2 model conversion now supports arbitrary expansion factors instead of a hardcoded 2x expansion.
  • The `convert_hf_to_gguf.py` script now treats the expansion factor as optional and defaults to 2 when it is not specified.

🐛 Bug Fixes

  • Removed an invalid Mamba2 check on `d_inner % d_state`, which was unrelated to these parameters.
  • Applied the expansion factor fix to the refactored conversion logic in `mamba.py`.

Affected Symbols