May-2025
📦 unsloth
✨ 5 features · 🐛 12 fixes · 🔧 5 symbols
Summary
This release introduces official support for Qwen3 models, including the ability to fine-tune the 30B MoE variant. Numerous bug fixes address compatibility issues and quantization errors, and improve overall stability.
Migration Steps
- Update Unsloth via `pip install --upgrade --force-reinstall unsloth unsloth_zoo`.
✨ New Features
- Added support for Qwen3 models, including the 14B and 4B variants (see the loading sketch after this list).
- The 30B MoE model (`unsloth/Qwen3-30B-A3B`) is now fine-tunable in Unsloth.
- Introduced support for full fine-tuning via the `full_finetuning = True` parameter in `FastModel.from_pretrained`.
- Added support for a custom `auto_model` for wider model compatibility (e.g., Whisper, BERT); see the second sketch below.
- Added support for passing an `attn_implementation` argument for attention configuration.
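
A minimal loading sketch covering the Qwen3 and `full_finetuning` features above; the checkpoint name follows the release notes, while the sequence length and precision choices are illustrative assumptions:

```python
from unsloth import FastModel

# Illustrative sketch: load a Qwen3 model with the new full fine-tuning flag.
# max_seq_length is an assumption; pick what fits your data and GPU.
model, tokenizer = FastModel.from_pretrained(
    "unsloth/Qwen3-14B",
    max_seq_length = 2048,
    load_in_4bit = False,       # full fine-tuning generally runs in 16-bit
    full_finetuning = True,     # new in this release
)
```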
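
And a hedged sketch of the `auto_model` and `attn_implementation` options, using Whisper as the example architecture; the model name and attention backend are assumptions, not values mandated by the release:

```python
from unsloth import FastModel
from transformers import WhisperForConditionalGeneration

# Illustrative sketch: override the auto class so Unsloth can load a
# non-causal-LM architecture, and choose the attention backend explicitly.
model, tokenizer = FastModel.from_pretrained(
    "openai/whisper-large-v3",                     # assumed checkpoint
    auto_model = WhisperForConditionalGeneration,  # custom auto class
    attn_implementation = "sdpa",                  # forwarded per this release
)
```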
🐛 Bug Fixes
- Fixed an issue where `load_in_4bit = True` conflicted with `fast_inference = True`; the two can now be combined (see the first sketch after this list).
- Resolved an issue in `unsloth_fast_generate` where `model` was not defined.
- Ensured `trust_remote_code` propagates correctly down to `unsloth_compile_transformers`.
- Improved error reporting by showing `peft_error` when applicable.
- Fixed the handling of error messages related to generation prompts.
- Fixed configuration issue related to `config.torch_dtype` in `LlamaModel_fast_forward_inference`.
- Updated support for the new 8-bit full fine-tuning (FFT) functionality.
- Fixed a `NameError` by adding a missing `importlib` import in utilities.
- Added a test covering QLoRA training followed by a 16-bit merge.
- Fixed compatibility issues with Transformers version 4.45.
- Fixed saving 4-bit quantized checkpoints for Vision-Language Models (VLMs); see the second sketch after this list.
- Applied various fixes for Qwen3 inference and Qwen3 QK norm.
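
A minimal sketch of the now-compatible 4-bit loading plus fast-inference combination noted above; the model name is an illustrative assumption:

```python
from unsloth import FastLanguageModel

# Illustrative sketch: these two flags previously conflicted; after this
# release they can be combined in a single load.
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/Qwen3-4B",
    load_in_4bit = True,
    fast_inference = True,
)
```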
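
And a hedged sketch of the VLM 4-bit saving path fixed above, assuming a model and tokenizer already fine-tuned as in the earlier sketches; the output directory is illustrative, and `save_method` follows Unsloth's documented merge options:

```python
# Illustrative sketch: merge a fine-tuned VLM and save it in 4-bit.
# "merged_4bit_forced" is one of Unsloth's documented save_method values;
# the directory name is an assumption.
model.save_pretrained_merged(
    "my-vlm-merged-4bit",
    tokenizer,
    save_method = "merged_4bit_forced",
)
```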