June 2025
📦 unsloth · View on GitHub →
⚠ 2 breaking · ✨ 16 features · 🐛 23 fixes · 🔧 9 symbols
Summary
This release introduces major new capabilities including support for multimodal Gemma 3n models and Text-to-Speech fine-tuning, alongside new quantization methods (Dynamic 2.0 GGUFs) and support for DeepSeek-R1-0528 and Magistral-24B.
⚠️ Breaking Changes
- Removed `dataset_text_field` from `SFTConfig`. Users should ensure they are not relying on this field being present.
- The `SFTTrainer` now favors `max_seq_length` over `max_length` in its configuration. If you explicitly set both, make sure `max_seq_length` holds the intended sequence length (see the sketch below).
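A minimal sketch of a post-upgrade training setup, assuming the TRL-style `SFTConfig`/`SFTTrainer` flow that unsloth patches; the checkpoint name, dataset path, and some keyword arguments (e.g. `max_steps`, `tokenizer` vs. `processing_class`) are illustrative assumptions rather than exact API guarantees:

```python
# Minimal post-upgrade sketch; the checkpoint, dataset path, and some keyword
# names are illustrative and may differ across unsloth/TRL versions.
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/llama-3-8b-bnb-4bit",   # illustrative checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)

dataset = load_dataset("json", data_files="train.jsonl", split="train")  # illustrative path

config = SFTConfig(
    output_dir="outputs",
    per_device_train_batch_size=2,
    max_steps=60,
    max_seq_length=2048,             # now favored over max_length
    # Per this release, do not rely on dataset_text_field being present on
    # SFTConfig; the dataset itself should provide a "text" column instead.
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,             # processing_class=tokenizer on newer TRL
    train_dataset=dataset,
    args=config,
)
trainer.train()
```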
Migration Steps
- Update Unsloth via `pip install --upgrade --force-reinstall unsloth unsloth_zoo`.
✨ New Features
- Support for Google's new Gemma 3n multimodal models (text, image, video & audio); a loading sketch follows this list.
- Introduction of Text-to-Speech (TTS) and Speech-to-Text (STT) fine-tuning support for models such as Sesame-CSM and Orpheus-TTS, offering 1.5x faster training and 50% lower VRAM usage.
- Support for DeepSeek-R1-0528-Qwen3 fine-tuning using GRPO with a new reward function that increases multilingual response rates by 40%+.
- Introduction of Dynamic 1-bit GGUFs for DeepSeek-R1-0528, shrinking the model size significantly (e.g., 715GB to 175GB).
- New Dynamic 2.0 GGUFs quantization method achieving SOTA performance on MMLU and KL Divergence by selectively quantizing layers.
- Advanced GRPO notebook for Qwen3 featuring proximity scoring for better reward functions and a new pre-finetuning (priming) stage that skips GRPO format learning.
- Fine-tuning support for Magistral-24B for advanced conversational reasoning.
- Fine-tuning support for Gemma3 vision models for multimodal tasks.
- MoE Kernel support added.
- Blackwell GPU architecture support added.
- Intel GPU support enabled across multiple patches.
- vLLM Windows CUDA support added and tested.
- Support for Sesame CSM added.
- Qwen-3 chat template and Ollama template support added.
- Llama4 MoE Grouped GEMM support added.
- Reward modeling update implemented.
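As referenced above, a minimal loading sketch for the new Gemma 3n support, assuming unsloth's `FastModel` loader and `get_peft_model` helper cover these checkpoints; the model identifier and LoRA hyperparameters are illustrative:

```python
# Minimal sketch, assuming FastModel covers the new Gemma 3n checkpoints;
# the model identifier and LoRA settings below are illustrative.
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    "unsloth/gemma-3n-E4B-it",   # illustrative Gemma 3n checkpoint name
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters for parameter-efficient fine-tuning.
model = FastModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```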
🐛 Bug Fixes
- Fixed an issue where the Pixtral vision notebook failed during inference.
- Fixed `trust_remote_code` handling.
- Fixed typos in various places.
- Fixed issue with Qwen3 template double quote escapes.
- Fixed the unsupported RoPE scaling error so it displays the model name.
- Fixed Whisper and ModernBERT issues.
- Improved error handling when llama.cpp build fails.
- Fixed SFT training compatibility with the latest TRL version.
- Fixed issue where `skip_prepare_dataset` was not checked before accessing dataset fields.
- Fixed quantization model parameter fetch regex.
- Fixed batched generation for prompts of different lengths.
- Fixed Unsloth checkpointing compatibility with latest transformers==4.52.x.
- Patched SFTTrainer to favor `max_seq_length` over `max_length` in config.
- Updated prepare 4d causal attention call.
- `None` values are now ignored when building the vLLM `subprocess_command`.
- Added support for torch 2.7.0 with Intel GPUs.
- Made protobuf version constraint more flexible.
- Fixed renaming logic on models other than Llama.
- Enabled vLLM to share memory space.
- Fixed TRL 1.8.2 compatibility issues.
- Fixed AttributeError in GRPO trainer for models lacking an `llm` attribute.
- Fixed `grpo_compute_loss_slow` calculation.
- General GRPO fixes implemented.
🔧 Affected Symbols
`SFTConfig`, Whisper, Sesame-CSM, Orpheus-TTS, DeepSeek-R1-0528-Qwen3, Magistral-24B, Gemma3 vision models, `LoraConfig`, `trl.SFTTrainer`