September-2025-v3
Breaking Changes · 📦 unsloth
⚠ 1 breaking · ✨ 4 features · 🐛 4 fixes · 🔧 2 symbols
Summary
This release introduces significant performance gains and new capabilities for gpt-oss Reinforcement Learning, alongside support for new models such as DeepSeek-V3.1-Terminus and Magistral 1.2. Several bugs were also fixed, including BERT loading issues and the QAT + LoRA fast path.
⚠️ Breaking Changes
- The Transformers inference code was rewritten to enable faster RL inference for gpt-oss, since gpt-oss is not yet vLLM compatible. Users relying on the previous gpt-oss RL inference implementation may see changes in behavior or performance characteristics.
Migration Steps
- Ensure you have the latest version of transformers installed to support fine-tuning Qwen3 models.
- Do NOT use Flash Attention 3 when training gpt-oss models, as it will result in incorrect training loss.
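A minimal sketch of the migration, assuming the unsloth `FastLanguageModel` API; the model id, sequence length, and 4-bit setting below are illustrative assumptions, not values specified by this release:

```python
# Hedged migration sketch (illustrative values only).
# First upgrade dependencies, e.g.:
#   pip install --upgrade transformers unsloth

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gpt-oss-20b",  # assumed model id for illustration
    max_seq_length=8192,               # illustrative context length
    load_in_4bit=True,
    # Leave the attention implementation at its default here: per the migration
    # notes, Flash Attention 3 must not be used when training gpt-oss models,
    # as it produces incorrect training loss.
)
```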
✨ New Features
- Introduction of gpt-oss RL support, offering the fastest inference (~3x faster), lowest VRAM (50% less), and most context (8x longer) compared to other implementations.
- Release of DeepSeek-V3.1-Terminus, available locally via GGUF.
- Release of Magistral 1.2, available for local use or fine-tuning.
- Support for fine-tuning new Qwen3 models including Qwen3-VL, Qwen3-Omni, and Qwen3-Next (requires latest transformers).
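A minimal sketch of fine-tuning one of the new Qwen3 checkpoints with unsloth and LoRA; the model id, LoRA rank, and target module names are illustrative assumptions (Qwen3-Next's hybrid blocks may use different projection names), not an exact recipe from this release:

```python
# Hedged fine-tuning sketch for a new Qwen3 model (illustrative values only).
# Requires the latest transformers, as noted in the feature list.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-Next-80B-A3B-Instruct",  # assumed model id
    max_seq_length=4096,                            # illustrative context length
    load_in_4bit=True,
)

# Attach LoRA adapters; projection names below are the usual attention/MLP
# targets and are assumptions here, not release-mandated values.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```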
🐛 Bug Fixes
- Fixed loading issues for BERT.
- Fixed QAT + LoRA fast path.
- Applied gemma3n embedder patch and adjusted FORCE_FLOAT32 match logic.
- Fixed BERT functionality.
🔧 Affected Symbols
- Transformers inference code (rewritten for gpt-oss RL)
- BERT