October-2025
📦 unsloth · View on GitHub →
✨ 14 features · 🐛 24 fixes · 🔧 5 symbols
Summary
This release introduces major platform support, including Docker images and Blackwell/DGX hardware compatibility, alongside significant new features like Quantization-Aware Training (QAT) and extensive RL environment utilities.
Migration Steps
- Update Unsloth via `pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo`.
- If you specifically want PyTorch 2.9, use: `pip install --upgrade unsloth unsloth_zoo`.
✨ New Features
- Introduction of an official Unsloth Docker image for simplified training setup.
- Support added for NVIDIA Blackwell and DGX Spark hardware.
- Full support for Qwen3-VL models.
- Support added for IBM Granite-4.0 models.
- Introduction of Quantization-Aware Training (QAT) in collaboration with PyTorch, aiming to recover significant accuracy.
- Support for OpenEnv to enable open Reinforcement Learning environments.
- New customer support agent notebook demonstrating real-time analysis and training using Google Sheets data.
- Added support for Python 3.13, PyTorch 2.9, and the latest Hugging Face TRL and transformers libraries.
- Save to TorchAO is now supported via `model.save_pretrained_torchao()` using configurations such as `Int4WeightOnlyConfig()` (see the first sketch after this list).
- New RL environment function `execute_with_time_limit` to enforce execution time limits on functions (see the second sketch after this list).
- New RL environment function `check_python_modules` to verify that a function uses only Python standard-library modules.
- New RL environment function `create_locked_down_function` to create functions without leakage of global variables.
- New RL environment utility `Benchmarker` for accurate function benchmarking by wiping the L1 to L3 caches.
- New RL environment function `launch_openenv` to launch a continuously reloaded OpenEnv environment process on an available port.
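
A minimal sketch of the TorchAO export path: only `model.save_pretrained_torchao()` and `Int4WeightOnlyConfig()` are named in these notes, so the model choice, the output directory, and the `torchao_config` keyword are illustrative assumptions rather than the confirmed signature.

```python
# Hedged sketch: fine-tune with Unsloth, then export via TorchAO.
# save_pretrained_torchao() and Int4WeightOnlyConfig() come from the notes;
# the keyword name, model, and output path below are assumptions.
from torchao.quantization import Int4WeightOnlyConfig
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Llama-3.2-1B-Instruct",  # any supported model
    max_seq_length = 2048,
)

# ... run your usual SFT / GRPO training here ...

model.save_pretrained_torchao(
    "model_torchao",                          # output directory (assumed)
    tokenizer,
    torchao_config = Int4WeightOnlyConfig(),  # int4 weight-only quantization
)
```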
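A second hedged sketch shows how the new time-limit utility could be used inside an RL reward function; the import path and decorator-style usage with a seconds argument are assumptions based on the feature description, not a confirmed API.

```python
# Hedged sketch: cap how long model-generated code may run during an RL
# reward check. Import path and decorator signature are assumptions.
from unsloth import execute_with_time_limit

@execute_with_time_limit(2)  # assumed: limit in seconds, raises on timeout
def run_candidate(code_string: str):
    # Execute the generated code in an empty global namespace.
    exec(code_string, {})

def reward_fast_code(code_string: str) -> float:
    try:
        run_candidate(code_string)
        return 1.0   # finished within the time limit
    except Exception:
        return 0.0   # timed out or crashed
```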
🐛 Bug Fixes
- Fixed standby VRAM consumption issue; GPU memory utilization is now auto-selected between 80% and 95% when `UNSLOTH_VLLM_STANDBY="1"` is set (see the snippet after this list).
- Fixed GRPO training hangs with improved environment timers, ensuring compatibility with DGX Spark and other GPUs.
- Fixed GRPO `RuntimeError: shape '[1, 887, 1, 128]' is invalid for input of size 3633152` for all models.
- Fixed GPT-OSS BF16 issue where `GptOssTopKRouter` lacked the `weight` attribute when `load_in_4bit = True`.
- Fixed Mistral training, which was broken by a sentencepiece proto issue (now compatible with any protobuf version).
- Fixed evaluation when `UNSLOTH_RETURN_LOGITS="1"` is set (addressing issues #3126, #3071).
- Fixed `Output 0 of UnslothFusedLossBackward is a view and is being modified inplace.` error for Gemma 3 when using `transformers>=4.57.1`.
- Fixed `ImportError: cannot import name '_Ink' from 'PIL._typing'`; users should update Unsloth and switch to the new notebooks.
- Fixed loading models in 8bit.
- Fixed QAT configuration to use `Int8DynamicActivationIntxWeightConfig`.
- Fixed Gemma 3 bugs.
- Fixed Transformers compatibility issue due to rename from `PretrainedConfig` to `PreTrainedConfig` in v4.57.
- Fixed evaluation metric issues.
- Reinstated llama.cpp compatibility and GGUF conversion with multiple quantizations, and automated Ollama Modelfile creation.
- Added vLLM FP8 quantized support for SFT/GRPO.
- Applied AMD fixes.
- Fixed Transformers 4.57.1 compatibility issues.
- Fixed GRPO bugs.
- Normalized line endings to LF (Unix style).
- Fixed out-of-resources issue for Llama 3.2 SFT on AMD GPUs.
- Fixed cross-entropy loss issue for small vocabulary sizes on AMD GPUs.
- Fixed Gemma 3n issue.
- Patched sleep mode properly for TRL.
- Enabled Intel support for torch 2.8.
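
For reference, a minimal snippet showing how the environment flags referenced above are typically set. The variable names come from these notes; setting them before importing Unsloth is an assumption about when they are read.

```python
import os

# Flags referenced in the fixes above; values are strings.
os.environ["UNSLOTH_VLLM_STANDBY"] = "1"   # standby mode; GPU utilization auto-selected
os.environ["UNSLOTH_RETURN_LOGITS"] = "1"  # return logits during evaluation

# Set the flags before importing Unsloth (assumed to be read at import/setup time).
from unsloth import FastLanguageModel
```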
🔧 Affected Symbols
`GptOssTopKRouter`, `Int4WeightOnlyConfig`, `UnslothFusedLossBackward`, `PretrainedConfig`, `PreTrainedConfig`