Changelog

October 2025

📦 unsloth
✨ 14 features · 🐛 24 fixes · 🔧 5 symbols

Summary

This release introduces major platform support, including Docker images and Blackwell/DGX hardware compatibility, alongside significant new features like Quantization-Aware Training (QAT) and extensive RL environment utilities.

Migration Steps

  1. Update Unsloth via `pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo`.
  2. If you specifically want PyTorch 2.9, use: `pip install --upgrade unsloth unsloth_zoo`.

✨ New Features

  • Introduction of an official Unsloth Docker image for simplified training setup.
  • Support added for NVIDIA Blackwell and DGX Spark hardware.
  • Full support for Qwen3-VL models.
  • Support added for IBM Granite-4.0 models.
  • Introduction of Quantization-Aware Training (QAT) in collaboration with PyTorch, aiming to recover accuracy lost to quantization.
  • Support for OpenEnv to enable open Reinforcement Learning environments.
  • New customer support agent notebook demonstrating real-time analysis and training using Google Sheets data.
  • Support for Python 3.13, PyTorch 2.9, and the latest Hugging Face TRL and transformers libraries.
  • Save to TorchAO is now supported via `model.save_pretrained_torchao()` with configurations like `Int4WeightOnlyConfig()` (a hedged sketch follows this list).
  • New RL environment function `execute_with_time_limit` to enforce execution time limits on functions (a usage sketch follows this list).
  • New RL environment function `check_python_modules` to verify that a function uses only Python standard-library modules.
  • New RL environment function `create_locked_down_function` to create functions that cannot leak global variables.
  • New RL environment utility `Benchmarker` for accurate function benchmarking by wiping the L1 to L3 caches.
  • New RL environment function `launch_openenv` to launch a continuously reloaded OpenEnv environment process on an available port.
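
As a quick illustration of the new TorchAO export, here is a minimal sketch. Only `save_pretrained_torchao()` and `Int4WeightOnlyConfig()` are named in this release; the model choice, loading arguments, and the `torchao_config` keyword are assumptions for illustration.

```python
# Hedged sketch: exporting an Unsloth model with a TorchAO config.
# The `torchao_config` keyword name is an assumption; check the Unsloth
# docs for the exact signature.
from unsloth import FastLanguageModel
from torchao.quantization import Int4WeightOnlyConfig

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct",  # illustrative model choice
    load_in_4bit=False,  # quantize at save time instead of load time
)

model.save_pretrained_torchao(
    "llama-3.2-1b-torchao-int4",
    torchao_config=Int4WeightOnlyConfig(),  # int4 weight-only quantization
)
```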
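
Likewise, a hypothetical usage sketch for the RL environment helpers. The function names come from this release, but the import path and the decorator-style signature of `execute_with_time_limit` are assumptions.

```python
# Hedged sketch: guarding an RL reward function with the new helpers.
# Import path and call shapes are assumptions based on the release notes.
from unsloth import check_python_modules, execute_with_time_limit

@execute_with_time_limit(2)  # assumed decorator form: 2-second wall-clock cap
def reward_fn(completion: str) -> float:
    # Potentially long-running scoring of a model completion.
    return float(len(completion))

# Check that the reward function relies only on standard-library modules
# before running it inside a locked-down environment.
print(check_python_modules(reward_fn))

score = reward_fn("a sampled completion")  # fails fast past the time limit
```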

🐛 Bug Fixes

  • Fixed the standby VRAM consumption issue; GPU utilization is now auto-selected between 80% and 95% when `UNSLOTH_VLLM_STANDBY="1"` is set (see the snippet after this list).
  • Fixed GRPO training hangs with improved environment timers, ensuring compatibility with DGX Spark and other GPUs.
  • Fixed GRPO `RuntimeError: shape '[1, 887, 1, 128]' is invalid for input of size 3633152` for all models.
  • Fixed GPT-OSS BF16 issue where `GptOssTopKRouter` lacked the `weight` attribute when `load_in_4bit = True`.
  • Fixed Mistral training broken by a sentencepiece proto issue; Unsloth is now compatible with any protobuf version.
  • Fixed evaluation when `UNSLOTH_RETURN_LOGITS="1"` is set (addressing issues #3126, #3071).
  • Fixed `Output 0 of UnslothFusedLossBackward is a view and is being modified inplace.` error for Gemma 3 when using `transformers>=4.57.1`.
  • Addressed `ImportError: cannot import name '_Ink' from 'PIL._typing'`; affected users should upgrade Unsloth and switch to the updated notebooks.
  • Fixed loading models in 8-bit.
  • Fixed QAT configuration to use `Int8DynamicActivationIntxWeightConfig`.
  • Fixed Gemma 3 bugs.
  • Fixed Transformers compatibility issue due to rename from `PretrainedConfig` to `PreTrainedConfig` in v4.57.
  • Fixed evaluation metric issues.
  • Reinstated llama.cpp compatibility and GGUF conversion with multiple quantization methods, plus automated Ollama Modelfile creation (see the GGUF sketch after this list).
  • Added vLLM FP8 quantized support for SFT/GRPO.
  • Applied AMD fixes.
  • Fixed Transformers 4.57.1 compatibility issues.
  • Fixed GRPO bugs.
  • Normalized line endings to LF (Unix).
  • Fixed an out-of-resources issue for Llama 3.2 SFT on AMD GPUs.
  • Fixed a cross-entropy loss issue with small vocabulary sizes on AMD GPUs.
  • Fixed Gemma 3n issue.
  • Patched sleep mode properly for TRL.
  • Enabled Intel support for PyTorch 2.8.
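
For the two environment flags referenced above, a small sketch of how they might be enabled; the variable names come from this release, while setting them before importing unsloth (so they are read at import time) is an assumption.

```python
# Hedged sketch: enabling the standby and return-logits flags. Setting
# them before importing unsloth is an assumption about when they are read.
import os

os.environ["UNSLOTH_VLLM_STANDBY"] = "1"   # standby mode; ~80-95% GPU utilization auto-selected
os.environ["UNSLOTH_RETURN_LOGITS"] = "1"  # return logits during evaluation

import unsloth  # imported after the flags are set
```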
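
And for the reinstated GGUF path, a hedged sketch following Unsloth's documented save API; the model and quantization method are illustrative.

```python
# Hedged sketch: GGUF conversion via llama.cpp quantization. Per this
# release, an Ollama Modelfile is created automatically alongside it.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct",  # illustrative model choice
    load_in_4bit=True,
)

model.save_pretrained_gguf(
    "llama-3.2-1b-gguf",
    tokenizer,
    quantization_method="q4_k_m",  # one of several supported quantizations
)
```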

🔧 Affected Symbols

`GptOssTopKRouter` · `Int4WeightOnlyConfig` · `UnslothFusedLossBackward` · `PretrainedConfig` · `PreTrainedConfig`