Changelog

October 2025

📦 unsloth
✨ 14 features · 🐛 24 fixes · 🔧 5 symbols

Summary

This release introduces major platform support, including Docker images and Blackwell/DGX hardware compatibility, alongside significant new features like Quantization-Aware Training (QAT) and extensive RL environment utilities.

Migration Steps

  1. Update Unsloth via `pip install --upgrade --force-reinstall --no-cache-dir --no-deps unsloth unsloth_zoo`.
  2. If you specifically want PyTorch 2.9, use: `pip install --upgrade unsloth unsloth_zoo`.

✨ New Features

  • Introduction of an official Unsloth Docker image for simplified training setup.
  • Support added for NVIDIA Blackwell and DGX Spark hardware.
  • Full support for Qwen3-VL models.
  • Support added for IBM Granite-4.0 models.
  • Introduction of Quantization-Aware Training (QAT) in collaboration with PyTorch, aiming to recover accuracy lost to quantization.
  • Support for OpenEnv to enable open Reinforcement Learning environments.
  • New customer support agent notebook demonstrating real-time analysis and training using Google Sheets data.
  • Support for Python 3.13, PyTorch 2.9, and the latest Hugging Face TRL and transformers libraries.
  • Save to TorchAO is now supported via `model.save_pretrained_torchao()` with configurations like `Int4WeightOnlyConfig()` (a hedged sketch follows this list).
  • New RL environment function `execute_with_time_limit` to enforce execution time limits on functions (a usage sketch follows this list).
  • New RL environment function `check_python_modules` to verify that a function uses only Python standard-library modules.
  • New RL environment function `create_locked_down_function` to create functions that cannot leak global variables.
  • New RL environment utility `Benchmarker` for accurate function benchmarking by wiping the L1 to L3 caches.
  • New RL environment function `launch_openenv` to launch a continuously reloaded OpenEnv environment process on an available port.
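
As a quick illustration of the new TorchAO export, here is a minimal sketch. Only `save_pretrained_torchao()` and `Int4WeightOnlyConfig()` are named in this release; the model choice, loading arguments, and the `torchao_config` keyword are assumptions for illustration.

```python
# Hedged sketch: exporting an Unsloth model with a TorchAO config.
# The `torchao_config` keyword name is an assumption; check the Unsloth
# docs for the exact signature.
from unsloth import FastLanguageModel
from torchao.quantization import Int4WeightOnlyConfig

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct",  # illustrative model choice
    load_in_4bit=False,  # quantize at save time instead of load time
)

model.save_pretrained_torchao(
    "llama-3.2-1b-torchao-int4",
    torchao_config=Int4WeightOnlyConfig(),  # int4 weight-only quantization
)
```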
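
Likewise, a hypothetical usage sketch for the RL environment helpers. The function names come from this release, but the import path and the decorator-style signature of `execute_with_time_limit` are assumptions.

```python
# Hedged sketch: guarding an RL reward function with the new helpers.
# Import path and call shapes are assumptions based on the release notes.
from unsloth import check_python_modules, execute_with_time_limit

@execute_with_time_limit(2)  # assumed decorator form: 2-second wall-clock cap
def reward_fn(completion: str) -> float:
    # Potentially long-running scoring of a model completion.
    return float(len(completion))

# Check that the reward function relies only on standard-library modules
# before running it inside a locked-down environment.
print(check_python_modules(reward_fn))

score = reward_fn("a sampled completion")  # fails fast past the time limit
```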

🐛 Bug Fixes

  • Fixed the standby VRAM consumption issue; GPU utilization is now auto-selected between 80% and 95% when `UNSLOTH_VLLM_STANDBY="1"` is set (see the snippet after this list).
  • Fixed GRPO training hangs with improved environment timers, ensuring compatibility with DGX Spark and other GPUs.
  • Fixed GRPO `RuntimeError: shape '[1, 887, 1, 128]' is invalid for input of size 3633152` for all models.
  • Fixed GPT-OSS BF16 issue where `GptOssTopKRouter` lacked the `weight` attribute when `load_in_4bit = True`.
  • Fixed Mistral training broken by a sentencepiece proto issue; Unsloth is now compatible with any protobuf version.
  • Fixed evaluation when `UNSLOTH_RETURN_LOGITS="1"` is set (addressing issues #3126, #3071).
  • Fixed `Output 0 of UnslothFusedLossBackward is a view and is being modified inplace.` error for Gemma 3 when using `transformers>=4.57.1`.
  • Addressed `ImportError: cannot import name '_Ink' from 'PIL._typing'`; affected users should upgrade Unsloth and switch to the updated notebooks.
  • Fixed loading models in 8-bit.
  • Fixed QAT configuration to use `Int8DynamicActivationIntxWeightConfig`.
  • Fixed Gemma 3 bugs.
  • Fixed Transformers compatibility issue due to rename from `PretrainedConfig` to `PreTrainedConfig` in v4.57.
  • Fixed evaluation metric issues.
  • Reinstated llama.cpp compatibility and GGUF conversion with multiple quantization methods, plus automated Ollama Modelfile creation (see the GGUF sketch after this list).
  • Added vLLM FP8 quantized support for SFT/GRPO.
  • Applied AMD fixes.
  • Fixed Transformers 4.57.1 compatibility issues.
  • Fixed GRPO bugs.
  • Normalized line endings to LF (Unix).
  • Fixed an out-of-resources issue for Llama 3.2 SFT on AMD GPUs.
  • Fixed a cross-entropy loss issue with small vocabulary sizes on AMD GPUs.
  • Fixed Gemma 3n issue.
  • Patched sleep mode properly for TRL.
  • Enabled Intel support for PyTorch 2.8.
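
For the two environment flags referenced above, a small sketch of how they might be enabled; the variable names come from this release, while setting them before importing unsloth (so they are read at import time) is an assumption.

```python
# Hedged sketch: enabling the standby and return-logits flags. Setting
# them before importing unsloth is an assumption about when they are read.
import os

os.environ["UNSLOTH_VLLM_STANDBY"] = "1"   # standby mode; ~80-95% GPU utilization auto-selected
os.environ["UNSLOTH_RETURN_LOGITS"] = "1"  # return logits during evaluation

import unsloth  # imported after the flags are set
```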
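
And for the reinstated GGUF path, a hedged sketch following Unsloth's documented save API; the model and quantization method are illustrative.

```python
# Hedged sketch: GGUF conversion via llama.cpp quantization. Per this
# release, an Ollama Modelfile is created automatically alongside it.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct",  # illustrative model choice
    load_in_4bit=True,
)

model.save_pretrained_gguf(
    "llama-3.2-1b-gguf",
    tokenizer,
    quantization_method="q4_k_m",  # one of several supported quantizations
)
```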

🔧 Affected Symbols

`GptOssTopKRouter` · `Int4WeightOnlyConfig` · `UnslothFusedLossBackward` · `PretrainedConfig` · `PreTrainedConfig`