Changelog

March 2026 · Breaking Changes

📦 unsloth · View on GitHub →

2 breaking · 18 features · 🐛 23 fixes · 🔧 3 symbols

Summary

This release introduces Unsloth Studio, a new open-source web UI for training and running LLMs locally, alongside significant performance improvements and broadened model and platform support, including Mixtral and ROCm.

⚠️ Breaking Changes

  • Removed the Blackwell flex-attention disable workaround from Studio. Users on Blackwell GPUs who relied on this workaround should verify their performance and configuration.
  • The `use_reentrant` parameter has been removed from all TRL trainer configs. Code that passes this parameter to TRL trainers must be updated.

Migration Steps

  1. If you use TRL trainers, remove any use of the `use_reentrant` parameter; it is no longer accepted.
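As a minimal, hypothetical sketch (the helper name and config shape are illustrative, not the Unsloth or TRL API), the migration amounts to dropping the removed key before constructing a trainer config:

```python
def migrate_trainer_config(config: dict) -> dict:
    """Drop keys removed in this release from a TRL-style trainer config dict.

    `use_reentrant` has been removed from all TRL trainer configs, so
    passing it through would now fail at construction time.
    """
    removed_keys = {"use_reentrant"}
    return {k: v for k, v in config.items() if k not in removed_keys}


# Illustrative config dict, not a real Unsloth/TRL object.
old_config = {"output_dir": "out", "gradient_checkpointing": True, "use_reentrant": False}
new_config = migrate_trainer_config(old_config)  # `use_reentrant` is stripped; the rest is preserved
```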

✨ New Features

  • Introduction of Unsloth Studio, a new open-source web UI for training and running LLMs locally on Mac, Windows, and Linux.
  • Ability to compare and battle models side-by-side in Unsloth Studio.
  • Support for training 500+ models 2x faster with 70% less VRAM.
  • Support for GGUF, vision, audio, and embedding models.
  • Self-healing tool calling, plus web search and code execution capabilities.
  • Auto-creation of datasets from PDF, CSV, and DOCX files.
  • Export models to GGUF, safetensors, and other formats.
  • Support for Sequence Classification.
  • VLM support for GRPO.
  • ROCm support added.
  • Mixtral model support added.
  • Refactored attention mechanism.
  • Studio setup speed improvements: uv for installs (8x faster) and Ninja for llama.cpp builds (1.7x faster).
  • Windows setup improvements.
  • Studio users can now upload evaluation datasets.
  • Improved AI Assist with updated default model, output parsing, logging, and dataset mapping UX.
  • Studio now supports per-model inference defaults, GGUF slider fix, and reasoning toggle.
  • Studio UI improvements including SVG preview, fix for streaming and model selector bugs, and better onboarding UX/tooltips.
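The per-model inference defaults can be pictured as a defaults-plus-overrides merge (a hypothetical sketch of the idea; all names and values here are illustrative, not Studio's implementation):

```python
GLOBAL_DEFAULTS = {"temperature": 0.7, "max_tokens": 512, "repetition_penalty": 1.0}

# Hypothetical per-model overrides, e.g. enabling the reasoning toggle
# for a model that supports it.
PER_MODEL_DEFAULTS = {
    "example-gguf-model": {"temperature": 0.6, "reasoning": True},
}


def inference_defaults(model_name: str) -> dict:
    """Return the global defaults merged with any per-model overrides."""
    return {**GLOBAL_DEFAULTS, **PER_MODEL_DEFAULTS.get(model_name, {})}


# Models without an entry simply fall back to the global defaults.
```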

🐛 Bug Fixes

  • Fixed handling of `llm_int8_skip_modules` so it is respected for VLMs.
  • Prevented RCE via untrusted Hugging Face repos in ai-assist model config.
  • Disabled remote code execution in seed inspect dataset loads.
  • Fixed data-designer plugin installation to be non-editable for Colab compatibility.
  • Patched VLM trainer to resize images correctly.
  • Fixed a Compare Mode deadlock and cancel-event poisoning, and optimized IPC in Studio.
  • Fixed GGUF inference issues in Studio involving reasoning tokens, max_tokens, server flags, and GPU allocation.
  • Limited GGUF chat to Mac devices only.
  • Added a max steps/epochs toggle switch to Studio's training configuration.
  • Fixed Colab plugin editable install issues.
  • Implemented graceful shutdown on Windows using signal handlers for Ctrl+C.
  • Fixed studio frontend build producing empty Tailwind CSS.
  • Fixed setup.sh crash on Mac when gitignore array is empty.
  • Fixed Ctrl+C not terminating backend process on Linux.
  • Fixed VLM GRPO matmul shape mismatch in `_get_per_token_logps_and_entropies`.
  • Resolved CUDA toolkit mismatch on multi-CUDA Windows systems.
  • Added Qwen3.5 version gate in loader dispatch.
  • Fixed xformers Blackwell guard with broader coverage and root cause documentation.
  • Fixed stale GGUF metadata, updated the helper model, and improved auth in Studio.
  • Studio now shows "Off" for repetition penalty = 1.
  • Studio updated Creative/Precise presets and shows "Off" for disabled samplers.
  • Fixed slow cancellation of GGUF generation in Studio.
  • Removed unused `warmupToastShown` variable (TS6133).
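The "Off" display for disabled samplers boils down to comparing a value against its neutral setting (a hypothetical sketch of the idea; the neutral values shown are common sampler conventions, not necessarily Studio's exact table):

```python
# Values at which a sampler has no effect, so the UI can show "Off".
NEUTRAL_VALUES = {"repetition_penalty": 1.0, "top_p": 1.0, "presence_penalty": 0.0}


def display_sampler(name: str, value: float) -> str:
    """Render a sampler value, showing 'Off' when it is at its neutral value."""
    if NEUTRAL_VALUES.get(name) == value:
        return "Off"
    return f"{value:g}"
```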

Affected Symbols