March 2026 — unsloth
⚠ 2 breaking · ✨ 18 features · 🐛 23 fixes · 🔧 3 symbols
Summary
This release introduces Unsloth Studio, a new open-source web UI for training and running LLMs locally, alongside significant performance improvements, broad model support including Mixtral, and ROCm support.
⚠️ Breaking Changes
- Removed Blackwell flex attention disable workaround from studio. Users on Blackwell GPUs might need to verify performance or configuration if they relied on this workaround.
- The `use_reentrant` parameter was removed from all TRL trainer configs. Code that relied on this parameter in TRL trainers may need updates.
Migration Steps
- If you use TRL trainers, remove any `use_reentrant` arguments from your trainer configs, as the parameter no longer exists.
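As a minimal migration sketch (the config dict contents below are hypothetical, standing in for whatever kwargs your code currently builds), dropping the removed parameter before constructing a trainer looks like:

```python
# Hypothetical example: a gradient-checkpointing kwargs dict that still
# carries the removed `use_reentrant` parameter.
old_kwargs = {"use_reentrant": False, "preserve_rng_state": True}

# Strip the removed key; all other settings pass through unchanged.
new_kwargs = {k: v for k, v in old_kwargs.items() if k != "use_reentrant"}

print(new_kwargs)  # {'preserve_rng_state': True}
```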
✨ New Features
- Introduction of Unsloth Studio, a new open-source web UI for training and running LLMs locally on Mac, Windows, and Linux.
- Ability to compare and battle models side-by-side in Unsloth Studio.
- Support for training 500+ models 2x faster with 70% less VRAM.
- Support for GGUF, vision, audio, and embedding models.
- Self-healing tool calling, plus web search and code execution capabilities.
- Auto-creation of datasets from PDF, CSV, and DOCX files.
- Export models to GGUF, Safetensors, and other formats.
- Support for Sequence Classification.
- VLM support for GRPO.
- ROCm support added.
- Mixtral model support added.
- Refactored the attention mechanism.
- Studio setup speed improvements: uv for installs (8x faster) and Ninja for llama.cpp builds (1.7x faster).
- Windows setup improvements.
- Studio users can now upload evaluation datasets.
- Improved AI Assist with updated default model, output parsing, logging, and dataset mapping UX.
- Studio now supports per-model inference defaults, a fixed GGUF slider, and a reasoning toggle.
- Studio UI improvements, including SVG preview, fixes for streaming and model selector bugs, and better onboarding UX and tooltips.
🐛 Bug Fixes
- `llm_int8_skip_modules` is now respected for VLMs.
- Prevented RCE via untrusted Hugging Face repos in ai-assist model config.
- Disabled remote code execution in seed inspect dataset loads.
- Fixed data-designer plugin installation to be non-editable for Colab compatibility.
- Patched VLM trainer to resize images correctly.
- Fixed a Compare Mode deadlock and cancel-event poisoning, and optimized IPC in Studio.
- Fixed GGUF inference issues in Studio involving reasoning tokens, max_tokens, server flags, and GPU allocation.
- GGUF chat is now limited to Mac devices.
- Added a max steps and epochs toggle switch to the Studio training configuration.
- Fixed Colab plugin editable install issues.
- Implemented graceful shutdown on Windows using signal handlers for Ctrl+C.
- Fixed studio frontend build producing empty Tailwind CSS.
- Fixed setup.sh crash on Mac when gitignore array is empty.
- Fixed Ctrl+C not terminating backend process on Linux.
- Fixed VLM GRPO matmul shape mismatch in `_get_per_token_logps_and_entropies`.
- Resolved CUDA toolkit mismatch on multi-CUDA Windows systems.
- Added Qwen3.5 version gate in loader dispatch.
- Fixed xformers Blackwell guard with broader coverage and root cause documentation.
- Fixed stale GGUF metadata, updated helper model, and improved auth in studio.
- Studio now shows "Off" for repetition penalty = 1.
- Studio updated Creative/Precise presets and shows "Off" for disabled samplers.
- Fixed slow cancellation of GGUF generation in studio.
- Removed unused `warmupToastShown` variable (TS6133).
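The "Off" display convention behind the two sampler fixes above can be sketched as follows (the function name is illustrative, not Studio's actual code):

```python
def display_repetition_penalty(value: float) -> str:
    """Render a repetition-penalty value for a UI.

    A penalty of 1.0 leaves logits unchanged, i.e. the sampler is disabled,
    so it is shown as "Off". Illustrative sketch only, not the actual
    Studio implementation.
    """
    return "Off" if value == 1.0 else f"{value:.2f}"

print(display_repetition_penalty(1.0))   # Off
print(display_repetition_penalty(1.15))  # 1.15
```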