
v0.1.2-beta

📦 unsloth
✨ 19 features · 🐛 13 fixes · 🔧 6 symbols

Summary

This release focuses heavily on Unsloth Studio improvements, including major performance gains via pre-compiled binaries, enhanced tool calling, and robust fixes for Windows and Colab environments. Installation sizes have been significantly reduced, and UX features like persistent settings and multi-file uploads have been added.

Migration Steps

  1. To update Unsloth Studio, use the command: `unsloth studio update`.
  2. If you are on Windows, please reinstall to ensure seamless CPU or GPU operation.
  3. For fresh installs, a new one-line command is available: `curl -fsSL https://unsloth.ai/install.sh | sh`.
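Put together, the migration paths above look like the following script fragment (commands taken from the steps above; downloading the installer to a file before running it is an optional precaution, not part of the official instructions):

```shell
# Fresh install: one-line installer from the release notes.
# Saving it first lets you inspect the script before executing it.
curl -fsSL https://unsloth.ai/install.sh -o install.sh
sh install.sh

# Existing installs: update Unsloth Studio in place.
unsloth studio update
```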

✨ New Features

  • Tool calling improved with better llama.cpp parsing, removal of raw tool markup in chat, faster inference, a new Tool Outputs panel, and timers.
  • Unsloth Studio now works seamlessly on Windows (CPU or GPU).
  • App shortcuts added for launching Unsloth Studio on Windows, macOS, and Linux.
  • Pre-compiled llama.cpp binaries and mamba_ssm are now included, resulting in 6x faster installs and smaller binary sizes (<300MB).
  • Installation sizes reduced by 50% (7GB+ savings), leading to 2x faster installs and faster dependency resolving.
  • Colab support with free T4 GPUs is fixed and now 20x faster due to pre-compiled binaries.
  • Older GGUFs from Hugging Face or LM Studio now load correctly.
  • Data Recipes are now enabled on macOS and CPU, with multi-file upload support.
  • Preliminary AMD support added for Linux machines (auto-detected).
  • Settings sidebar redesigned, grouping settings into Model, Sampling, Tools, and Preferences.
  • Context length is now adjustable; by default, llama.cpp automatically fits the exact context via `--fit on`.
  • Custom system prompts and chat presets now persist across reloads and page changes.
  • Data recipes support multi-file drag-and-drop uploads (PDF, DOCX, TXT, MD) with backend extraction, saved uploads, and improved previews.
  • Improved chat observability: Studio now shows llama-server timings/usage, a context-window usage bar, and richer source hover cards.
  • Better UX overall, including clickable links, better LaTeX parsing, and tool/code/web tooltips for default cards.
  • Training history persistence and a past runs viewer added to Studio.
  • Added the Q-GaLore optimizer and support for a custom embedding learning rate.
  • Diagnostic helper `get_tokenizer_info()` added.
  • ROCm (AMD GPU) support added to studio setup.
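The custom embedding learning rate listed above is typically implemented by splitting model parameters into separate optimizer groups, each with its own learning rate. A minimal pure-Python sketch of that pattern (the helper name and the `embed`/`lm_head` name-matching rule are illustrative assumptions, not Unsloth's actual implementation):

```python
def build_param_groups(named_params, lr, embedding_lr):
    """Split parameters into embedding vs. other groups, each with
    its own learning rate (hypothetical helper for illustration)."""
    embed, other = [], []
    for name, param in named_params:
        # Assumed heuristic: embedding and output-head weights get
        # the separate embedding learning rate.
        if "embed" in name or "lm_head" in name:
            embed.append(param)
        else:
            other.append(param)
    return [
        {"params": other, "lr": lr},
        {"params": embed, "lr": embedding_lr},
    ]

# Toy stand-ins for (name, tensor) pairs from model.named_parameters():
groups = build_param_groups(
    [("model.embed_tokens.weight", "E"), ("model.layers.0.mlp.weight", "W")],
    lr=2e-4,
    embedding_lr=5e-5,
)
```

In a real setup the returned list would be passed straight to the optimizer constructor (e.g. `AdamW(groups)`), which is the standard way per-group learning rates are expressed in PyTorch-style APIs.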

🐛 Bug Fixes

  • Fixed silent Windows exits, Anaconda/conda-forge startup crashes, broken non-NVIDIA Windows installs, and missing early CUDA/stale-venv setup checks.
  • System prompts now work again for non-GGUF text and vision inference.
  • GGUF export now supports full fine-tunes (not just LoRA/PEFT), base model resolution is more reliable, and unsupported export options are disabled in the UI.
  • Fixed scroll-position issues during chat generation, thinking-panel layout shift, and viewport jumps when collapsing reasoning panels.
  • Studio now detects loopback port conflicts, identifies the blocking process when possible, and provides clearer fallback-port messages.
  • Fixed UnicodeEncodeError on Windows CP1252 consoles during studio setup.
  • Removed the automatic `wandb.finish()` after `train()` to allow post-training `evaluate()`.
  • Training arguments now correctly store `embedding_learning_rate` on `self` in `UnslothTrainingArguments`.
  • System prompt and preset settings now persist across navigation.
  • Chat tool icons are always shown.
  • Fixed the system prompt being ignored during Unsloth inference.
  • Handled prompt/completion datasets in slow-path BOS detection.
  • Removed quarantined litellm dependency as a precaution (Unsloth Studio was not affected by the compromise).
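The `embedding_learning_rate` fix above boils down to making sure the extra argument is actually stored on the instance so later code (e.g. optimizer setup) can read it. A simplified dataclass sketch of the pattern (class names here are illustrative stand-ins, not the real `UnslothTrainingArguments` definition):

```python
from dataclasses import dataclass


@dataclass
class TrainingArgs:  # stand-in for the base arguments class
    learning_rate: float = 2e-4


@dataclass
class UnslothArgs(TrainingArgs):  # stand-in for UnslothTrainingArguments
    # The bug pattern: an extra argument is accepted but never kept on
    # self, so downstream code cannot read it. Declaring it as a
    # dataclass field stores it on the instance automatically.
    embedding_learning_rate: float = 5e-5


args = UnslothArgs(learning_rate=1e-4)
```

With the field stored on `self`, `args.embedding_learning_rate` is available wherever the arguments object is passed, which is what the fix restores.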

Affected Symbols