v0.1.39-beta
📦 unslothView on GitHub →
✨ 10 features🐛 24 fixes🔧 4 symbols
Summary
This release introduces powerful local LLM integration via a self-hosted API endpoint supporting advanced tooling, code execution, and web search. Numerous bug fixes were applied across the Studio UI, training stability (DPO hangs), and installation scripts.
Migration Steps
- Update using `2026.5.2` or directly call `curl -fsSL https://unsloth.ai/install.sh | sh` or `unsloth studio update` to resolve chat history/attachment bugs.
- Patch checkpoint reload init functions to strip unsupported arguments if resuming training.
- If using Studio, note that the default host is now 127.0.0.1 and it prompts before auto-start.
- Use `unsloth studio run --forward-args` or similar mechanisms to pass llama-server arguments if customizing server startup.
- Use `model_name:quantization_type` syntax when loading models via Studio run commands.
✨ New Features
- Local LLMs (like Claude Code, Codex, Qwen, Gemma) can now be run via Unsloth's API endpoint, enabling features like self-healing tool calling, code execution (Bash/Python), and advanced web search.
- Unsloth API endpoint exposes models via `llama-server` speaking Anthropic-compatible `/v1/messages` and OpenAI-compatible `/v1/chat/completions` and `/v1/responses` dialects.
- Added support for new models: NVIDIA Nemotron 3 Nano Omni, IBM Granite 4.1, and Mistral 3.5 Medium.
- Added support for Qwen3.6.
- Studio: Added dataset upload dropzone.
- Studio: Enabled deleting fine-tuned chat models.
- Studio: Added checkpoint resume functionality for stopped training runs.
- Studio: Default host set to 127.0.0.1 and prompts before auto-start.
- Studio: Added ability to forward llama-server arguments from `unsloth studio run` and allow passing model:quant to load models.
- unsloth run: Added --enable-tools/--disable-tools server-side tool policy.
🐛 Bug Fixes
- Fixed chat history not being shown (existing history is preserved).
- Fixed attachments not attaching correctly (render-only bug).
- Stopped Studio training runs can now resume from checkpoints.
- Chat threads now autosave and persist more reliably.
- Fixed DPO training hangs in multi-process setups.
- Improved VLM GRPO support with MROPE updates.
- Studio's stop button now properly stops generation.
- Fixed chat template disappearing after browser refresh.
- Fixed issue where Studio used max seq length instead of (gguf) context length.
- Fixed typo cleanup across tests and backend strings.
- Guarded resolve_model_class fallback against unresolvable transformers AutoModel entries.
- Studio: Kills in-flight llama-server before spawning a new one.
- Studio: Stopped currency escape from breaking inline LaTeX.
- Studio: Probed AMD GPUs in llama-server VRAM detection.
- Fixed issue where mmproj F16 variant selection used incorrect logic.
- Fixed Windows install issues when paths contain spaces or Python 3.14 is on PATH.
- Studio: Preserved transparency in uploaded profile avatars.
- Fixed UX issue with single chat header error placement and selector alignment.
- Studio: Fixed clipped model selector text descenders.
- Fixed DPO trainer multi-process hang.
- Fixed local model scanner handling of ollama cloud models.
- Fixed Studio desktop tray installer and titlebar issues.
- Fixed check for libcurl headers in install.sh.
- Fixed image-only chat requests failing validation.