2025-02

📦 unsloth
✨ 7 features · 🐛 9 fixes · 🔧 7 symbols

Summary

This release adds GRPO training support, with LoRA (16-bit) and QLoRA (4-bit) working for GRPO on models including Phi-4 14B, Llama-3.1 8B, and Llama 3.3 70B. It also integrates vLLM for fast inference, with up to 20x throughput. Bug fixes cover Gemma 2, the Mistral base model mapping, Granite support, and flash-attention detection.

Migration Steps

  1. To use the new fast inference feature, install vLLM: `pip install vllm`.
  2. When loading the model, set `fast_inference = True`: `FastLanguageModel.from_pretrained(..., fast_inference = True)`.
  3. Upgrade Unsloth and related dependencies using: `pip install --upgrade --no-cache-dir --force-reinstall unsloth_zoo unsloth vllm`.
  4. To run the GRPO training examples, install `trl` from its GitHub repository: `pip install git+https://github.com/huggingface/trl.git`.
  5. If using GRPO, patch Unsloth before training: `from unsloth import FastLanguageModel, PatchFastRL`, then `PatchFastRL("GRPO", FastLanguageModel)`.
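
Taken together, the steps above might look like the following minimal sketch. The model name, sequence length, LoRA rank, `gpu_memory_utilization` value, and sampling settings are illustrative assumptions, not values prescribed by this release, and the helper names `load_for_grpo` and `generate_once` are hypothetical. The heavy imports are deferred into the functions because they need a CUDA GPU with `unsloth` and `vllm` installed.

```python
# Illustrative settings: every value here is an assumption, adjust for your hardware.
LOAD_KWARGS = {
    "model_name": "meta-llama/meta-Llama-3.1-8B-Instruct",
    "max_seq_length": 512,
    "load_in_4bit": True,           # QLoRA (4-bit); set False for 16-bit LoRA
    "fast_inference": True,         # enable the vLLM-backed inference path
    "max_lora_rank": 32,
    "gpu_memory_utilization": 0.6,  # lower this if you run out of memory
}


def load_for_grpo(load_kwargs=None):
    """Patch Unsloth for GRPO, then load the model with fast inference enabled."""
    # Imported lazily: requires a GPU and the packages from the install steps.
    from unsloth import FastLanguageModel, PatchFastRL

    # The Unsloth examples apply the patch right after import, before loading.
    PatchFastRL("GRPO", FastLanguageModel)
    kwargs = dict(LOAD_KWARGS if load_kwargs is None else load_kwargs)
    return FastLanguageModel.from_pretrained(**kwargs)  # -> (model, tokenizer)


def generate_once(model, prompt, max_tokens=256):
    """Generate one completion through the vLLM-backed `model.fast_generate`."""
    from vllm import SamplingParams

    params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=max_tokens)
    outputs = model.fast_generate([prompt], sampling_params=params)
    return outputs[0].outputs[0].text
```

On a suitable machine, usage would be `model, tokenizer = load_for_grpo()` followed by `generate_once(model, "Hello")`.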

✨ New Features

  • Introduced support for GRPO (Group Relative Policy Optimization) training.
  • LoRA (16-bit) and QLoRA (4-bit) now work with GRPO.
  • Enabled GRPO training for Phi-4 14B and Llama-3.1 8B on consumer GPUs (e.g., 15GB Colab).
  • Added native fast inference (up to 20x throughput) via vLLM, accessible via `model.fast_generate` after setting `fast_inference = True` during model loading.
  • Support for Llama 3.3 70B QLoRA GRPO training on 1x 48GB/80GB GPUs.
  • Added `use_exact_model_name` option to prevent automatic model name modification.
  • Support for modelscope models and datasets.
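
For readers new to GRPO: as it is usually described, the method samples a group of completions for each prompt, scores each one with a reward function, and normalizes the rewards within the group rather than training a separate value model. The sketch below illustrates only that normalization step in plain Python. It is not Unsloth's or TRL's implementation; the function name, the use of the population standard deviation, and the epsilon value are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize the rewards of one prompt's completions within their group.

    Each advantage is (reward - group mean) / (group std + eps), so completions
    that beat the group average get a positive advantage and the rest negative.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions sampled for the same prompt, scored 1.0 (good) or 0.0 (bad).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

When every completion in a group receives the same reward, all advantages are zero, so that group contributes no learning signal.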

🐛 Bug Fixes

  • Fixed issues related to Gemma 2 models.
  • Fixed Mistral base model mapping.
  • Resolved several syntax warning issues.
  • Improved debugging experience.
  • Changed model handling to use `base_model` if a PEFT model is already in use.
  • Fixed attention refactoring issues.
  • Updated Granite support to work with the latest `post_patch` methods.
  • Applied minor fixes for Granite models.
  • Fixed `flash_attn_detection_error`.

🔧 Affected Symbols

  • `FastLanguageModel.from_pretrained`
  • `model.fast_generate`
  • `PatchFastRL`
  • `GRPOConfig`
  • `GRPOTrainer`
  • `trl.GRPOConfig`
  • `trl.GRPOTrainer`
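
As a rough illustration of how `GRPOConfig` and `GRPOTrainer` fit together with a reward function, here is a minimal sketch. It assumes a dataset with a plain-text `prompt` column, so that completions reach the reward function as strings. The reward rule, the helper name `build_trainer`, and all hyperparameter values are hypothetical. Argument names follow TRL as of this release and may differ in other versions.

```python
import re

ANSWER_PATTERN = re.compile(r"<answer>.*?</answer>", re.DOTALL)


def format_reward(completions, **kwargs):
    """Return 1.0 for each completion containing an <answer>...</answer> block."""
    # TRL passes prompts and extra dataset columns as keyword arguments;
    # this illustrative reward ignores them.
    return [1.0 if ANSWER_PATTERN.search(text) else 0.0 for text in completions]


def build_trainer(model, tokenizer, train_dataset):
    """Wire a patched Unsloth model into a GRPO trainer (values are assumptions)."""
    # Imported lazily: requires `trl` and a GPU environment.
    from trl import GRPOConfig, GRPOTrainer

    args = GRPOConfig(
        output_dir="outputs",
        num_generations=4,          # completions sampled per prompt
        max_completion_length=200,
        max_steps=100,
        use_vllm=True,              # generate rollouts through vLLM
    )
    return GRPOTrainer(
        model=model,
        processing_class=tokenizer,
        reward_funcs=[format_reward],
        args=args,
        train_dataset=train_dataset,
    )
```

The reward function can be checked on its own before any training run, for example by calling `format_reward(["<answer>4</answer>", "no tags"])`.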