
v0.15.0

📦 peft · View on GitHub →
✨ 6 features · 🐛 12 fixes · ⚡ 1 deprecation · 🔧 8 symbols

Summary

This release introduces significant new features, including CorDA initialization for LoRA and the Trainable Tokens tuner, alongside enhancements to LoRA targeting and hotswapping. It also deprecates PEFT_TYPE_TO_MODEL_MAPPING and replaces AutoGPTQ support with GPTQModel.

Migration Steps

  1. If relying on PEFT_TYPE_TO_MODEL_MAPPING, update code to use PEFT_TYPE_TO_TUNER_MAPPING instead.
  2. If using LoRA to target multihead attention modules, note that only modules with _qkv_same_embed_dim=True are currently supported.
  3. If using hotswapping with dynamic rank/alpha changes, call prepare_model_for_compiled_hotswap() before compiling the model (see the hotswapping sketch after this list).
  4. If using AutoGPTQ, migrate to GPTQModel.
  5. To use pattern matching in rank_pattern/alpha_pattern to target full module paths, prefix the pattern with a caret (^), e.g., '^foo' (see the rank_pattern sketch after this list).
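
A minimal sketch of step 3, assuming prepare_model_for_compiled_hotswap and hotswap_adapter live in peft.utils.hotswap and that target_rank is the keyword used to pad ranks; the adapter paths are placeholders, so check the hotswapping docs for the exact signatures:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel
from peft.utils.hotswap import hotswap_adapter, prepare_model_for_compiled_hotswap

base = AutoModelForCausalLM.from_pretrained("gpt2")  # any base model
# "adapter-a" / "adapter-b" are placeholder paths to two LoRA checkpoints.
model = PeftModel.from_pretrained(base, "adapter-a")

# Pad LoRA weights to a common rank and turn scalings into tensors so that a
# later adapter with a different rank/alpha does not force a recompile.
# (target_rank= is an assumed keyword argument; consult the hotswapping docs.)
prepare_model_for_compiled_hotswap(model, target_rank=64)

model = torch.compile(model)
# ... run inference with adapter-a ...

# Swap in the second adapter in place, without torch.compile recompilation.
hotswap_adapter(model, "adapter-b", adapter_name="default")
```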
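
And a sketch of step 5; the module name foo is a placeholder used only for illustration:

```python
from peft import LoraConfig

# Without the caret, "foo" matches any module whose name ends in "foo".
# With "^foo", the pattern is anchored at the start of the full module path,
# so only modules under the top-level "foo" submodule get the overrides.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj", "foo"],  # placeholder module names
    rank_pattern={"^foo": 32},    # rank 32 only for paths starting with "foo"
    alpha_pattern={"^foo": 64},   # alpha 64 for the same modules
)
```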

✨ New Features

  • Introduced CorDA (Context-Oriented Decomposition Adaptation) initialization method for LoRA, supporting knowledge-preservation and instruction-preservation modes.
  • Added the Trainable Tokens tuner for selectively training individual token embeddings, offering memory efficiency and smaller checkpoints; usable standalone or combined with LoRA (see the Trainable Tokens sketch after this list).
  • LoRA now supports targeting multihead attention modules (only those with _qkv_same_embed_dim=True).
  • Hotswapping now supports different alpha scalings and ranks without recompilation if the model is prepared via prepare_model_for_compiled_hotswap().
  • Added support for GPTQModel as a replacement for the unmaintained AutoGPTQ.
  • The 'all-linear' option for target_modules now also works for custom (non-transformers) models (see the 'all-linear' sketch after this list).
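
A minimal sketch of the Trainable Tokens tuner used standalone; the argument names and token indices below are taken as assumptions from the feature description, so verify them against the Trainable Tokens docs:

```python
from transformers import AutoModelForCausalLM
from peft import TrainableTokensConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Only the listed token embeddings are trained instead of the full embedding
# matrix, which keeps memory usage low and the checkpoint small.
# target_modules / token_indices names and values are illustrative.
config = TrainableTokensConfig(
    target_modules=["embed_tokens"],  # the model's input embedding layer
    token_indices=[0, 1, 2],          # placeholder token ids to make trainable
)
peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()
```

For combined use with LoRA, the PEFT docs describe a trainable_token_indices option on LoraConfig that wires the same mechanism into a LoRA run.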
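
And a sketch of the 'all-linear' option on a plain torch module; the MLP class here is made up for illustration:

```python
import torch.nn as nn
from peft import LoraConfig, get_peft_model

class MLP(nn.Module):  # a plain, non-transformers model
    def __init__(self):
        super().__init__()
        self.lin1 = nn.Linear(32, 64)
        self.act = nn.ReLU()
        self.lin2 = nn.Linear(64, 2)

    def forward(self, x):
        return self.lin2(self.act(self.lin1(x)))

# "all-linear" now resolves to the model's nn.Linear layers even outside
# transformers models; previously this shortcut required a transformers model.
config = LoraConfig(target_modules="all-linear")
peft_model = get_peft_model(MLP(), config)
peft_model.print_trainable_parameters()
```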

🐛 Bug Fixes

  • Fixed a bug where non-linear layers could be selected when using 'all-linear' if they shared a name substring with a linear layer.
  • Fixed an issue where modules_to_save keys could wrongly match parts of the state dict if the key was a substring of another key (e.g., 'classifier' matching 'classifier2').
  • Fixed device compatibility issues for BOFT forward/merging.
  • Added a warning when the chosen adapter_name conflicts with the tuner prefix.
  • Fixed adaption prompt errors following changes in transformers #35235.
  • Fixed low_cpu_mem_usage=True compatibility with 8bit bitsandbytes.
  • Fixed memory consumption for CorDA and improved related documentation.
  • Fixed generating with mixed adapter batches and beam search enabled.
  • Avoided needless copy from modules_to_save.
  • Fixed Prefix tuning tests with rotary embedding on multi-GPU.
  • Fixed package checks for torchao and EETQ.
  • Fixed missing attributes in MultiheadAttention.

🔧 Affected Symbols

LoraConfig, prepare_model_for_compiled_hotswap, GPTQModel, AutoGPTQ, PEFT_TYPE_TO_MODEL_MAPPING, PEFT_TYPE_TO_TUNER_MAPPING, modules_to_save, MultiheadAttention

⚡ Deprecations

  • PEFT_TYPE_TO_MODEL_MAPPING is deprecated and should be replaced by PEFT_TYPE_TO_TUNER_MAPPING (see the sketch below).
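
A minimal before/after sketch of the replacement; the exact import location of PEFT_TYPE_TO_TUNER_MAPPING is an assumption (shown here at the package root), so verify it against your PEFT version:

```python
from peft import PeftType

# Before (deprecated in this release; previously lived in peft.peft_model):
# from peft.peft_model import PEFT_TYPE_TO_MODEL_MAPPING
# tuner_cls = PEFT_TYPE_TO_MODEL_MAPPING[PeftType.LORA]

# After (import path assumed):
from peft import PEFT_TYPE_TO_TUNER_MAPPING
tuner_cls = PEFT_TYPE_TO_TUNER_MAPPING[PeftType.LORA]  # e.g. the LoRA tuner class
```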