v0.15.0
📦 peft
✨ 6 features · 🐛 12 fixes · ⚡ 1 deprecation · 🔧 8 symbols
Summary
This release introduces significant new features, including CorDA initialization for LoRA and the Trainable Tokens tuner, alongside enhancements to LoRA targeting and hotswapping. It also deprecates PEFT_TYPE_TO_MODEL_MAPPING and replaces AutoGPTQ support with GPTQModel.
Migration Steps
- If relying on PEFT_TYPE_TO_MODEL_MAPPING, update code to use PEFT_TYPE_TO_TUNER_MAPPING instead (see the mapping sketch below).
- If using LoRA with multihead attention modules, note that only modules with _qkv_same_embed_dim=True are currently supported.
- If hotswapping adapters with different ranks or alpha scalings, call prepare_model_for_compiled_hotswap() before compiling the model (see the hotswapping sketch below).
- If using AutoGPTQ, migrate to GPTQModel (see the GPTQModel sketch below).
- To make rank_pattern/alpha_pattern match the full module path, prefix the pattern with a caret (^), e.g. '^foo' (see the rank_pattern sketch below).
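
For the mapping migration, a minimal before/after lookup sketch; the exact import locations of the two mappings are an assumption and may differ slightly between PEFT versions.

```python
from peft import PeftType
from peft.mapping import PEFT_TYPE_TO_TUNER_MAPPING  # import path assumed; may vary by version

# Old (deprecated):
# from peft.peft_model import PEFT_TYPE_TO_MODEL_MAPPING
# tuner_cls = PEFT_TYPE_TO_MODEL_MAPPING[PeftType.LORA]

# New: look up the tuner class (e.g. LoraModel) by PEFT type.
tuner_cls = PEFT_TYPE_TO_TUNER_MAPPING[PeftType.LORA]
print(tuner_cls)
```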
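
For the hotswapping step, a minimal sketch assuming two LoRA adapters saved at placeholder paths and the hotswap_adapter helper from peft.utils.hotswap; target_rank=16 is a placeholder for the largest rank among your adapters.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel
from peft.utils.hotswap import hotswap_adapter, prepare_model_for_compiled_hotswap

base = AutoModelForCausalLM.from_pretrained("path/to/base-model")  # placeholder
model = PeftModel.from_pretrained(base, "path/to/adapter-0")       # placeholder

# Pad LoRA weights and scalings so adapters with different rank/alpha fit later
# without triggering recompilation. This must happen before torch.compile.
prepare_model_for_compiled_hotswap(model, target_rank=16)
model = torch.compile(model)

# Swap the second adapter into the compiled model in place.
hotswap_adapter(model, "path/to/adapter-1", adapter_name="default")  # placeholder
```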
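
For the AutoGPTQ migration, a minimal sketch assuming a GPTQ-quantized checkpoint at a placeholder path and the gptqmodel package installed; the PEFT side of the code does not change.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# With gptqmodel installed, transformers loads the GPTQ checkpoint through it
# instead of the unmaintained AutoGPTQ.
model = AutoModelForCausalLM.from_pretrained("path/to/gptq-quantized-model", device_map="auto")

# Illustrative LoRA config; choose target_modules to match your architecture.
config = LoraConfig(task_type="CAUSAL_LM", target_modules=["q_proj", "v_proj"])
peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()
```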
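
For the pattern change, a minimal sketch with illustrative module names ("foo" at the top level, "bar.foo" nested) showing how the caret anchors rank_pattern/alpha_pattern at the start of the full module path.

```python
from peft import LoraConfig

config = LoraConfig(
    r=8,                      # default rank for targeted modules
    target_modules=["foo"],   # suffix match: targets both "foo" and "bar.foo"
    # A bare "foo" key would also match the nested "bar.foo"; "^foo" anchors
    # the match at the start of the path, so only the top-level "foo"
    # gets rank 16 and alpha 32.
    rank_pattern={"^foo": 16},
    alpha_pattern={"^foo": 32},
)
```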
✨ New Features
- Introduced the CorDA (Context-oriented Decomposition Adaptation) initialization method for LoRA, supporting knowledge-preservation and instruction-preservation modes (see the CorDA sketch after this list).
- Added the Trainable Tokens tuner for selectively training individual tokens, offering memory efficiency and smaller checkpoints; usable standalone or combined with LoRA (see the Trainable Tokens sketch after this list).
- LoRA now supports targeting PyTorch nn.MultiheadAttention modules (currently only those with _qkv_same_embed_dim=True); see the multihead attention sketch after this list.
- Hotswapping now supports different alpha scalings and ranks without recompilation if the model is prepared via prepare_model_for_compiled_hotswap().
- Added support for GPTQModel as a replacement for the unmaintained AutoGPTQ.
- The 'all-linear' option for target_modules now also works for custom (non-transformers) models (see the 'all-linear' sketch after this list).
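
A minimal sketch of CorDA initialization; the CordaConfig and preprocess_corda helpers, their import paths, and the synthetic calibration pass are assumptions based on the PEFT LoRA documentation, and the model path is a placeholder.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
from peft.tuners.lora.config import CordaConfig      # import path assumed
from peft.tuners.lora.corda import preprocess_corda  # import path assumed

model = AutoModelForCausalLM.from_pretrained("path/to/base-model")  # placeholder

@torch.no_grad()
def run_model():
    # Tiny synthetic calibration pass so CorDA can collect covariance statistics;
    # use real calibration data in practice.
    input_ids = torch.randint(0, model.config.vocab_size, (4, 32))
    model(input_ids=input_ids)

corda_config = CordaConfig(corda_method="kpm")  # "kpm"/"ipm" select the two modes above
lora_config = LoraConfig(init_lora_weights="corda", corda_config=corda_config)
preprocess_corda(model, lora_config, run_model=run_model)
peft_model = get_peft_model(model, lora_config)
```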
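
A minimal sketch of the Trainable Tokens tuner; the TrainableTokensConfig arguments, the trainable_token_indices option on LoraConfig, the embedding module name "embed_tokens", and the token indices are assumptions/placeholders.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TrainableTokensConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("path/to/base-model")  # placeholder

# Standalone: only the listed embedding rows are trained.
config = TrainableTokensConfig(target_modules=["embed_tokens"], token_indices=[128001, 128002])
peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()

# Combined with LoRA: train the same token rows alongside the LoRA adapter.
# lora_config = LoraConfig(trainable_token_indices={"embed_tokens": [128001, 128002]})
```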
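
A minimal sketch of targeting nn.MultiheadAttention with LoRA on a toy model; the module satisfies _qkv_same_embed_dim=True because query, key, and value share one embedding dimension (PyTorch's default).

```python
import torch.nn as nn
from peft import LoraConfig, get_peft_model

class AttnBlock(nn.Module):
    def __init__(self):
        super().__init__()
        # embed_dim == kdim == vdim, so _qkv_same_embed_dim is True.
        self.mha = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
        self.out = nn.Linear(64, 2)

    def forward(self, x):
        attn_out, _ = self.mha(x, x, x)
        return self.out(attn_out)

config = LoraConfig(r=8, target_modules=["mha"])
peft_model = get_peft_model(AttnBlock(), config)
peft_model.print_trainable_parameters()
```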
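
A minimal sketch of 'all-linear' on a plain PyTorch model with no transformers dependency; the linear layers are discovered automatically instead of being listed by name.

```python
import torch.nn as nn
from peft import LoraConfig, get_peft_model

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(32, 64)
        self.act = nn.ReLU()
        self.fc2 = nn.Linear(64, 2)

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))

# "all-linear" now also works for custom models, targeting the nn.Linear layers.
config = LoraConfig(r=8, target_modules="all-linear")
peft_model = get_peft_model(MLP(), config)
peft_model.print_trainable_parameters()
```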
🐛 Bug Fixes
- Fixed a bug where non-linear layers could be selected when using 'all-linear' if they shared a name substring with a linear layer.
- Fixed an issue where modules_to_save keys could wrongly match parts of the state dict if the key was a substring of another key (e.g., 'classifier' matching 'classifier2').
- Fixed device compatibility issues for BOFT forward/merging.
- Added a warning for adapter_name conflicts with the tuner.
- Fixed adaption prompt errors following changes in transformers #35235.
- Fixed low_cpu_mem_usage=True compatibility with 8bit bitsandbytes.
- Fixed memory consumption for CorDA and improved related documentation.
- Fixed generation with mixed adapter batches when beam search is enabled.
- Avoided a needless copy from modules_to_save.
- Fixed Prefix tuning tests with rotary embedding on multi-GPU.
- Fixed package checks for torchao and EETQ.
- Fixed missing attributes in MultiheadAttention.
🔧 Affected Symbols
LoraConfig, prepare_model_for_compiled_hotswap, GPTQModel, AutoGPTQ, PEFT_TYPE_TO_MODEL_MAPPING, PEFT_TYPE_TO_TUNER_MAPPING, modules_to_save, MultiheadAttention
⚡ Deprecations
- PEFT_TYPE_TO_MODEL_MAPPING is deprecated and should be replaced by PEFT_TYPE_TO_TUNER_MAPPING.