v5.0.0
Breaking Changes · 📦 transformers
⚠ 2 breaking · ✨ 4 features · 🐛 1 fix · 🔧 11 symbols
Summary
Transformers v5 is the first major release in five years. It introduces significant API refactors, including dynamic weight loading through the new WeightConverter class and a simplified tokenizer architecture that consolidates the slow and fast implementations. Minor releases will now ship weekly.
⚠️ Breaking Changes
- The separation between "slow" (Python-based) and "fast" (Rust-based) tokenizers has been removed. Tokenizer implementations are now consolidated into a single file per model and use the most appropriate backend (TokenizersBackend preferred). Code that depends on internals of the old slow/fast implementations may need to be updated.
- Weight loading has been significantly reworked around the new WeightConverter class, which replaces the previous methods for checkpoint manipulation.
Migration Steps
- Check the continuously updated migration guide available at `https://github.com/huggingface/transformers/blob/main/MIGRATION_GUIDE_V5.md` if facing issues after upgrading.
- Review code that previously relied on separate slow/fast tokenizer implementations, as they are now consolidated under a single structure.
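While migrating, it can help to gate version-specific code paths on the installed major version. The following is a minimal sketch using only the Python standard library; the helper names (`major_version`, `installed_major`, `uses_consolidated_tokenizers`) are hypothetical and not part of transformers.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading major number of a version string such as '5.0.0'."""
    return int(version.lstrip("v").split(".")[0])


def installed_major(package: str = "transformers") -> Optional[int]:
    """Major version of an installed package, or None if it is not installed."""
    try:
        return major_version(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


def uses_consolidated_tokenizers(version: str) -> bool:
    """Assumption from these notes: slow/fast tokenizers are merged from v5 on."""
    return major_version(version) >= 5


# Example: decide which code path to take for a given version string.
for candidate in ("4.57.1", "5.0.0"):
    path = "consolidated" if uses_consolidated_tokenizers(candidate) else "slow/fast"
    print(f"{candidate}: {path} tokenizer code path")
```

In a real project, `installed_major()` could feed the same check so that code relying on the old slow/fast split only runs on v4 installs.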
✨ New Features
- Introduction of a new dynamic weight loading API centered around the `WeightConverter` class, enabling operations like reshaping, merging, and splitting layers during checkpoint loading (useful for quantization/parallelism).
- Refactored tokenizer definition to be more intuitive, allowing initialization of empty, trainable tokenizers directly from class definitions (e.g., initializing an empty `LlamaTokenizer` and training it).
- Consolidation of tokenizer backends into a single file per model, preferring the Rust-based `TokenizersBackend` for performance and features, while supporting `SentencePieceBackend`, `PythonBackend`, and `MistralCommonBackend`.
- New release cadence: minor releases (e.g., v5.1, v5.2) will now occur weekly instead of every five weeks.
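To make the idea of merging and splitting layers during checkpoint loading more concrete, here is a toy sketch in plain Python. It does not use the real `WeightConverter` API, whose signature is not described in these notes; `ToyMergeRule`, `convert`, `concat_rows`, and `split_rows` are hypothetical names, and nested lists stand in for tensors.

```python
from dataclasses import dataclass
from typing import Dict, List

Matrix = List[List[float]]  # row-major stand-in for a weight tensor


def concat_rows(parts: List[Matrix]) -> Matrix:
    """Merge several matrices by stacking their rows (concatenation on dim 0)."""
    return [row[:] for part in parts for row in part]


def split_rows(matrix: Matrix, sizes: List[int]) -> List[Matrix]:
    """Split one matrix into consecutive row blocks of the given sizes."""
    if sum(sizes) != len(matrix):
        raise ValueError("sizes must add up to the number of rows")
    blocks, start = [], 0
    for size in sizes:
        blocks.append([row[:] for row in matrix[start:start + size]])
        start += size
    return blocks


@dataclass
class ToyMergeRule:
    """Hypothetical rule: merge several checkpoint keys into one target key."""
    source_keys: List[str]
    target_key: str


def convert(checkpoint: Dict[str, Matrix], rules: List[ToyMergeRule]) -> Dict[str, Matrix]:
    """Apply merge rules at load time; keys no rule mentions pass through."""
    consumed = {key for rule in rules for key in rule.source_keys}
    converted = {k: v for k, v in checkpoint.items() if k not in consumed}
    for rule in rules:
        converted[rule.target_key] = concat_rows(
            [checkpoint[key] for key in rule.source_keys]
        )
    return converted


# A checkpoint that stores q, k, v separately, loaded into a fused layout.
checkpoint = {
    "attn.q_proj.weight": [[1.0, 2.0]],
    "attn.k_proj.weight": [[3.0, 4.0]],
    "attn.v_proj.weight": [[5.0, 6.0]],
    "norm.weight": [[1.0, 1.0]],
}
rules = [
    ToyMergeRule(
        ["attn.q_proj.weight", "attn.k_proj.weight", "attn.v_proj.weight"],
        "attn.qkv_proj.weight",
    )
]
model_state = convert(checkpoint, rules)

# Splitting is the inverse operation, e.g. when saving back to the old layout.
q, k, v = split_rows(model_state["attn.qkv_proj.weight"], [1, 1, 1])
```

The point of the sketch is only the shape of the mechanism: declarative rules map checkpoint keys to model keys and apply a tensor operation on the way in. The actual class may differ in names, arguments, and supported operations.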
🐛 Bug Fixes
- A large number of bug fixes and improvements were included in this major release.