
v5.0.0rc3

📦 transformers
3 breaking · 10 features · 17 fixes · 2 deprecations · 29 symbols

Summary

This release candidate (v5.0.0rc3) adds several new models, including GLM-Lite, LWDetr, LightOnOCR, and MiniMax-M2, removes the remaining deprecated classes and arguments, and fixes numerous integration tests and minor bugs.

⚠️ Breaking Changes

  • Removed all deprecated classes from cache module. Code relying on these classes will need updating.
  • Removed more deprecated objects/arguments. Review usage of older APIs.
  • Removed deprecated and unused `position_ids` argument in all `apply_rotary_pos_emb` calls. This argument is no longer accepted.

Migration Steps

  1. If you were using deprecated classes from the cache module, update your code to use current APIs.
  2. If you were relying on the `position_ids` argument in `apply_rotary_pos_emb`, remove it from your calls.
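The `position_ids` change is mechanical: the argument was already unused, so calls simply drop it. As a minimal pure-Python sketch of the rotary application itself (illustrative only; the actual transformers function operates on torch tensors, and the cos/sin inputs are assumed to be pre-gathered per position):

```python
def rotate_half(x):
    # (x1, x2) -> (-x2, x1) on the last dimension
    half = len(x) // 2
    return [-v for v in x[half:]] + list(x[:half])

def apply_rotary_pos_emb(q, k, cos, sin):
    # v5-style signature: no position_ids argument
    q_out = [a * c + b * s for a, c, b, s in zip(q, cos, rotate_half(q), sin)]
    k_out = [a * c + b * s for a, c, b, s in zip(k, cos, rotate_half(k), sin)]
    return q_out, k_out

# Before: q, k = apply_rotary_pos_emb(q, k, cos, sin, position_ids)
# After:  q, k = apply_rotary_pos_emb(q, k, cos, sin)
```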

✨ New Features

  • Added support for the GLM-Lite model (GLM-4.7).
  • Added AR model support for GLM-Image.
  • Added LWDetr model implementation.
  • Added LightOnOCR model implementation.
  • Added support for MiniMax-M2 model.
  • Generation config now supports boolean defaults.
  • Generation config validation has been improved.
  • Grouped beam search can now be configured via config parameters.
  • Custom config values are now allowed in the generation config.
  • The Qwen-VL video processor now accepts `min_pixels`/`max_pixels`.
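A rough sketch of what a min/max-pixel constraint typically does: rescale frames so their total pixel count lands inside the allowed range. This is illustrative only (`fit_pixels` is a hypothetical helper name; the real Qwen-VL processor also snaps dimensions to patch-size multiples):

```python
import math

def fit_pixels(h, w, min_pixels, max_pixels):
    # Scale both sides by the same factor so h*w falls within
    # [min_pixels, max_pixels]; aspect ratio is preserved.
    area = h * w
    if area > max_pixels:
        scale = math.sqrt(max_pixels / area)
    elif area < min_pixels:
        scale = math.sqrt(min_pixels / area)
    else:
        return h, w
    return round(h * scale), round(w * scale)
```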

🐛 Bug Fixes

  • Capped generation length to be less than `max_position_embeddings` in the DiT for `qwen2_5_omni`.
  • Fixed Fuyu processor width dimension bug in `_get_num_multimodal_tokens`.
  • Fixed failing `BartModelIntegrationTest`.
  • Fixed failure of llava/pixtral generation.
  • Fixed experts handling in `Fp8`.
  • Fixed failing `BitModelIntegrationTest`.
  • Fixed chunked prefill implementation (issue #43082).
  • Fixed failing `salesforce-ctrl`, `xlm` & `gpt-neo` model generation tests.
  • Clamped temperature to be >=1.0 for Dia generation.
  • Fixed failing `Pix2StructIntegrationTest`.
  • Fixed missing UTF-8 encoding in check_repo.py for Windows compatibility.
  • Fixed failing `PhiIntegrationTests`.
  • Fixed failing `Owlv2ModelIntegrationTest` & `OwlViTModelIntegrationTest`.
  • Fixed flashattn compatibility with quantized models.
  • Fixed unsafe `torch.load()` in `_load_rng_state`, which allowed arbitrary code execution.
  • Fixed benchmark script.
  • Fixed failing `Vip-llava` model integration test.
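The unsafe `torch.load()` fix above concerns pickle's ability to run arbitrary code during deserialization, which PyTorch typically mitigates with `weights_only=True`. A standard-library illustration of the underlying idea, restricting which globals an unpickler may resolve (the `SafeUnpickler` class here is illustrative, not the actual fix):

```python
import io
import pickle

class SafeUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve globals, defeating __reduce__ payloads."""

    def find_class(self, module, name):
        # A malicious pickle reaches code execution only by resolving a
        # global (e.g. os.system); blocking resolution neutralizes it.
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

def safe_loads(data: bytes):
    # Plain containers and primitives load fine; anything needing a
    # global lookup raises UnpicklingError instead of executing code.
    return SafeUnpickler(io.BytesIO(data)).load()
```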


⚡ Deprecations

  • Deprecated `dtype` per sub-config.
  • Removed the redundant whitespace pre-tokenizer in `GemmaTokenizer`.