Change8

v4.51.3-Janus-preview

📦 transformersView on GitHub →
5 features🔧 2 symbols

Summary

This release introduces a preview of the Janus and Janus-Pro models, a unified multimodal framework capable of both visual understanding and text-to-image generation by decoupling visual encoding pathways.

Migration Steps

  1. Install the preview version using: pip install git+https://github.com/huggingface/transformers@v4.51.3-Janus-preview
  2. Use processor.apply_chat_template() to format prompts correctly for the Janus chat interface.
  3. Specify the 'generation_mode' parameter (either 'text' or 'image') when calling the processor and model.generate() as the model does not support interleaved generation.

✨ New Features

  • Added support for the Janus model, a unified multimodal framework for understanding and generation.
  • Support for Janus-Pro, featuring optimized training and scaled model sizes.
  • Ability to perform visual understanding with single and multiple image inputs.
  • Support for text-to-image generation within the same architecture.
  • Decoupled visual encoding pathways for improved performance in multimodal tasks.

🔧 Affected Symbols

JanusForConditionalGenerationJanusProcessor