v4.51.3-Janus-preview

📅 Apr 22, 2025📦 transformersView on GitHub →

✨ 5 features🔧 2 symbols

Summary

This release introduces a preview of the Janus and Janus-Pro models, a unified multimodal framework capable of both visual understanding and text-to-image generation by decoupling visual encoding pathways.

Migration Steps

Install the preview version using: pip install git+https://github.com/huggingface/transformers@v4.51.3-Janus-preview
Use processor.apply_chat_template() to format prompts correctly for the Janus chat interface.
Specify the 'generation_mode' parameter (either 'text' or 'image') when calling the processor and model.generate() as the model does not support interleaved generation.

✨ New Features

Added support for the Janus model, a unified multimodal framework for understanding and generation.
Support for Janus-Pro, featuring optimized training and scaled model sizes.
Ability to perform visual understanding with single and multiple image inputs.
Support for text-to-image generation within the same architecture.
Decoupled visual encoding pathways for improved performance in multimodal tasks.

Affected Symbols

JanusForConditionalGeneration JanusProcessor