v4.51.3-Janus-preview
📦 transformersView on GitHub →
✨ 5 features🔧 2 symbols
Summary
This release introduces a preview of the Janus and Janus-Pro models, a unified multimodal framework capable of both visual understanding and text-to-image generation by decoupling visual encoding pathways.
Migration Steps
- Install the preview version using: pip install git+https://github.com/huggingface/transformers@v4.51.3-Janus-preview
- Use processor.apply_chat_template() to format prompts correctly for the Janus chat interface.
- Specify the 'generation_mode' parameter (either 'text' or 'image') when calling the processor and model.generate() as the model does not support interleaved generation.
✨ New Features
- Added support for the Janus model, a unified multimodal framework for understanding and generation.
- Support for Janus-Pro, featuring optimized training and scaled model sizes.
- Ability to perform visual understanding with single and multiple image inputs.
- Support for text-to-image generation within the same architecture.
- Decoupled visual encoding pathways for improved performance in multimodal tasks.
🔧 Affected Symbols
JanusForConditionalGenerationJanusProcessor