v0.33.0
📦 diffusers
✨ 10 features · 🔧 14 affected symbols
Summary
This release introduces several major video and image generation pipelines, including Wan2.1, LTX Video 0.9.5, and Lumina2, alongside significant memory optimization features such as Layerwise Casting and Group Offloading.
Migration Steps
- To use LTX Video 0.9.5 conditioning, switch to LTXConditionPipeline and pass one or more LTXVideoCondition objects (see the first sketch after this list).
- To enable layerwise casting, call .enable_layerwise_casting(storage_dtype, compute_dtype) on the model's transformer or another supported component (see the second sketch after this list).
- To use group offloading, call .enable_group_offload(onload_device, offload_device) on supported model components (see the third sketch after this list).
- If using use_stream=True with group offloading, ensure CPU RAM is at least 2x the model size, or set low_cpu_mem_usage=True.
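For the LTX Video 0.9.5 migration step, here is a minimal sketch of keyframe conditioning with LTXConditionPipeline. The checkpoint ID, the image/frame_index fields of LTXVideoCondition, and the conditions= call argument are assumptions based on typical usage of this pipeline, not verbatim from these notes:

```python
import torch
from diffusers import LTXConditionPipeline
from diffusers.pipelines.ltx.pipeline_ltx_condition import LTXVideoCondition
from diffusers.utils import export_to_video, load_image

# Checkpoint ID assumed; substitute the LTX Video 0.9.5 checkpoint you actually use.
pipe = LTXConditionPipeline.from_pretrained(
    "Lightricks/LTX-Video-0.9.5", torch_dtype=torch.bfloat16
).to("cuda")

# Condition the generated video on a keyframe image anchored at frame 0.
image = load_image("keyframe.png")
condition = LTXVideoCondition(image=image, frame_index=0)

video = pipe(
    conditions=[condition],
    prompt="A cinematic shot of waves crashing on a rocky shore at sunset",
    width=768,
    height=512,
    num_frames=121,
    num_inference_steps=40,
).frames[0]
export_to_video(video, "ltx_conditioned.mp4", fps=24)
```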
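Layerwise casting is enabled per model component. The sketch below applies it to a pipeline transformer; the Wan checkpoint ID (with its assumed Wan-AI/ organization prefix) and the dtype choices are illustrative assumptions:

```python
import torch
from diffusers import DiffusionPipeline

# Checkpoint ID assumed for illustration; any model whose transformer supports
# layerwise casting works the same way.
pipe = DiffusionPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-1.3B-Diffusers", torch_dtype=torch.bfloat16
)

# Store transformer weights in FP8 and upcast them to bfloat16 only while each
# layer is actually computing, trading a small amount of speed for VRAM savings.
pipe.transformer.enable_layerwise_casting(
    storage_dtype=torch.float8_e4m3fn,
    compute_dtype=torch.bfloat16,
)
pipe.to("cuda")
```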
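For group offloading with stream-based prefetching, a minimal sketch follows. The onload_device, offload_device, use_stream, and low_cpu_mem_usage arguments come from the migration notes above; the offload_type="leaf_level" setting and the checkpoint ID are assumptions:

```python
import torch
from diffusers import DiffusionPipeline

# Checkpoint ID assumed for illustration.
pipe = DiffusionPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-1.3B-Diffusers", torch_dtype=torch.bfloat16
)

onload_device = torch.device("cuda")
offload_device = torch.device("cpu")

# Offload groups of layers to CPU and prefetch the next group on a CUDA stream,
# overlapping data transfer with computation. low_cpu_mem_usage=True avoids the
# roughly 2x CPU RAM requirement that streaming otherwise needs.
pipe.transformer.enable_group_offload(
    onload_device=onload_device,
    offload_device=offload_device,
    offload_type="leaf_level",  # assumed granularity setting
    use_stream=True,
    low_cpu_mem_usage=True,
)

# Components that are not group-offloaded still need to be on the GPU.
pipe.text_encoder.to(onload_device)
pipe.vae.to(onload_device)
```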
✨ New Features
- Added support for Wan2.1 video foundation models (T2V, I2V, and V2V variants); a text-to-video sketch follows this list.
- Introduced LTX Video 0.9.5 with keyframe-based animation and video extension support.
- Added Hunyuan Image to Video pipeline using MLLM-based text encoding.
- Added Sana-Sprint for ultra-fast text-to-image generation via hybrid distillation.
- Added Lumina2 2B parameter flow-based diffusion transformer.
- Added OmniGen unified model for multi-task image generation and editing.
- Added support for CogView4, EasyAnimateV5, and ConsisID.
- Introduced Layerwise Casting to store weights in FP8 and upcast them on the fly to the compute dtype, reducing VRAM usage by up to 50%.
- Introduced Group Offloading, a memory-management option that sits between sequential CPU offloading and full model offloading in its speed/VRAM trade-off.
- Added CUDA Stream support for layer prefetching to overlap computation with data transfer.
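For the new Wan2.1 pipelines, text-to-video generation follows the usual diffusers pattern. This is a minimal sketch; the Wan-AI/ repository prefix, the WanPipeline and AutoencoderKLWan class names, and the prompt/resolution settings are assumptions based on how the release is typically used:

```python
import torch
from diffusers import AutoencoderKLWan, WanPipeline
from diffusers.utils import export_to_video

# Repository ID assumed (the notes list the checkpoint as Wan2.1-T2V-1.3B-Diffusers).
model_id = "Wan-AI/Wan2.1-T2V-1.3B-Diffusers"

# The Wan VAE is commonly kept in float32 for numerical stability.
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanPipeline.from_pretrained(model_id, vae=vae, torch_dtype=torch.bfloat16)
pipe.to("cuda")

video = pipe(
    prompt="A cat walking through a snowy forest, highly detailed",
    negative_prompt="blurry, low quality",
    height=480,
    width=832,
    num_frames=81,
    guidance_scale=5.0,
).frames[0]
export_to_video(video, "wan_t2v.mp4", fps=15)
```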
🔧 Affected Symbols
Wan2.1-T2V-1.3B-Diffusers, Wan2.1-T2V-14B-Diffusers, Wan2.1-I2V-14B-480P-Diffusers, Wan2.1-I2V-14B-720P-Diffusers, LTXConditionPipeline, LTXVideoCondition, Sana-Sprint, Lumina2, OmniGen, CogView4, EasyAnimateV5, ConsisID, enable_layerwise_casting, enable_group_offload