v0.36.0
📦 diffusers
✨ 13 features · 🐛 6 fixes · 🔧 10 symbols
Summary
This release introduces several major image and video pipelines (Flux2, HunyuanVideo-1.5, Sana-Video), high-performance attention backends powered by the 'kernels' library, and the TaylorSeer caching method for significant inference speedups.
Migration Steps
- Install the 'kernels' library ('pip install kernels') to use the new attention backends.
- Switch attention backends with 'pipe.transformer.set_attention_backend("_flash_3_hub")' or a similar backend name (see the sketch after this list).
- Review the new modality-organized documentation to locate updated pipeline paths.
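A minimal sketch of the backend switch described above. The checkpoint id and generation settings are illustrative (any transformer-based pipeline that supports set_attention_backend should work); only the set_attention_backend("_flash_3_hub") call comes from the release notes.

```python
# Sketch: switching to a kernels-powered attention backend on a loaded pipeline.
# Assumes `pip install kernels` has been run; Flash Attention 3 targets Hopper-class GPUs.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",  # example checkpoint; any supported pipeline works
    torch_dtype=torch.bfloat16,
).to("cuda")

# Pull the Flash Attention 3 kernel from the Hub via the kernels library.
pipe.transformer.set_attention_backend("_flash_3_hub")

image = pipe("a photo of a red panda reading a book", num_inference_steps=28).images[0]
image.save("panda.png")
```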
✨ New Features
- Added the Flux2 image generation and editing pipeline, which supports multiple input images (see the Flux2 sketch after this list).
- Added Z-Image, a 6B-parameter image generation model.
- Added QwenImage Edit Plus with multi-image reference capabilities.
- Added Bria FIBO for precise control using structured JSON captions.
- Added Kandinsky 5.0 Image Lite (6B) and Video Lite (2B) models.
- Added ChronoEdit for image editing via temporal video reasoning.
- Added Sana-Video with linear attention for long video sequences.
- Added HunyuanVideo-1.5 (8.3B) for high-quality motion coherence.
- Added Wan-Animate for character animation and replacement.
- Introduced 'kernels'-powered attention backends for Flash Attention 2/3 and SAGE.
- Integrated the TaylorSeer cache for up to 3x inference speedups (see the cache sketch after this list).
- Added a LoRA fine-tuning script for Flux2 with optimizations for consumer GPUs.
- Introduced AutoencoderMixin and AttentionMixin to streamline model codebases.
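A minimal sketch of loading the new Flux2 pipeline mentioned above. The checkpoint id "black-forest-labs/FLUX.2-dev" and the sampler settings are assumptions based on the existing FLUX.1 pipelines; check the official model card for the exact repository name and recommended values.

```python
# Sketch: text-to-image with the new Flux2Pipeline.
# The checkpoint id and generation arguments below are assumed, not confirmed.
import torch
from diffusers import Flux2Pipeline

pipe = Flux2Pipeline.from_pretrained(
    "black-forest-labs/FLUX.2-dev",  # hypothetical repo id; use the official checkpoint
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # keeps peak VRAM manageable on consumer GPUs

image = pipe(
    prompt="a watercolor illustration of a lighthouse at dawn",
    num_inference_steps=28,
    guidance_scale=4.0,
).images[0]
image.save("flux2_lighthouse.png")
```

The TaylorSeer cache plugs into diffusers' existing cache-hook mechanism on the denoiser. The sketch below assumes a TaylorSeerCacheConfig class following the same enable_cache(config) pattern used by the other cache configs; the actual config name and its fields may differ, so treat this as illustrative only.

```python
# Sketch: enabling the TaylorSeer cache on a transformer-based pipeline.
# `TaylorSeerCacheConfig` and its default fields are assumed here; consult the
# release docs for the real config class and parameters.
import torch
from diffusers import FluxPipeline, TaylorSeerCacheConfig

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Cache hooks attach to the denoiser; reusing predicted features across steps
# is what yields the reported speedups (up to ~3x, per the release notes).
pipe.transformer.enable_cache(TaylorSeerCacheConfig())

image = pipe("an isometric voxel city at night", num_inference_steps=28).images[0]
```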
🐛 Bug Fixes
- Fixed the ClapConfig used for the text backbone in the AudioLDM2 tests.
- Removed a redundant RoPE cache in QwenImage.
- Missing imports for custom code now raise warnings instead of errors.
- Fixed incorrect temporary variable key when replacing adapter names.
- Fixed an issue in Kandinsky5 when classifier-free guidance (CFG) is disabled.
- Added _skip_keys for AutoencoderKLWan.
🔧 Affected Symbols
AttentionMixin, AutoencoderMixin, AutoencoderKLWan, Flux2Pipeline, SanaVideoPipeline, Kandinsky5Pipeline, HunyuanVideoPipeline, ChronoEditPipeline, VAETesterMixin, set_attention_backend