v0.35.0
📦 diffusers
✨ 12 features · 🐛 6 fixes · ⚡ 2 deprecations · 🔧 8 symbols
Summary
This release introduces major new pipelines (Wan 2.2, Flux-Kontext, Qwen-Image), significant performance optimizations via regional compilation and GGUF CUDA kernels, and an experimental modular pipeline system.
Migration Steps
- To speed up loading, pass device_map="cuda" to DiffusionPipeline.from_pretrained() instead of calling .to("cuda") on the loaded pipeline.
- Enable parallel loading by setting os.environ["HF_ENABLE_PARALLEL_LOADING"] = "yes" before loading large models.
- For performance optimization, implement regional compilation by following the updated optimization guides.
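The first two migration steps can be sketched as follows. This is a minimal sketch, assuming a CUDA device is available and that diffusers and torch are installed. The helper name load_pipeline and its model_id argument are hypothetical, and bfloat16 is an assumed dtype choice rather than something these notes prescribe.

```python
import os

# Set this before any checkpoint is loaded so that state dict shards
# can be loaded in parallel.
os.environ["HF_ENABLE_PARALLEL_LOADING"] = "yes"


def load_pipeline(model_id: str):
    """Load a pipeline directly onto the GPU (hypothetical helper)."""
    # Imported lazily so the environment variable above is set first.
    import torch
    from diffusers import DiffusionPipeline

    # device_map="cuda" places the weights on the accelerator during loading,
    # replacing the older pattern from_pretrained(...).to("cuda").
    return DiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # assumed dtype; use what your model supports
        device_map="cuda",
    )
```

Calling load_pipeline with the repository id of the model you want returns a pipeline that is already on the GPU, so no further .to("cuda") call is needed.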
✨ New Features
- Added Wan 2.2 video generation pipeline with improved fidelity and prompt adherence.
- Added Flux-Kontext image editing pipeline (12B parameter rectified flow transformer).
- Added Qwen-Image and Qwen-Image-Edit pipelines (Apache-2.0 licensed).
- Introduced Regional Compilation to reduce cold-start latency and compile time by 8–10x.
- Added support for loading pipelines directly to accelerator devices using device_map.
- Enabled parallelized loading of state dict shards via HF_ENABLE_PARALLEL_LOADING environment variable.
- Added native GGUF CUDA kernel support, giving roughly 10% faster inference.
- Added support for loading Diffusers-format GGUF checkpoints, plus a conversion tool.
- Introduced the experimental 'Modular Diffusers' system for building pipelines from individual blocks.
- Refactored attention extensively to support multiple backends (SDPA, Flash Attention 3, SAGE).
- Added new training scripts for Kontext and Qwen-Image.
- Added single-file modeling implementations for Flux Transformer and Cosmos.
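Regional compilation, listed above, compiles a model's repeated blocks instead of the whole model. The following is a minimal sketch, assuming the pipeline keeps its denoiser under an attribute named transformer and that the model exposes a compile_repeated_blocks method that forwards its keyword arguments to torch.compile. The helper compile_regionally is hypothetical, and fullgraph=True is an assumed setting; check the optimization guides for the exact usage in your version.

```python
def compile_regionally(pipe, component: str = "transformer"):
    """Compile only the repeated blocks of one pipeline component.

    Hypothetical helper: `pipe` is any object whose `component` attribute
    exposes `compile_repeated_blocks`.
    """
    model = getattr(pipe, component)
    if not hasattr(model, "compile_repeated_blocks"):
        raise AttributeError(
            f"{component!r} does not expose compile_repeated_blocks; "
            "regional compilation may not be available in this version"
        )
    # Keyword arguments are assumed to be passed through to torch.compile.
    model.compile_repeated_blocks(fullgraph=True)
    return pipe
```

The helper returns the same pipeline object, so it can be chained directly after loading. Compilation itself happens lazily on the first forward pass.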
🐛 Bug Fixes
- Fixed LoRA unloading behavior.
- Removed unnecessary synchronization before denoising in Kontext.
- Fixed failing float16 CUDA tests.
- Adjusted tolerance criteria for float16 inference unit tests on XPU.
- Removed print statement in SCM Scheduler.
- Fixed single_file documentation examples.
🔧 Affected Symbols
- DiffusionPipeline
- FluxTransformer2DModel
- WanPipeline
- FluxKontextPipeline
- QwenImagePipeline
- QwenImageEditPipeline
- torch.compile
- scaled_dot_product_attention
⚡ Deprecations
- Deprecated pipelines documentation updated (refer to docs for specific list).
- LoRA deprecation fixes following the 0.34.0 release.