Changelog

v0.37.0

📦 diffusers

✨ 23 features · 🐛 21 fixes · 🔧 11 symbols

Summary

This release introduces Modular Diffusers for flexible pipeline building and adds numerous new image and video generation pipelines, alongside significant core library improvements in caching and context parallelism.

Migration Steps

  1. If you use `MT5Tokenizer`, switch to `T5Tokenizer` for compatibility with Transformers v5.0+.
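The version-dependent choice can be sketched as a small helper. This is illustrative pure Python, not a diffusers or Transformers API; `pick_tokenizer_class` is a hypothetical name, and the v5.0 cutoff follows the migration note above:

```python
# Illustrative helper (not part of diffusers or transformers): choose the
# tokenizer class name for MT5-style checkpoints based on the installed
# Transformers version, per the migration note above.
def pick_tokenizer_class(transformers_version: str) -> str:
    major = int(transformers_version.split(".")[0])
    # Before v5.0, MT5Tokenizer is available; from v5.0 on, use T5Tokenizer.
    return "MT5Tokenizer" if major < 5 else "T5Tokenizer"

print(pick_tokenizer_class("4.46.0"))  # MT5Tokenizer
print(pick_tokenizer_class("5.0.0"))   # T5Tokenizer
```

In real loading code, this amounts to replacing `MT5Tokenizer.from_pretrained(...)` with `T5Tokenizer.from_pretrained(...)`.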

✨ New Features

  • Introduction of Modular Diffusers, allowing pipeline construction via reusable blocks.
  • New image pipeline: Z Image Omni Base, a full-capacity, undistilled transformer for high-quality generation.
  • New image pipeline: Flux2 Klein, unifying generation and editing with fast inference on consumer hardware.
  • New image pipeline: Qwen Image Layered, capable of decomposing images into independent RGBA layers for editability.
  • New image pipeline: FIBO Edit, an 8B parameter image-to-image model using JSON inputs for structured control.
  • New image pipeline: Cosmos Predict2.5, specialized for simulating and predicting future world states.
  • New image pipeline: Cosmos Transfer2.5, a conditional world generation model with adaptive multimodal control.
  • New image pipeline: GLM-Image, using a hybrid autoregressive + diffusion decoder architecture for high visual fidelity.
  • New model: Representation Autoencoders (RAE) as an alternative to traditional VAEs.
  • New video/audio pipeline: LTX-2, an audio-conditioned text-to-video generation model supporting synced audio.
  • New video pipeline: Helios, a 14B parameter video generation model supporting minute-scale generation.
  • Introduction of MagCache for improved caching.
  • Introduction of TaylorSeer for improved caching.
  • Introduction of Unified Sequence Parallel attention as a new context-parallelism (CP) backend.
  • Introduction of Ulysses Anything Attention as a new context-parallelism (CP) backend.
  • New Mambo-G Guidance implementation.
  • New Laplace Scheduler for DDPM.
  • Support for Custom Sigmas in UniPCMultistepScheduler.
  • Support for MultiControlNet in SD3 Inpainting.
  • Support for context parallelism in native flash attention.
  • Support for NPU Ulysses Attention.
  • Introduction of @apply_lora_scale decorator for simplifying model definitions.
  • Introduction of pipeline-level “pu” device_map.
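The block-based composition idea behind Modular Diffusers can be sketched in plain Python. This is a conceptual mock-up, not the actual Modular Diffusers API; `Block` and `SequentialPipeline` are illustrative names, and the real library's block interfaces differ:

```python
# Conceptual sketch of pipeline construction from reusable blocks
# (pure Python mock-up; NOT the diffusers Modular API).
from typing import Callable, Dict, List


class Block:
    """A named, reusable pipeline stage that transforms a shared state dict."""

    def __init__(self, name: str, fn: Callable[[Dict], Dict]):
        self.name = name
        self.fn = fn

    def __call__(self, state: Dict) -> Dict:
        return self.fn(state)


class SequentialPipeline:
    """Runs blocks in order, threading one shared state dict through them."""

    def __init__(self, blocks: List[Block]):
        self.blocks = list(blocks)

    def __call__(self, **inputs) -> Dict:
        state = dict(inputs)
        for block in self.blocks:
            state = block(state)
        return state


# Toy stages standing in for text encoding, denoising, and VAE decoding.
encode = Block("encode", lambda s: {**s, "embeds": f"emb({s['prompt']})"})
denoise = Block("denoise", lambda s: {**s, "latents": f"x0|{s['embeds']}"})
decode = Block("decode", lambda s: {**s, "image": f"img|{s['latents']}"})

pipe = SequentialPipeline([encode, denoise, decode])
out = pipe(prompt="a cat")
print(out["image"])  # img|x0|emb(a cat)
```

The point of the design is that stages like `denoise` can be swapped or recombined into new pipelines without rewriting the surrounding stages.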

🐛 Bug Fixes

  • Fix QwenImageEditPlus on NPU.
  • Fix MT5Tokenizer by using T5Tokenizer for Transformers v5.0+ compatibility.
  • Fix Wan/WanI2V patchification.
  • Fix LTX-2 inference with num_videos_per_prompt > 1 and CFG.
  • Fix Flux2 img2img prediction.
  • Fix QwenImage txt_seq_lens handling.
  • Fix prefix_token_len bug.
  • Fix ftfy imports in Wan and SkyReels-V2.
  • Fix is_fsdp determination.
  • Fix GLM-Image get_image_features API.
  • Fix Wan 2.2 when either transformer isn't present.
  • Fix guider issue.
  • Fix torchao quantizer for new versions.
  • Fix GGUF for unquantized types with unquantize kernels.
  • Make Qwen hidden states contiguous for torchao.
  • Make Flux hidden states contiguous.
  • Fix Kandinsky 5 hardcoded CUDA autocast.
  • Fix aiter availability check.
  • Fix attention mask check for unsupported backends.
  • Fix Wan 2.1 I2V Context Parallel Inference.
  • Fix Qwen-Image Context Parallel Inference.

Affected Symbols