v4.51.1
📦 transformers
🐛 8 fixes · 🔧 6 symbols
Summary
This patch release focuses on stabilizing Llama 4 support and fixing compatibility issues with torch 2.6.0, DeepSpeed, and weight initialization.
Migration Steps
- Update the library to v4.51.1 with your package manager (e.g., pip install --upgrade transformers); a quick verification snippet follows this list.
- If you were experiencing issues related to flex attention with torch 2.6.0, these should now be resolved.
- If you were using HQQ with the caching allocator warmup, note that HQQ has been removed from this specific warmup process; review your initialization logic if this affects performance or startup time.
- If you encountered issues initializing weights when not using the accelerate library, these fixes should resolve them.
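To confirm the upgrade took effect, a minimal version check (assuming `packaging` is available, which it is as a standard transformers dependency):

```python
import transformers
from packaging import version

# Verify that the installed build is at least the patched release.
installed = version.parse(transformers.__version__)
assert installed >= version.parse("4.51.1"), f"still on {installed}; re-run the upgrade"
print(f"transformers {installed} OK")
```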
🐛 Bug Fixes
- Fixed flex attention compatibility for torch 2.6.0 (smoke-test sketch after this list)
- Resolved issues with post-training and general training for Llama 4
- Removed HQQ from caching allocator warmup
- Fixed _init_weights for derived BERT models
- Fixed initialization of empty weights when accelerate is not present (mechanism sketched at the end of this note)
- Fixed DeepSpeed integration with quantization
- Fixed flex attention when optional arguments are omitted
- General stability fixes for Llama 4
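As a smoke test for the flex attention and Llama 4 fixes, a minimal loading-and-generation sketch. The checkpoint id is illustrative (Llama 4 checkpoints are gated on the Hub), and `attn_implementation="flex_attention"` follows the general transformers loading API; neither detail comes from this release note itself:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # illustrative, gated checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    attn_implementation="flex_attention",  # exercises the patched torch 2.6.0 path
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```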
🔧 Affected Symbols
- flex_attention
- Llama4
- BertPreTrainedModel._init_weights
- init_empty_weights
- DeepSpeedEngine
- HQQ
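For context on the init_empty_weights fix above: when accelerate is unavailable, transformers falls back to creating parameters on the meta device, so no memory is allocated before real weights are loaded. A plain-PyTorch sketch of that mechanism (an illustration of the idea, not the library's actual code path):

```python
import torch
import torch.nn as nn

# Modules built under the "meta" device allocate no storage, so even a very
# large layer is instantiated instantly; real weights are materialized later.
with torch.device("meta"):
    layer = nn.Linear(4096, 4096)

print(layer.weight.device)  # meta
# Size the weight *would* occupy once materialized (~64 MiB at fp32):
print(layer.weight.nelement() * layer.weight.element_size())
```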