v4.52.4-VJEPA-2-preview

📅 Jun 11, 2025 📦 transformers

✨ 3 features 🔧 4 symbols

Summary

This release introduces a preview of the V-JEPA 2 model, a state-of-the-art self-supervised video encoder for motion understanding and robot manipulation tasks.

Migration Steps

Install the preview version using: pip install git+https://github.com/huggingface/transformers@v4.52.4-VJEPA-2-preview
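After installing, it can be useful to confirm that the environment actually picked up the preview build. The sketch below is one way to check, assuming the preview exposes VJEPA2Model at the top level of the transformers package (it is listed under Affected Symbols). The helper name has_symbol is hypothetical and not part of transformers.

```python
import importlib


def has_symbol(module_name: str, symbol: str) -> bool:
    """Return True if `module_name` imports and exposes `symbol`.

    `has_symbol` is a hypothetical helper for this example only.
    """
    try:
        module = importlib.import_module(module_name)
        # Libraries that load symbols lazily may raise something other than
        # AttributeError here, so such failures are treated as "not available".
        getattr(module, symbol)
    except (ImportError, AttributeError, RuntimeError):
        return False
    return True


if __name__ == "__main__":
    # Assumption: the preview build exports VJEPA2Model from `transformers`.
    print("VJEPA2Model available:", has_symbol("transformers", "VJEPA2Model"))
```

If this prints False, the most likely cause is that a regular release of transformers is still installed in the active environment.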

✨ New Features

Added support for V-JEPA 2 (Video Joint-Embedding Predictive Architecture), a self-supervised video encoder by Meta FAIR.
Added support for V-JEPA 2-AC, a latent action-conditioned world model for robot manipulation tasks.
Added integration with AutoModel and AutoVideoProcessor for video classification and retrieval tasks.
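The features above can be sketched as a short embedding pipeline. This is a minimal, unofficial sketch, not a reference implementation. It makes these assumptions: the checkpoint id facebook/vjepa2-vitl-fpc64-256 and the 64-frame clip length are not stated in these notes, the helper names sample_frame_indices and embed_video are hypothetical, and video decoding is left to the caller. Only AutoModel, AutoVideoProcessor, and model.get_vision_features come from the release itself.

```python
def sample_frame_indices(num_frames_in_video: int, num_samples: int = 64) -> list[int]:
    """Pick `num_samples` evenly spaced frame indices (hypothetical helper).

    If the video has fewer frames than `num_samples`, some indices repeat.
    """
    if num_frames_in_video <= 0 or num_samples <= 0:
        raise ValueError("frame counts must be positive")
    if num_samples == 1:
        return [0]
    last = num_frames_in_video - 1
    return [round(i * last / (num_samples - 1)) for i in range(num_samples)]


def embed_video(video, repo_id: str = "facebook/vjepa2-vitl-fpc64-256"):
    """Encode pre-decoded video frames with V-JEPA 2 (hypothetical helper).

    `video` is assumed to be a frames tensor shaped T x C x H x W.
    `repo_id` is an assumed checkpoint name; substitute the one you use.
    """
    # Imported lazily so the sampling helper works without torch installed.
    import torch
    from transformers import AutoModel, AutoVideoProcessor

    model = AutoModel.from_pretrained(repo_id)
    processor = AutoVideoProcessor.from_pretrained(repo_id)

    inputs = processor(video, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # get_vision_features is listed under Affected Symbols in this release.
        return model.get_vision_features(**inputs)
```

A typical call would decode the frames selected by sample_frame_indices with a video reader of your choice and pass the resulting tensor to embed_video. The returned embeddings can then feed a classification head or a retrieval index.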

Affected Symbols

VJEPA2Model
AutoModel
AutoVideoProcessor
model.get_vision_features