v4.52.4-VJEPA-2-preview
📦 transformers
✨ 3 features · 🔧 4 symbols
Summary
This release introduces a preview of the V-JEPA 2 model, a state-of-the-art self-supervised video encoder for motion understanding and robot manipulation tasks.
Migration Steps
- Install the preview version using: pip install git+https://github.com/huggingface/transformers@v4.52.4-VJEPA-2-preview
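After installing, it can be useful to confirm that the environment actually picked up the preview build. The sketch below is one way to do that, assuming the preview exposes `VJEPA2Model` at the top level of the `transformers` package (as the affected symbols suggest). The helper name `has_vjepa2` is hypothetical and not part of the library.

```python
import importlib
import importlib.util


def has_vjepa2(package="transformers", symbol="VJEPA2Model"):
    """Return True if `package` is importable and exposes `symbol`.

    `has_vjepa2` is an illustrative helper, not a transformers API.
    """
    if importlib.util.find_spec(package) is None:
        return False  # package is not installed at all
    try:
        module = importlib.import_module(package)
    except ImportError:
        return False  # installed, but the import itself fails
    return hasattr(module, symbol)
```

Calling `has_vjepa2()` with no arguments should return `True` in an environment where the preview build is installed, and `False` where `transformers` is missing or predates V-JEPA 2.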
✨ New Features
- Added support for V-JEPA 2 (Video Joint-Embedding Predictive Architecture), a self-supervised video encoder by Meta FAIR.
- Added support for V-JEPA 2-AC, a latent action-conditioned world model for robot manipulation tasks.
- Integration with AutoModel and AutoVideoProcessor for video classification and retrieval tasks.
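The integration above can be sketched as a small helper that turns a clip into video embeddings. This is a minimal sketch under stated assumptions, not the definitive usage: the checkpoint id `facebook/vjepa2-vitl-fpc64-256` and the helper name `embed_video` are not taken from these notes, the frames are assumed to be a `T x C x H x W` tensor, and `get_vision_features` is assumed to accept the processor's output as keyword arguments.

```python
def embed_video(frames, repo_id="facebook/vjepa2-vitl-fpc64-256"):
    """Encode a clip with V-JEPA 2 and return its vision features.

    `frames` is assumed to be a T x C x H x W tensor of video frames.
    `repo_id` is an assumed checkpoint name; substitute the one you use.
    """
    # Heavy imports are deferred so that defining the helper stays cheap.
    import torch
    from transformers import AutoModel, AutoVideoProcessor

    processor = AutoVideoProcessor.from_pretrained(repo_id)
    model = AutoModel.from_pretrained(repo_id)
    model.eval()

    # Preprocess the frames and move them to the model's device.
    inputs = processor(frames, return_tensors="pt").to(model.device)

    # Inference only, so gradient tracking is switched off.
    with torch.no_grad():
        return model.get_vision_features(**inputs)
```

For a quick smoke test, a random clip such as `torch.randint(0, 256, (64, 3, 256, 256), dtype=torch.uint8)` can stand in for decoded frames. Loading the processor and model inside the helper keeps the example short; in real code they would be loaded once and reused across clips.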
🔧 Affected Symbols
- VJEPA2Model
- AutoModel
- AutoVideoProcessor
- model.get_vision_features