4.55.0-GLM-4.5V-preview

📅 Aug 11, 2025📦 transformersView on GitHub →

✨ 6 features🔧 2 symbols

Summary

This release introduces GLM-4.5V, a high-performance multimodal reasoning model based on GLM-4.5-Air, featuring advanced capabilities in image, video, and GUI analysis.

Migration Steps

Install the specific release branch using: pip install transformers-v4.55.0-GLM-4.5V-preview

✨ New Features

Integration of GLM-4.5V, a multimodal reasoning model with 106B total and 12B active parameters.
Support for image reasoning including scene understanding and spatial recognition.
Support for video understanding including long video segmentation and event recognition.
Support for GUI tasks such as screen reading and desktop operation assistance.
Support for complex chart and long document parsing.
Support for grounding and precise visual element localization.

Affected Symbols

AutoProcessor Glm4vMoeForConditionalGeneration