v3.3.3
📦 tgi
✨ 1 feature · 🐛 4 fixes · 🔧 1 symbol
Summary
This release updates the Neuron backend, including an SDK version bump, and adds support for the Qwen3_moe model on Gaudi. It also includes several Gaudi-specific fixes and performance optimizations.
✨ New Features
- [Gaudi] Enable Qwen3_moe model support.
🐛 Bug Fixes
- [Gaudi] Fixed an issue in benchmark tests related to the VLM rebase.
- [Gaudi] Fixed an issue specific to the HuggingFaceM4/idefics2-8b model.
- [Gaudi] Fixed integration test issues.
- [Gaudi] Used pad_token_id to pad input IDs.
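
To illustrate the last fix, here is a minimal sketch of padding a batch of input IDs with the tokenizer's pad token ID rather than a hard-coded value. It is not the TGI implementation: the function name `pad_input_ids`, the right-padding direction, and the optional `target_len` parameter are all assumptions made for this example.

```python
from typing import List, Optional


def pad_input_ids(
    batch: List[List[int]],
    pad_token_id: int,
    target_len: Optional[int] = None,
) -> List[List[int]]:
    """Right-pad every sequence in `batch` to a common length.

    Hypothetical helper for illustration only; the padding side and the
    `target_len` parameter are assumptions, not TGI behaviour.
    """
    longest = max((len(seq) for seq in batch), default=0)
    if target_len is None:
        target_len = longest
    if target_len < longest:
        raise ValueError("target_len is shorter than the longest sequence")
    # Fill the tail of each sequence with pad_token_id instead of a fixed 0,
    # since 0 can be a real vocabulary token for some models.
    return [seq + [pad_token_id] * (target_len - len(seq)) for seq in batch]


padded = pad_input_ids([[5, 6, 7], [8]], pad_token_id=2)
```

Padding with the model's own pad token keeps the padded positions distinguishable from real tokens, which matters when a value such as 0 maps to a valid vocabulary entry.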
🔧 Affected Symbols
get_cos_sin