v3.2.2
📦 tgi
✨ 3 features · 🐛 1 fix
Summary
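This release adds support for the llama4 model, introduces a configurable termination timeout, changes the Gaudi backend to use exponential growth in place of BATCH_BUCKET_SIZE, and fixes a crash affecting the llava-next and mllama models on Gaudi hardware.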
✨ New Features
- Added support for the llama4 model.
- Introduced a configurable termination timeout.
- The Gaudi backend now uses exponential growth in place of BATCH_BUCKET_SIZE.
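
As a rough illustration of the Gaudi bucketing change, the sketch below contrasts rounding a batch size up to a multiple of a fixed bucket size with rounding it up to the next power of a base. The function names, the base of 2, and the bucket size of 8 are assumptions made for this example; they are not taken from the TGI Gaudi backend, whose actual logic may differ.

```python
import math


def fixed_bucket(batch_size: int, bucket_size: int) -> int:
    """Round batch_size up to the next multiple of a fixed bucket size."""
    return math.ceil(batch_size / bucket_size) * bucket_size


def exponential_bucket(batch_size: int, base: int = 2) -> int:
    """Round batch_size up to the next power of `base` (hypothetical helper)."""
    bucket = 1
    while bucket < batch_size:
        bucket *= base
    return bucket


# Count how many distinct padded shapes each scheme produces
# for batch sizes 1..128 (illustrative values only).
fixed_shapes = {fixed_bucket(n, 8) for n in range(1, 129)}
exp_shapes = {exponential_bucket(n) for n in range(1, 129)}
print(len(fixed_shapes), len(exp_shapes))
```

Under these assumptions, exponential growth yields far fewer distinct batch shapes over the same range, at the cost of more padding for batch sizes just above a power of the base.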
🐛 Bug Fixes
- Fixed a crash affecting the llava-next and mllama models on Gaudi hardware.