v3.2.2

📦 tgi
✨ 3 features 🐛 1 fix

Summary

This release adds support for the llama4 model, introduces a configurable termination timeout, replaces BATCH_BUCKET_SIZE with exponential growth in the Gaudi backend, and fixes a crash affecting the llava-next and mllama models on Gaudi hardware.

✨ New Features

  • Added support for the llama4 model.
  • Introduced a configurable termination timeout.
  • The Gaudi backend now uses exponential growth in place of BATCH_BUCKET_SIZE.
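
As a rough illustration of what exponential bucket growth means in general, the sketch below rounds a batch size up to the next power of a base rather than to a multiple of a fixed step. The function names, the base of 2, and the starting bucket of 1 are assumptions made for illustration; this is not the Gaudi backend's actual code.

```python
def bucket_for(batch_size: int, base: int = 2) -> int:
    """Round batch_size up to the nearest power of `base`.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if base < 2:
        raise ValueError("base must be at least 2")
    bucket = 1
    while bucket < batch_size:
        bucket *= base
    return bucket


def buckets_up_to(max_batch_size: int, base: int = 2) -> list[int]:
    """List every bucket needed to cover batch sizes 1..max_batch_size."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    if base < 2:
        raise ValueError("base must be at least 2")
    buckets = []
    bucket = 1
    while bucket < max_batch_size:
        buckets.append(bucket)
        bucket *= base
    buckets.append(bucket)  # first bucket that is >= max_batch_size
    return buckets


if __name__ == "__main__":
    print(bucket_for(5))
    print(buckets_up_to(32))
```

With growth like this, the number of distinct buckets increases logarithmically with the maximum batch size, whereas a fixed step size produces a number of buckets that increases linearly.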

🐛 Bug Fixes

  • Fixed a crash affecting the llava-next and mllama models on Gaudi hardware.