v3.3.1

📦 tgi
✨ 2 features · 🐛 2 fixes · 🔧 3 symbols

Summary

This release updates TGI to Torch 2.7 and CUDA 12.8. It also refines the HPU warmup logic, updates kernels, and fixes bugs.

Migration Steps

  1. Update to Torch 2.7.0 and Synapse AI 1.21.0 for optimal performance and compatibility.
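
A quick way to confirm that an environment meets the versions named in the step above is to compare version strings before starting the server. The following is a minimal sketch, not part of TGI: `parse_version` and `meets_minimum` are hypothetical helper names, and the comparison only looks at the leading numeric release segment (so a local suffix such as `+cu128` is ignored).

```python
import re


def parse_version(version: str) -> tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '2.7.0+cu128' -> (2, 7, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric version found in {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum` (hypothetical helper)."""
    a = parse_version(installed)
    b = parse_version(minimum)
    # Pad the shorter tuple with zeros so that '2.7' compares equal to '2.7.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    # Example strings only; in a real environment pass torch.__version__ instead.
    print(meets_minimum("2.7.0+cu128", "2.7.0"))
    print(meets_minimum("2.6.0", "2.7.0"))
```

In a real deployment the installed string would come from the environment itself, for example `torch.__version__` for Torch. How the Synapse AI version is obtained depends on the installation and is not covered here.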

✨ New Features

  • Enable Llama4 for gaudi backend
  • Switch to punica-sgmv kernel from the Hub

🐛 Bug Fixes

  • Fix crash in default ATTENTION path
  • Correctly count GPU UUIDs when NVIDIA_VISIBLE_DEVICES env is set to all
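
To illustrate the second fix, here is a rough sketch of why `NVIDIA_VISIBLE_DEVICES=all` needs special handling: splitting the value on commas yields a single entry (`all`), which would be miscounted as one GPU. This is not the TGI launcher code. `count_visible_gpus` is a hypothetical function, the list of detected UUIDs is supplied by the caller (it could come from `nvidia-smi --query-gpu=uuid --format=csv,noheader`, for instance), and treating an empty value or `none` as zero GPUs is an assumption made for the example.

```python
from typing import Optional, Sequence


def count_visible_gpus(env_value: Optional[str], detected_uuids: Sequence[str]) -> int:
    """Count GPUs implied by an NVIDIA_VISIBLE_DEVICES value (illustrative sketch).

    env_value:      raw value of NVIDIA_VISIBLE_DEVICES, or None if unset.
    detected_uuids: UUIDs of the GPUs actually present, provided by the caller.
    """
    value = (env_value or "").strip()

    # 'all' means every detected GPU is visible, so count the detected devices
    # rather than the number of comma-separated entries in the variable.
    if value.lower() == "all":
        return len(detected_uuids)

    # Assumption for this sketch: unset, empty, or 'none' means no GPUs.
    if value == "" or value.lower() == "none":
        return 0

    # Otherwise the value is a comma-separated list of indices or UUIDs.
    return len([entry for entry in value.split(",") if entry.strip()])


if __name__ == "__main__":
    detected = ["GPU-aaaa", "GPU-bbbb"]  # placeholder UUIDs, not real devices
    print(count_visible_gpus("all", detected))
    print(count_visible_gpus("GPU-aaaa", detected))
```

Passing the detected devices in as an argument keeps the counting logic testable without a GPU present.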

🔧 Affected Symbols

  • HPU warmup logic
  • round_up_seq logic
  • default ATTENTION path