v5.5.2
📦 transformers
✨ 1 feature · 🐛 2 fixes · 🔧 1 symbol
Summary
This patch adds Mixture of Experts (MoE) support to the Gemma4 tensor-parallelism plan, fixes an inference issue with k/v states shared between layers when caching is disabled, and corrects weight-name serialization mappings for some vision-language models (VLMs).
✨ New Features
- Added Mixture of Experts (MoE) support to the Gemma4 tensor-parallelism (TP) plan.
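As a rough illustration of what extending a TP plan for MoE looks like, the sketch below shards each expert's MLP projections the same way a dense MLP would be sharded. The module paths and the plan contents are assumptions for illustration, not the actual Gemma4 plan:

```python
# Hypothetical sketch of a tensor-parallelism (TP) plan extended with
# MoE expert modules. The module paths below are illustrative
# assumptions, not the real Gemma4 configuration.
base_model_tp_plan = {
    # Attention projections: split columns for q/k/v, rows for the output.
    "layers.*.self_attn.q_proj": "colwise",
    "layers.*.self_attn.k_proj": "colwise",
    "layers.*.self_attn.v_proj": "colwise",
    "layers.*.self_attn.o_proj": "rowwise",
    # MoE additions: shard each expert's projections like a dense MLP,
    # with the wildcard ranging over the expert index.
    "layers.*.mlp.experts.*.gate_proj": "colwise",
    "layers.*.mlp.experts.*.up_proj": "colwise",
    "layers.*.mlp.experts.*.down_proj": "rowwise",
}
```

The colwise/rowwise pairing keeps each expert's up-projection and down-projection composable across shards without an extra all-gather in between.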
🐛 Bug Fixes
- Fixed an inference issue in Gemma4 with `use_cache=False`, caused by k/v states shared between layers.
- Fixed inconsistent weight-name serialization in the conversion mappings for some vision-language models (VLMs).
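The first fix concerns a general failure mode of cross-layer k/v sharing: if a layer reads the shared states back from the cache, disabling the cache breaks inference. The toy sketch below (an illustration of the bug class, not the actual transformers code) hands the shared states forward through the current pass so both cache modes produce the same output:

```python
# Toy sketch of cross-layer k/v sharing. Some layers reuse the k/v
# states produced by an earlier layer; the reused states must come from
# the current forward pass, not from the cache, so that
# use_cache=False still works. All names here are illustrative.
def forward(layers, hidden, use_cache=False):
    shared_kv = None            # k/v from the most recent producing layer
    past = [] if use_cache else None
    for layer in layers:
        if layer.get("reuse_kv") and shared_kv is not None:
            k, v = shared_kv    # taken from this pass, not from `past`
        else:
            k, v = hidden * layer["wk"], hidden * layer["wv"]
            shared_kv = (k, v)
        if use_cache:
            past.append((k, v))
        hidden = hidden + 0.1 * v  # stand-in for the attention output
    return hidden, past

# Two layers: the second reuses the first layer's k/v states.
layers = [{"wk": 1.0, "wv": 1.0}, {"reuse_kv": True}]
```

Running `forward(layers, 1.0, use_cache=False)` and `forward(layers, 1.0, use_cache=True)` yields identical hidden states, which is the invariant the fix restores.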
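For the second fix, a conversion mapping rewrites checkpoint weight names into the model's module layout; serialization stays consistent only if the same mapping is applied uniformly. The patterns and target prefixes below are assumptions for illustration, not the actual transformers mappings:

```python
import re

# Hypothetical weight-name conversion mapping for a VLM checkpoint.
# The source patterns and target prefixes are illustrative assumptions.
CONVERSION_MAPPING = {
    r"^vision_tower\.": "model.vision_tower.",
    r"^language_model\.": "model.language_model.",
}

def convert_key(name: str) -> str:
    """Rewrite a checkpoint weight name using the mapping above.

    Applying the mapping consistently on both save and load keeps
    serialized weight names round-trippable; names that match no
    pattern pass through unchanged.
    """
    for pattern, replacement in CONVERSION_MAPPING.items():
        name = re.sub(pattern, replacement, name)
    return name
```

For example, `convert_key("vision_tower.encoder.layer.0.weight")` maps under the `model.` prefix, while an unmatched name like `lm_head.weight` is left untouched.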