T5Gemma
A collection of encoder-decoder models that provide a strong quality-inference efficiency tradeoff
T5Gemma adapts pretrained decoder-only Gemma 2 models into an encoder-decoder architecture. These models are trained with either PrefixLM for strong generative performance or UL2 for high-quality contextual representations.
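The difference between the original decoder-only pretraining and the PrefixLM objective used in adaptation comes down to the attention pattern: in PrefixLM, the prefix (input) tokens attend to each other bidirectionally, as in an encoder, while the remaining (target) tokens attend causally. A toy sketch of the two masks, with illustrative function names not taken from any T5Gemma codebase:

```python
def causal_mask(n):
    # Decoder-only (Gemma 2 pretraining): token i attends
    # only to positions j <= i.
    return [[j <= i for j in range(n)] for i in range(n)]

def prefix_lm_mask(n, prefix_len):
    # PrefixLM: the first prefix_len tokens attend to each other
    # bidirectionally (encoder-like); later tokens attend causally.
    return [[j < prefix_len or j <= i for j in range(n)]
            for i in range(n)]

# With a prefix of 2, token 0 can already "see" token 1,
# which a causal mask would forbid.
print(causal_mask(3))
print(prefix_lm_mask(3, 2))
```

This bidirectional view of the input is what the dedicated encoder makes explicit in the adapted encoder-decoder architecture.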
Capabilities
- Enhanced reasoning: A dedicated encoder significantly boosts performance on tasks requiring deep context comprehension, such as math reasoning (GSM8K).
- Flexible architecture: Model adaptation techniques allow flexible configurations, including "unbalanced" models where the encoder and decoder have different sizes.
- High efficiency: A superior quality-to-efficiency ratio without extensive compute requirements.
Model variants
- Gemma 2 sizes: Checkpoints based on the official Gemma 2 2B and 9B models, as well as the "unbalanced" 9B-2B checkpoint.
- T5 sizes: Small, Base, Large, and XL sizes following the T5 configuration, plus an additional model sized between T5 Large and T5 XL.