Gemma 4 26B A4B IT

flagship
Google · released 2026-03-11 · text+image+video->text
currently routing · 4.2k rpm
Context: 262K tokens
Input: $0.06 / 1M tokens
Output: $0.33 / 1M tokens
Speed: no data (t/s)
License: open
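
The listed prices can be turned into a rough per-request cost estimate. The following is a minimal sketch, assuming billing is strictly linear in token counts with no minimums, caching discounts, or separate pricing for image and video input; the page states none of this. `estimate_cost_usd` is a hypothetical helper, not part of any provider SDK.

```python
from decimal import Decimal

# Prices as listed on this page, in USD per 1M tokens.
INPUT_USD_PER_MTOK = Decimal("0.06")
OUTPUT_USD_PER_MTOK = Decimal("0.33")
MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: assumes cost is linear in token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_MTOK
        + Decimal(output_tokens) * OUTPUT_USD_PER_MTOK
    )
    return total / MILLION


if __name__ == "__main__":
    # A 100,000-token prompt with a 2,000-token reply.
    print(estimate_cost_usd(100_000, 2_000))
```

`Decimal` is used instead of `float` so that small per-token amounts do not pick up binary rounding error when summed over many requests.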
/ ABOUT

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Of its 25.2B total parameters, only 3.8B are active per token during inference, which is described as delivering near-31B quality at a fraction of the compute cost. It accepts text, image, and video input (video up to 60 s at 1 fps) and produces text. It offers a 256K-token context window (262,144 tokens, listed above as 262K), native function calling, a configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0.

BENCHMARKS Artificial Analysis Index
Intelligence 31.2
Coding 22.4
Agentic 32.1

Providers for Gemma 4 26B A4B IT

3 routes · sorted by uptime

ClosedRouter routes requests to the providers best able to handle your prompt size and parameters, with automatic fallbacks to maximize uptime.
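
The page does not describe how routing is implemented. As a generic illustration of "filter by prompt size, then fall back on failure", here is a minimal sketch, assuming that routes are tried in descending order of uptime and that any exception from a provider triggers the next attempt. `Route`, `route_with_fallback`, and the `send` callable are all hypothetical names, not ClosedRouter APIs.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Route:
    name: str                    # hypothetical provider label
    context_tokens: int          # largest prompt the route accepts
    uptime_30d: float            # fraction in [0, 1]
    send: Callable[[str], str]   # stand-in for the real provider call


def route_with_fallback(
    routes: Sequence[Route], prompt: str, prompt_tokens: int
) -> tuple[str, str]:
    """Return (route name, response) from the first route that succeeds."""
    # Keep only routes whose context window can hold the prompt.
    eligible = [r for r in routes if r.context_tokens >= prompt_tokens]
    if not eligible:
        raise ValueError("no route can fit the prompt")

    errors = []
    # Assumption: prefer higher uptime, fall back in order on any failure.
    for route in sorted(eligible, key=lambda r: r.uptime_30d, reverse=True):
        try:
            return route.name, route.send(prompt)
        except Exception as exc:
            errors.append(f"{route.name}: {exc}")
    raise RuntimeError("all routes failed: " + "; ".join(errors))
```

A real router would likely also weigh price, latency, and per-request parameters; this sketch covers only the size filter and the fallback loop.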

Provider | Context | Quant | Uptime · 30d
(name missing) | 262K | bf16 | 0.00%
(name missing) | 262K | bf16 | 0.00%
(name missing) | 262K | bf16 | 0.00%