
Llama 4 Maverick 17B 128E

flagship
Meta · released 2025-04-01 · text
currently routing · 4.2k rpm
Context: 1M tokens
Input: — / 1M tokens
Output: — / 1M tokens
Speed: — t/s
License: open
ABOUT

Llama 4 Maverick 17B 128E is Meta's Mixture-of-Experts model from the Llama 4 family, featuring 128 expert modules with 17 billion active parameters per token. This MoE architecture enables high-quality outputs with efficient inference, since only a subset of the parameters is activated for each input.

The model delivers strong performance on reasoning, coding, multilingual, and multimodal tasks, a significant capability jump from Llama 3. The 128-expert design lets different experts specialize in different kinds of input, improving quality without a proportional increase in inference cost; the sketch below shows the routing idea.
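To make the routing idea concrete, here is a minimal numpy sketch of top-1 expert routing in a single MoE layer. The layer sizes, gating scheme, and two-layer expert MLPs are illustrative assumptions for exposition, not Meta's actual Llama 4 implementation.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; Llama 4's real dimensions differ.
d_model, n_experts, d_ff = 64, 128, 256

# Router: one linear layer scoring all experts for each token.
W_router = rng.standard_normal((d_model, n_experts)) * 0.02
# Each expert: a small two-layer MLP (hypothetical shapes).
W_in = rng.standard_normal((n_experts, d_model, d_ff)) * 0.02
W_out = rng.standard_normal((n_experts, d_ff, d_model)) * 0.02

def moe_layer(x):
    """Send each token through only its single best-scoring expert."""
    logits = x @ W_router                      # (tokens, n_experts)
    chosen = logits.argmax(axis=-1)            # top-1 expert per token
    gate = np.exp(logits - logits.max(axis=-1, keepdims=True))
    gate = gate / gate.sum(axis=-1, keepdims=True)  # softmax gate weights
    out = np.empty_like(x)
    for i, e in enumerate(chosen):
        h = np.maximum(x[i] @ W_in[e], 0.0)    # expert MLP, ReLU activation
        out[i] = (h @ W_out[e]) * gate[i, e]   # scale by gate probability
    return out

tokens = rng.standard_normal((4, d_model))
print(moe_layer(tokens).shape)                 # (4, 64)

Only 17 billion of the total parameters are active per token because each token touches the router and a single expert rather than all 128; that is the efficiency described above.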

Llama 4 Maverick is designed for developers and enterprises seeking frontier-level model quality with reasonable computational requirements, optimized for production deployment.

BENCHMARKS

Artificial Analysis Index
Intelligence: 18

Providers for Llama 4 Maverick 17B 128E

6 routes · sorted by uptime

ClosedRouter routes requests to the providers best able to handle your prompt size and parameters, with automatic fallbacks to maximize uptime.
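Fallback routing happens on ClosedRouter's side, but the effect is easy to picture from the client. Here is a minimal Python sketch of the same idea: try providers in uptime order and move on when one fails. The endpoint URLs, provider list, and route_with_fallback helper are hypothetical illustrations, not ClosedRouter's actual API.

import requests

# Hypothetical provider endpoints, already sorted by 30-day uptime.
PROVIDERS = [
    "https://provider-a.example/v1/chat/completions",
    "https://provider-b.example/v1/chat/completions",
    "https://provider-c.example/v1/chat/completions",
]

def route_with_fallback(payload, timeout=30.0):
    """POST to each provider in order; return the first success."""
    last_error = None
    for url in PROVIDERS:
        try:
            resp = requests.post(url, json=payload, timeout=timeout)
            resp.raise_for_status()
            return resp.json()        # first healthy provider wins
        except requests.RequestException as err:
            last_error = err          # provider down or erroring: try next
    raise RuntimeError("all providers failed") from last_error

In practice the router also weighs prompt size and sampling parameters when picking the first provider to try, as noted above.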

Provider    Context    Quant    Uptime · 30d
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%