
Llama 4 Maverick 17B 128E

flagship
Meta · released 2025-04-01 · text
currently routing · 4.2k rpm
Context: 1M tokens
Input: — / 1M tokens
Output: — / 1M tokens
Speed: — t/s
License: open
ABOUT

Llama 4 Maverick 17B 128E is Meta's Mixture-of-Experts model from the Llama 4 family, featuring 128 expert modules with 17 billion active parameters per token. This MoE architecture enables high-quality outputs with efficient inference, since only a subset of the parameters is activated for each input.

The model delivers strong performance on reasoning, coding, multilingual, and multimodal tasks, a significant capability jump from Llama 3. The 128-expert design lets different experts specialize in different kinds of input, improving quality without a proportional increase in inference cost; the sketch below shows the routing idea.
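To make the routing idea concrete, here is a minimal numpy sketch of top-1 expert routing in a single MoE layer. The layer sizes, gating scheme, and two-layer expert MLPs are illustrative assumptions for exposition, not Meta's actual Llama 4 implementation.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; Llama 4's real dimensions differ.
d_model, n_experts, d_ff = 64, 128, 256

# Router: one linear layer scoring all experts for each token.
W_router = rng.standard_normal((d_model, n_experts)) * 0.02
# Each expert: a small two-layer MLP (hypothetical shapes).
W_in = rng.standard_normal((n_experts, d_model, d_ff)) * 0.02
W_out = rng.standard_normal((n_experts, d_ff, d_model)) * 0.02

def moe_layer(x):
    """Send each token through only its single best-scoring expert."""
    logits = x @ W_router                      # (tokens, n_experts)
    chosen = logits.argmax(axis=-1)            # top-1 expert per token
    gate = np.exp(logits - logits.max(axis=-1, keepdims=True))
    gate = gate / gate.sum(axis=-1, keepdims=True)  # softmax gate weights
    out = np.empty_like(x)
    for i, e in enumerate(chosen):
        h = np.maximum(x[i] @ W_in[e], 0.0)    # expert MLP, ReLU activation
        out[i] = (h @ W_out[e]) * gate[i, e]   # scale by gate probability
    return out

tokens = rng.standard_normal((4, d_model))
print(moe_layer(tokens).shape)                 # (4, 64)

Only 17 billion of the total parameters are active per token because each token touches the router and a single expert rather than all 128; that is the efficiency described above.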

Llama 4 Maverick is designed for developers and enterprises seeking frontier-level model quality with reasonable computational requirements, optimized for production deployment.

BENCHMARKS

Artificial Analysis Index
Intelligence: 18

Providers for Llama 4 Maverick 17B 128E

6 routes · sorted by uptime

ClosedRouter routes requests to the providers best able to handle your prompt size and parameters, with automatic fallbacks to maximize uptime.
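Fallback routing happens on ClosedRouter's side, but the effect is easy to picture from the client. Here is a minimal Python sketch of the same idea: try providers in uptime order and move on when one fails. The endpoint URLs, provider list, and route_with_fallback helper are hypothetical illustrations, not ClosedRouter's actual API.

import requests

# Hypothetical provider endpoints, already sorted by 30-day uptime.
PROVIDERS = [
    "https://provider-a.example/v1/chat/completions",
    "https://provider-b.example/v1/chat/completions",
    "https://provider-c.example/v1/chat/completions",
]

def route_with_fallback(payload, timeout=30.0):
    """POST to each provider in order; return the first success."""
    last_error = None
    for url in PROVIDERS:
        try:
            resp = requests.post(url, json=payload, timeout=timeout)
            resp.raise_for_status()
            return resp.json()        # first healthy provider wins
        except requests.RequestException as err:
            last_error = err          # provider down or erroring: try next
    raise RuntimeError("all providers failed") from last_error

In practice the router also weighs prompt size and sampling parameters when picking the first provider to try, as noted above.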

Provider    Context    Quant    Uptime · 30d
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%