
Llama 4 Scout 17B 16E

flagship · Meta · released 2025-04-01 · text
currently routing · 4.2k rpm · 1000M tokens

Context: 1000M tokens
Input: — / 1M
Output: — / 1M
Speed: — t/s
License: open
ABOUT

Llama 4 Scout 17B 16E is a Mixture-of-Experts model from Meta's Llama 4 family, using 16 expert modules with 17 billion active parameters. The Scout variant is the more efficient option compared to Maverick, with fewer experts enabling faster inference while maintaining strong quality.

The model supports multimodal inputs and excels at text generation, reasoning, and coding tasks. With a million-token context window, it can process extremely long documents, codebases, and multi-turn conversations. The 16-expert MoE design provides a good balance between quality and computational efficiency.
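To make the 16-expert trade-off concrete, here is a minimal, hypothetical sketch of top-k Mixture-of-Experts routing: a router scores all experts, but only a few actually run per token, which is why active parameters (17B) stay far below the total. The dimensions, weights, and `k=2` below are illustrative assumptions, not Scout's actual configuration.

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws, k=2):
    """Minimal top-k Mixture-of-Experts layer (illustrative sketch only).

    x         : (d,) input vector
    gate_w    : (d, n_experts) router weights
    expert_ws : list of (d, d) expert weight matrices
    Only the top-k experts execute, so compute per token scales with k,
    not with the total expert count -- the idea behind Scout's MoE design.
    """
    logits = x @ gate_w                        # router score for each expert
    top = np.argsort(logits)[-k:]              # indices of the k best experts
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                       # softmax over selected experts
    out = np.zeros_like(x)
    for i, p in zip(top, probs):
        out += p * (x @ expert_ws[i])          # weighted sum of expert outputs
    return out, sorted(top.tolist())

# Toy usage: 16 experts, only 2 active per forward pass
rng = np.random.default_rng(0)
d, n = 8, 16
y, used = moe_forward(rng.normal(size=d),
                      rng.normal(size=(d, n)),
                      [rng.normal(size=(d, d)) for _ in range(n)])
```

Here `used` holds the two expert indices the router selected; the remaining fourteen experts contribute nothing to this token's computation.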

Llama 4 Scout is recommended for most production use cases in the Llama 4 family, offering an excellent quality-to-cost ratio for enterprise and developer applications.

BENCHMARKS · Artificial Analysis Index
Intelligence: 14

Providers for Llama 4 Scout 17B 16E

7 routes · sorted by uptime

ClosedRouter routes requests to the providers best able to handle your prompt size and parameters, with automatic fallbacks to maximize uptime.
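The fallback behaviour described above can be sketched as "try providers in uptime order, move to the next on failure." This is a hypothetical illustration of the idea, assuming a simple uptime-sorted retry loop; it is not ClosedRouter's actual routing implementation, and `send` is a stand-in for a real provider call.

```python
def route_with_fallback(providers, send):
    """Try providers in descending-uptime order, falling back on failure.

    providers : list of (name, uptime) tuples
    send      : callable(name) -> response; raises on provider error
    Hypothetical sketch of uptime-ordered routing with automatic fallback.
    """
    ordered = sorted(providers, key=lambda p: p[1], reverse=True)
    errors = []
    for name, _uptime in ordered:
        try:
            return send(name)                  # first success wins
        except Exception as exc:               # provider failed: fall back
            errors.append((name, exc))
    raise RuntimeError(f"all providers failed: {errors}")

# Usage: the highest-uptime provider errors, so the next one serves the request
calls = []
def fake_send(name):
    calls.append(name)
    if name == "a":
        raise ConnectionError("provider down")
    return f"ok from {name}"

result = route_with_fallback([("b", 0.95), ("a", 0.99)], fake_send)
```

Sorting by uptime first maximizes the chance the initial attempt succeeds, while the loop guarantees a request only fails once every route has been tried.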

Provider    Context    Quant    Uptime · 30d
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%