
Llama 4 Scout 17B 16E

flagship · Meta · released 2025-04-01 · text
currently routing · 4.2k rpm · 1000M tokens

Context: 1000M tokens
Input: — / 1M
Output: — / 1M
Speed: — t/s
License: open
ABOUT

Llama 4 Scout 17B 16E is a Mixture-of-Experts model from Meta's Llama 4 family, using 16 expert modules with 17 billion active parameters. The Scout variant is the more efficient option compared to Maverick, with fewer experts enabling faster inference while maintaining strong quality.

The model supports multimodal inputs and excels at text generation, reasoning, and coding tasks. With a million-token context window, it can process extremely long documents, codebases, and multi-turn conversations. The 16-expert MoE design provides a good balance between quality and computational efficiency.
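To make the 16-expert trade-off concrete, here is a minimal, hypothetical sketch of top-k Mixture-of-Experts routing: a router scores all experts, but only a few actually run per token, which is why active parameters (17B) stay far below the total. The dimensions, weights, and `k=2` below are illustrative assumptions, not Scout's actual configuration.

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws, k=2):
    """Minimal top-k Mixture-of-Experts layer (illustrative sketch only).

    x         : (d,) input vector
    gate_w    : (d, n_experts) router weights
    expert_ws : list of (d, d) expert weight matrices
    Only the top-k experts execute, so compute per token scales with k,
    not with the total expert count -- the idea behind Scout's MoE design.
    """
    logits = x @ gate_w                        # router score for each expert
    top = np.argsort(logits)[-k:]              # indices of the k best experts
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                       # softmax over selected experts
    out = np.zeros_like(x)
    for i, p in zip(top, probs):
        out += p * (x @ expert_ws[i])          # weighted sum of expert outputs
    return out, sorted(top.tolist())

# Toy usage: 16 experts, only 2 active per forward pass
rng = np.random.default_rng(0)
d, n = 8, 16
y, used = moe_forward(rng.normal(size=d),
                      rng.normal(size=(d, n)),
                      [rng.normal(size=(d, d)) for _ in range(n)])
```

Here `used` holds the two expert indices the router selected; the remaining fourteen experts contribute nothing to this token's computation.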

Llama 4 Scout is recommended for most production use cases in the Llama 4 family, offering an excellent quality-to-cost ratio for enterprise and developer applications.

BENCHMARKS · Artificial Analysis Index
Intelligence: 14

Providers for Llama 4 Scout 17B 16E

7 routes · sorted by uptime

ClosedRouter routes requests to the providers best able to handle your prompt size and parameters, with automatic fallbacks to maximize uptime.
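The fallback behaviour described above can be sketched as "try providers in uptime order, move to the next on failure." This is a hypothetical illustration of the idea, assuming a simple uptime-sorted retry loop; it is not ClosedRouter's actual routing implementation, and `send` is a stand-in for a real provider call.

```python
def route_with_fallback(providers, send):
    """Try providers in descending-uptime order, falling back on failure.

    providers : list of (name, uptime) tuples
    send      : callable(name) -> response; raises on provider error
    Hypothetical sketch of uptime-ordered routing with automatic fallback.
    """
    ordered = sorted(providers, key=lambda p: p[1], reverse=True)
    errors = []
    for name, _uptime in ordered:
        try:
            return send(name)                  # first success wins
        except Exception as exc:               # provider failed: fall back
            errors.append((name, exc))
    raise RuntimeError(f"all providers failed: {errors}")

# Usage: the highest-uptime provider errors, so the next one serves the request
calls = []
def fake_send(name):
    calls.append(name)
    if name == "a":
        raise ConnectionError("provider down")
    return f"ok from {name}"

result = route_with_fallback([("b", 0.95), ("a", 0.99)], fake_send)
```

Sorting by uptime first maximizes the chance the initial attempt succeeds, while the loop guarantees a request only fails once every route has been tried.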

Provider    Context    Quant    Uptime · 30d
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%
—           —          bf16     0.00%