DeepSeek-V3-0324
DeepSeek-V3-0324 is the March 2025 snapshot of DeepSeek's V3 open-weight language model, a Mixture-of-Experts (MoE) architecture with 671B total parameters (37B active per token). It is among the most capable open models available, delivering benchmark performance competitive with leading proprietary models.
The model pairs Multi-head Latent Attention (MLA), which compresses the attention KV cache, with the DeepSeekMoE architecture, which activates only a small subset of experts per forward pass. Together these deliver high-quality outputs at reasonable computational cost. The model excels at coding, mathematics, reasoning, and multilingual tasks.
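To make the sparse-activation idea concrete, here is a minimal, illustrative sketch of top-k expert routing in PyTorch. It is not DeepSeek's implementation (DeepSeekMoE additionally uses fine-grained expert segmentation and shared experts), and all names and sizes are hypothetical:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Toy MoE layer: a router picks k of n_experts per token."""

    def __init__(self, d_model: int, n_experts: int, k: int):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts, bias=False)  # router
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        scores = self.gate(x)                            # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)       # keep k experts/token
        weights = F.softmax(weights, dim=-1)             # normalize kept scores
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                 # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Only k of n_experts run per token, so compute scales with k, not n_experts.
moe = TopKMoE(d_model=64, n_experts=8, k=2)
y = moe(torch.randn(16, 64))
print(y.shape)  # torch.Size([16, 64])
```

This is the same principle, at toy scale, behind activating 37B of 671B parameters: per-token cost tracks the experts actually selected, not the full parameter count.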
DeepSeek-V3-0324 is recommended for developers seeking top-tier open model performance, particularly for applications requiring strong reasoning and coding capabilities.
Providers for DeepSeek-V3-0324
7 routes · sorted by uptime
OpenRouter routes requests to the providers best able to handle your prompt size and parameters, with automatic fallbacks to maximize uptime.
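Because provider selection and fallbacks happen server-side, a client needs only a single OpenAI-compatible request. A minimal sketch in Python, assuming the endpoint and model slug below (verify both against the live docs) and an OPENROUTER_API_KEY environment variable; the `provider.allow_fallbacks` field is taken as an assumption about the routing options:

```python
import os
import requests

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "deepseek/deepseek-chat-v3-0324",
        "messages": [
            {"role": "user", "content": "Write a binary search in Python."}
        ],
        # Optional routing preference (assumed field name): let the router
        # fall back to another provider if the first choice is unavailable.
        "provider": {"allow_fallbacks": True},
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

The key design point is that the fallback logic lives in the router, not the client: the same request keeps working even when an individual provider degrades.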