Whisper Large V3 Turbo

flagship
OpenAI · released 2024-09-01 · audio → text
currently routing · 4.2k rpm
Context: 1K tokens
Input: — / 1M
Output: — / 1M
Speed: — t/s
License: proprietary
About

Whisper Large V3 Turbo is a speed-optimized variant of Whisper Large V3 that retains near-equivalent transcription accuracy while significantly reducing inference time. The speedup comes from pruning the decoder from 32 layers down to 4 and fine-tuning the smaller model, compressing it without major quality loss.

The Turbo variant is ideal for real-time transcription, high-throughput batch processing, and other latency-sensitive applications. It supports the same multilingual coverage and features as the full Large V3 model, including word-level timestamps and translation into English.

Whisper Large V3 Turbo is recommended for most production transcription workloads where speed matters, offering an excellent balance of quality and latency.
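A minimal sketch of what a transcription request for this model might look like, assuming an OpenAI-compatible audio endpoint, a hypothetical `whisper-large-v3-turbo` model slug, and the optional `openai` Python SDK (check your provider's docs for the exact names):

```python
# Hedged sketch: request parameters for a Whisper Large V3 Turbo
# transcription call. The model slug below is an assumption.
MODEL = "whisper-large-v3-turbo"  # hypothetical routing slug

def build_transcription_params(audio_path: str, want_word_timestamps: bool = True) -> dict:
    """Assemble request parameters for a transcription call."""
    params = {
        "model": MODEL,
        "file": audio_path,
        # word-level timestamps (mentioned above) typically require
        # the verbose JSON response format
        "response_format": "verbose_json" if want_word_timestamps else "json",
    }
    if want_word_timestamps:
        params["timestamp_granularities"] = ["word"]
    return params

def transcribe(audio_path: str):
    """Perform the call with the openai SDK, if installed (untested sketch)."""
    from openai import OpenAI  # optional dependency
    client = OpenAI()  # reads the API key from the environment
    p = build_transcription_params(audio_path)
    with open(audio_path, "rb") as f:
        return client.audio.transcriptions.create(
            model=p["model"],
            file=f,
            response_format=p["response_format"],
            timestamp_granularities=p["timestamp_granularities"],
        )
```

With word timestamps enabled, the verbose JSON response includes per-word start and end times alongside the full transcript text.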

Providers for Whisper Large V3 Turbo

2 routes · sorted by uptime

ClosedRouter routes requests to the providers best able to handle your prompt size and parameters, with automatic fallbacks to maximize uptime.

Provider   Context   Quant   Uptime · 30d
—          —         bf16    0.00%
—          —         bf16    0.00%