Cohere Rerank v4 Fast

flagship
Cohere · released 2025-06-01 · text
currently routing · 4.2k rpm
Context: 4.1M tokens
Input: — / 1M
Output: — / 1M
Speed: — t/s
License: proprietary
/ ABOUT

Cohere Rerank v4 Fast is a speed-optimized reranking model that scores how relevant each candidate document is to a query, with low latency. Designed for real-time search applications, it offers faster inference than the Pro variant while maintaining good ranking quality for most use cases.

The model is suited to high-throughput search systems, autocomplete suggestions, and any application that requires sub-second reranking. It supports English and other major world languages, with balanced performance across them.

Rerank v4 Fast is recommended for latency-sensitive applications where speed is prioritized over maximum ranking precision.
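
As a rough illustration of where a reranker sits in a search pipeline, the sketch below reorders a short candidate list by a relevance score and keeps the top results. It does not call the model: `rerank`, `token_overlap`, and `top_n` are hypothetical names chosen for this example, and the token-overlap scorer is only a placeholder for the scores a reranking model would return.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    documents: List[str],
    score: Callable[[str, str], float],
    top_n: int = 3,
) -> List[Tuple[int, float]]:
    """Return (original_index, score) pairs for the top_n documents, best first."""
    scored = [(i, score(query, doc)) for i, doc in enumerate(documents)]
    # Python's sort is stable, so tied documents keep their retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def token_overlap(query: str, doc: str) -> float:
    """Placeholder scorer (assumption): share of query tokens found in the document."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


docs = [
    "How to reset a forgotten password",
    "Billing cycles and invoices explained",
    "Reset your password from the login page",
]
results = rerank("reset password", docs, token_overlap, top_n=2)
for index, relevance in results:
    print(index, round(relevance, 2), docs[index])
```

In a real deployment the placeholder scorer would be replaced by a call to the reranking model, typically applied only to the few dozen candidates returned by a cheaper first-stage retriever so that latency stays low.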

Providers for Cohere Rerank v4 Fast

1 route · sorted by uptime

ClosedRouter routes requests to the providers best able to handle your prompt size and parameters, with automatic fallbacks to maximize uptime.
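
The routing behavior described above can be pictured with a minimal sketch, written under stated assumptions: the `Provider` fields, the `route` function, and the rule of trying eligible providers in descending uptime order are all hypothetical and are not taken from ClosedRouter's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Provider:
    name: str
    max_context: int            # largest prompt, in tokens, the provider accepts
    uptime: float               # 30-day uptime as a fraction between 0.0 and 1.0
    send: Callable[[str], str]  # stand-in for the provider's request function


def route(prompt: str, prompt_tokens: int, providers: List[Provider]) -> str:
    """Try providers that fit the prompt, highest uptime first, falling back on errors."""
    eligible = [p for p in providers if p.max_context >= prompt_tokens]
    eligible.sort(key=lambda p: p.uptime, reverse=True)
    errors = []
    for provider in eligible:
        try:
            return provider.send(prompt)
        except Exception as exc:  # any failure triggers a fallback to the next provider
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def failing(prompt: str) -> str:
    raise ConnectionError("down")


def healthy(prompt: str) -> str:
    return "ok:" + prompt


providers = [
    Provider("a", 8000, 0.99, failing),
    Provider("b", 8000, 0.95, healthy),
    Provider("c", 100, 1.00, healthy),  # too small for the prompt below
]
answer = route("hello", 500, providers)
print(answer)
```

Here provider `c` is skipped because the prompt exceeds its context limit, `a` is tried first and fails, and the request falls back to `b`.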

Provider:
Context:
Quant: bf16
Uptime · 30d: 0.00%