
Gemini 3.1 Flash-Lite

flagship
Google · released 2026-01-01 · text
currently routing · 4.2k rpm
Context: 128M tokens
Input: — / 1M
Output: — / 1M
Speed: — t/s
License: proprietary
/ ABOUT

Gemini 3.1 Flash-Lite is the ultra-lightweight variant of Google's Gemini 3.1 multimodal model family, optimized for minimum latency and cost. It handles text and multimodal processing at a fraction of the cost of the Flash and Pro models, which makes it a good fit for high-volume, cost-sensitive applications.

Despite its efficiency focus, Flash-Lite handles a range of tasks including text generation, summarization, classification, and basic visual understanding. It supports multiple input modalities and maintains reasonable quality on standard benchmarks.

Gemini 3.1 Flash-Lite is best suited to applications such as real-time chat, content moderation, bulk classification, and edge deployment, where ultra-low latency and minimal cost are essential.

BENCHMARKS
Artificial Analysis Index · Intelligence: 33.5

Providers for Gemini 3.1 Flash-Lite

1 route · sorted by uptime

ClosedRouter routes requests to the providers best able to handle your prompt size and parameters, with automatic fallbacks to maximize uptime.
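The routing behavior described above can be illustrated with a minimal sketch. This is not ClosedRouter's implementation, which the page does not document. The `Provider` fields, the `eligible` and `route` functions, and the `send` callback are all hypothetical names chosen for illustration. The sketch assumes that "best able to handle your prompt size and parameters" means filtering on context length and supported request parameters, and that fallback means trying the next candidate in uptime order.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    # All fields are hypothetical; the page does not describe a provider schema.
    name: str
    max_context: int            # largest prompt, in tokens, the provider accepts
    uptime_30d: float           # share of successful requests, 0.0 to 1.0
    supported_params: frozenset = frozenset()


def eligible(providers, prompt_tokens, params):
    """Keep providers that fit the prompt and support every requested
    parameter, ordered from highest to lowest 30-day uptime."""
    wanted = set(params)
    fits = [
        p for p in providers
        if p.max_context >= prompt_tokens and wanted <= p.supported_params
    ]
    return sorted(fits, key=lambda p: p.uptime_30d, reverse=True)


def route(providers, prompt_tokens, params, send):
    """Try each eligible provider in order and fall back to the next one
    when `send` raises. `send` is a caller-supplied function that performs
    the actual request against a single provider."""
    errors = []
    for provider in eligible(providers, prompt_tokens, params):
        try:
            return provider.name, send(provider)
        except Exception as exc:  # any failure triggers the fallback
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("no provider could serve the request: " + "; ".join(errors))
```

With a single route, as listed on this page, the fallback loop has only one candidate, so a failure at that provider surfaces directly as an error.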

Columns: Provider, Context, Quant, Uptime · 30d
Listed route: Quant bf16, Uptime (30d) 0.00%