NVIDIA

LipSync

flagship
NVIDIA · released 2024-09-01 · text
currently routing · 4.2k rpm
Context: 1K tokens
Input: — / 1M
Output: — / 1M
Speed: — t/s
License: open
/ ABOUT

NVIDIA LipSync is a specialized model for generating realistic lip movements synchronized with audio input. Given an audio track and a face image or video, it produces natural-looking lip animations that match the spoken words, enabling applications in dubbing, virtual avatars, and content creation.

The model handles diverse speaking styles, languages, and emotional expressions, producing lip movements that accurately reflect phoneme sequences while maintaining natural facial dynamics. It preserves the speaker's identity and facial characteristics while modifying only the mouth region.

Typical applications include video localization, virtual presenter creation, game character animation, and accessibility tools for deaf and hard-of-hearing viewers.

Providers for LipSync

1 route · sorted by uptime

ClosedRouter routes requests to the providers best able to handle your prompt size and parameters, with automatic fallbacks to maximize uptime.

Provider: —
Context: —
Quant: bf16
Uptime · 30d: 0.00%
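
The routing behavior described above (pick providers that can handle the prompt size, prefer higher uptime, fall back on failure) can be sketched roughly as follows. This is only an illustrative sketch, not ClosedRouter's actual implementation or API: the `Route` class, its fields, and `route_with_fallback` are hypothetical names, and the selection rule (filter by context size, then sort by 30-day uptime) is an assumption based on the description on this page.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Route:
    """One provider route. All fields are hypothetical, for illustration only."""
    provider: str
    max_context: int            # largest prompt (in tokens) the route accepts
    uptime_30d: float           # fraction of time the route was up over 30 days
    send: Callable[[str], str]  # performs the request; raises on failure


def route_with_fallback(routes: list[Route], prompt: str, prompt_tokens: int) -> tuple[str, str]:
    """Try eligible routes from highest to lowest uptime; return (provider, response)."""
    # Keep only routes whose context window fits the prompt (assumed rule).
    eligible = [r for r in routes if r.max_context >= prompt_tokens]
    # Prefer the route with the best recent uptime.
    eligible.sort(key=lambda r: r.uptime_30d, reverse=True)

    last_error = None
    for route in eligible:
        try:
            return route.provider, route.send(prompt)
        except Exception as exc:  # any failure triggers fallback to the next route
            last_error = exc
    raise RuntimeError("no eligible route succeeded") from last_error


# Usage with stand-in providers (no network calls are made).
def _down(prompt: str) -> str:
    raise ConnectionError("provider unavailable")


def _up(prompt: str) -> str:
    return f"ok:{len(prompt)}"


routes = [
    Route("provider-a", 1000, 0.999, _down),  # best uptime, but failing right now
    Route("provider-b", 1000, 0.95, _up),     # fallback target
    Route("provider-c", 100, 1.0, _up),       # context too small for this prompt
]
provider, response = route_with_fallback(routes, "hello", prompt_tokens=500)
```

In this example `provider-c` is excluded because its context is too small, `provider-a` is tried first and fails, and the request falls back to `provider-b`. With a single listed route, as on this page, there is nothing to fall back to, so a failure on that route surfaces directly as an error.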