LipSync

flagship

NVIDIA · released 2024-09-01 · text

currently routing · 4.2k rpm

Context: 1K tokens

Input: — / 1M

Output: — / 1M

Speed: — t/s

License: open

/ ABOUT

NVIDIA LipSync is a specialized model for generating realistic lip movements synchronized with audio input. Given an audio track and a face image or video, it produces natural-looking lip animations that match the spoken words, enabling applications in dubbing, virtual avatars, and content creation.

The model handles diverse speaking styles, languages, and emotional expressions, producing lip movements that accurately reflect phoneme sequences while maintaining natural facial dynamics. It preserves the speaker's identity and facial characteristics while modifying only the mouth region.

LipSync is valuable for video localization, virtual presenter creation, game character animation, and accessibility tools for hearing-impaired viewers.

Providers for LipSync

1 route · sorted by uptime

ClosedRouter routes requests to the providers best able to handle your prompt size and parameters, with automatic fallbacks to maximize uptime.

Provider      Context   Quant   Uptime · 30d
NVIDIA NIM    —         bf16    0.00%
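The routing behavior described above (pick providers that can handle the prompt size, prefer higher uptime, fall back on failure) can be illustrated with a minimal sketch. This is not ClosedRouter's actual implementation: the `Provider` class, the `order_candidates` and `route` functions, and the `send` callback are all hypothetical names introduced for illustration, and treating a missing context value ("—") as "no known limit" is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class Provider:
    # Hypothetical record mirroring the columns of the provider table.
    name: str
    max_context: Optional[int]  # tokens; None stands for "—" (assumed: no known limit)
    uptime_30d: float           # percent over the last 30 days


def order_candidates(providers: List[Provider], prompt_tokens: int) -> List[Provider]:
    """Keep providers whose context fits the prompt, highest uptime first."""
    fits = [
        p for p in providers
        if p.max_context is None or p.max_context >= prompt_tokens
    ]
    return sorted(fits, key=lambda p: p.uptime_30d, reverse=True)


def route(
    providers: List[Provider],
    prompt_tokens: int,
    send: Callable[[Provider], Any],
) -> Tuple[str, Any]:
    """Try each candidate in order; fall back to the next one on any error."""
    errors = []
    for provider in order_candidates(providers, prompt_tokens):
        try:
            return provider.name, send(provider)
        except Exception as exc:  # assumption: any provider error triggers fallback
            errors.append((provider.name, repr(exc)))
    raise RuntimeError(f"all providers failed: {errors}")
```

With a single route, as listed for LipSync, there is nothing to fall back to: if the one provider fails, `route` raises after the first attempt.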