Single portrait photo + speech audio = hyper-realistic talking face video with precise lip-audio sync, lifelike facial behavior, and naturalistic head movements, generated in real time.
VASA-1: Lifelike Audio-Driven Talking Faces
Submitted 2 months ago by fart_pickle@lemmy.world to technology@lemmy.world
https://www.microsoft.com/en-us/research/project/vasa-1/