Comment on Spotify is going to clone podcasters’ voices — and translate them to other languages

sudoshakes@reddthat.com 1 year ago

The model inferred meaning in much the same way it infers meaning from text. With Stable Diffusion, a short phrase can generate an intricate image that matches what the author intended.

The models in those studies used Stable Diffusion as the image-generation mechanism, but instead of text prompts, they were trained to work from fMRI data.
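To make that concrete, here is a toy sketch of the general idea, not the actual method from those studies. It assumes the swap works by learning a linear map from fMRI features into the same embedding space a text prompt would normally occupy, so the image generator can be conditioned on a scan instead of a phrase. Everything here (`fit_linear_map`, `predict`, `W_TRUE`, the 4-dimensional "scans", the 2-dimensional "embeddings") is made up for illustration, the data is synthetic, and the diffusion model itself is left out.

```python
import random

def predict(W, x):
    """Apply a linear map W (d_out x d_in) to one feature vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def fit_linear_map(X, Y, lr=0.05, epochs=2000):
    """Fit W by plain gradient descent on the mean squared error of predict(W, x) vs y."""
    n, d_in, d_out = len(X), len(X[0]), len(Y[0])
    W = [[0.0] * d_in for _ in range(d_out)]
    for _ in range(epochs):
        grad = [[0.0] * d_in for _ in range(d_out)]
        for x, y in zip(X, Y):
            pred = predict(W, x)
            for j in range(d_out):
                err = pred[j] - y[j]
                for i in range(d_in):
                    grad[j][i] += 2.0 * err * x[i] / n
        for j in range(d_out):
            for i in range(d_in):
                W[j][i] -= lr * grad[j][i]
    return W

random.seed(0)

# Synthetic stand-ins: 50 "scans" with 4 features each (assumption, not real fMRI).
X = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(50)]

# Hypothetical ground-truth relation between scan features and the
# conditioning embedding; real data would be noisy and far higher-dimensional.
W_TRUE = [[0.5, -1.0, 0.0, 2.0],
          [1.5, 0.0, -0.5, 0.3]]
Y = [predict(W_TRUE, x) for x in X]

# Training: learn to turn a scan into a conditioning vector.
W = fit_linear_map(X, Y)

# Inference: a new scan becomes the vector that would be handed to the
# image generator in place of an encoded text prompt.
new_scan = [random.gauss(0.0, 1.0) for _ in range(4)]
conditioning = predict(W, new_scan)
```

The point of the sketch is only that the generator does not care where its conditioning vector came from. If a scan can be mapped into that space, the rest of the pipeline can stay the same.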
