Well, if you're asking about the newest AI-powered UTAU voicebanks, that's because the developers finally thought about crossing the streams: instead of just having singers pronounce syllables at several pitches, they used that data (expanded to also cover several syllable clusters) to train an AI. Most AI voice models are trained on samples taken from live performances, so both the quality and the number of data points vary from syllable to syllable; these voicebanks instead have the full set of training data prerecorded by design, so every possible phoneme combination is captured as cleanly as possible.
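To make the difference concrete, here's a toy Python sketch (syllables, pitches, and counts are made up for illustration): a designed voicebank records every (syllable, pitch) pair on purpose, so coverage is complete by construction, while live-performance data only covers whatever the singer happened to sing.

```python
from itertools import product

# Hypothetical toy sets -- real voicebanks use far larger inventories.
syllables = ["ka", "ki", "ku", "ke", "ko"]
pitches = ["C4", "E4", "G4"]

# Designed recording plan: every combination is recorded by design.
designed = set(product(syllables, pitches))

# Live-performance samples: only pairs that happened to occur in songs.
live = {("ka", "C4"), ("ku", "G4"), ("ko", "E4")}

missing = designed - live
print(f"designed coverage: {len(designed)}/{len(designed)} pairs")
print(f"live coverage:     {len(live)}/{len(designed)} pairs "
      f"(missing {len(missing)})")
```

The point is simply that the designed set has no gaps, so the model never has to guess at an unseen syllable-pitch combination.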
Mac@mander.xyz 1 year ago
That’s very interesting. Where can I read more about it?
csolisr@communities.azkware.net 1 year ago
dreamtonics.com/…/synthesizer-v-ai-announcement/