Following up on the other comment.
The issue is that widely available speech models are not yet offering the quality that is technically possible. That is probably why you think we’re not there yet. But we are.
Oh, I’m looking forward to just translating a whole audiobook into my native language and any speaking style I like.
Okay, perhaps we would still have difficulties with made-up fantasy words or words from foreign languages with little training data.
Mind you, this is already possible. It’s just that I don’t have access to this technology. I sincerely hope that there will be no gatekeeping of the training data, so that we can train such models ourselves.
danielbln@lemmy.world 1 year ago
Imho it has already been worked out. There is probably selection bias at play, as you don’t even recognize the AI voices that are already there.