A partnership with OpenAI will let podcasters replicate their voices to automatically create foreign-language versions of their shows.
Ah. So now people can listen to Joe Rogan in the original Russian.
Submitted 1 year ago by stopthatgirl7@kbin.social to technology@lemmy.world
https://www.theverge.com/2023/9/25/23888009/spotify-podcast-translation-voice-replication-open-ai
as it was meant to be heard
That’s just weird… Part of the reason I listen to podcasts is that I just enjoy people talking about things and AI voices still have this uncanny quality to me
A large language model took a 3 second snippet of a voice and extrapolated from that the whole spoken English lexicon from that voice in a way that was indistinguishable from the real person to banking voice verification algorithms.
We are so far beyond what you think of when we say the word AI, because we replaced the underlying thing that it is without most people realizing it. The speed of large language model progress at present is mind-boggling.
These models, when shown fMRI data for a patient, can figure out what image the patient is looking at, and then render it. Patient looks at a picture of a giraffe in a jungle, and the model renders it having never before seen a giraffe… from brain scan data, in real time.
Not good enough? The same fMRI data was examined in real time by a large language model while a patient was watching a short movie and asked to think about what they saw in words. The sentence the person thought was rendered as English sentences by the model, in real time, looking at fMRI data.
That’s a step from reading dreams and that too will happen inside 20 months.
We are very much there.
I don’t think what you’re saying is possible. Voxels used in fMRI measure in millimeters (down to one, if I recall) and don’t allow for such granular analysis. It is possible to ‘see’ what a person sees but the image doesn’t resemble the original too closely.
At least that’s what I learned a few years ago. I’m happy to look at new sources, if you have some though.
Interesting and scary to think ai understands the black box of human neurology more than we understand the black box of ai.
Point taken, well done.
This is beautiful
That’s obviously way better than any TTS before it, but I still wouldn’t want to listen to it for more than a few minutes. In these two sentences I can already hear some of the “AI quirks” and the longer you listen, the more you start to notice them.
I listen to a lot of AI celeb impersonations and they all sound like the same machine with different voice synthesizers. There’s something about the prosody that gives it away, every sentence has the same generic pattern.
Humans are generally more creative, or more monotonous, but AI is in a weird in-between space where it’s never interested and never bored, always soulless.
It won’t take long until that uncanny quality is worked out.
Imho it has already been worked out. There is probably selection bias at play as you don’t even recognize the AI voices that are already there.
Following up on the other comment.
The issue is that widely available speech models are not yet offering the quality that is technically possible. That is probably why you think we’re not there yet. But we are.
Oh, I’m looking forward to just translating a whole audiobook into my native language and any speaking style I like.
Okay, perhaps we would still have difficulties with made up fantasy words or words from foreign languages with little training data.
Mind, this is already possible. It’s just that I don’t have access to this technology. I sincerely hope that there will be no gatekeeping to the training data, such that we can train such models ourselves.
The problem with this is the same problem news websites had when they started switching out their foreign language writers with AI.
Just because you can translate what is literally being said word by word, doesn't mean you're translating the intent of what was being said.
Idioms, phrases, jokes, pleasantries, etc. won't translate into foreign languages no matter how well you can translate the literal words being said.
If you want a good-quality translation, you should get someone who knows the language and the culture to do the translation, so they can translate between the lines.
I honestly think this is a non-issue with the new LLMs coming out. GPT-4 definitely understands idioms.
Honestly, I agree. Machine translation isn’t by necessity limited to “literal” translations anymore.
There’s probably a strong English bias to that currently, but other languages will come with time
Shows with the budget/intent to create good quality translations will have them reviewed/refined by humans before they put it back in the voice of the host, I don’t see why they couldn’t do that.
Shows without the budget or that just don’t care will use full-auto and I’m sure it will indeed suck.
I’m with the person in this thread that pointed out that, with this, instead of translators handling an impossible amount of work, now they can edit the output to match correctly and get more done.
Fighting the tech will fail, as history has shown. Integrating it in a healthy, useful way is what is needed.
Make it stop.
Fuck this whole shit.
What’s your beef with this?
Or is it just that it includes the letters AI that you have an issue?
You could argue that for major languages, where the translations would drive revenue, they should prefer to hire people to do the translations from within the target market - it would create some amount of economic opportunity rather than just being another way for the developed countries to suck up money on services from developing ones in particular.
My beef with this, is that Spotify is relentless with pushing podcasts. I’m not interested in podcasts. I just want them permanently gone from my Spotify for all of eternity, but alas, I can’t get rid of them. When they start pumping out AI generated translations of popular podcasts, I can’t even imagine how hard they’ll push it.
I can choose “Music” and “Podcasts & Shows” on Home page on the mobile app at least, but that changes the feed massively and makes it useless. Spotify is such a trash app already, and I’m just waiting for an alternative that works in my country, but alas…
This is pretty great if the creators get the same cut…
We are one step closer to universal translation.
“Not in my lifetime, by choice.”
That’s going to cause so many lawsuits. I also wonder, since the WGA strike has finally finished and they are creating a contract, whether this will affect it.
Why do you think that? It sounds like it’s a feature that a Podcaster can choose to use if they want to. It doesn’t sound like they are just going to do it to every podcast without permission.
Honestly, as dumb as the AI hype can be, I see this as an actual good use of the tech, but I could be wrong.
Sure, as long as everyone gets paid!
After discovering my first AI covers (specifically Barbie Girl by Johnny Cash) a couple of weeks ago my first thought was “Yep, this is how Star Trek’s universal translator is about to come to pass.”
Didn’t even think about that but that’s a really good point
Thanks to AI, I guess Michael Jackson isn’t dead after all…
This pseudo-AI is a new kind of plastic: sometimes useful, but misused to infest everything. As it rolls on, there will be less and less genuine content in a sea of garbage. Being that scarce, it’d become a luxury.
Technological advance is in hands of those who own the means of production.
Is this good or bad? I can see this being used to steal your voice and use it without your permission.
Assuming that nothing nefarious happens, I can still see this being a problem if the translations aren’t top quality. Imagine that speakers of another language are offended or you’re embarrassed in front of them because something you said was incorrectly translated; then it’s rendered in your voice so it seems you said it.
Handle it just like horror podcasts usually include disclaimers and warnings. Disclaimers before and after the podcast. Disclaimers in the podcast description. Notices in the ToS.
“This podcast has been translated into *your language* with the help of OpenAI. This is an automated service. As such, it may contain transcription and translation errors which may result in dialogue not intended by the original podcaster. Please report errors to *support link here*.”
Be more concerned about this being like what Hollywood just pulled, where Spotify includes a usage clause that gives them the rights to the podcaster’s voice in perpetuity.
And, it doesn't even need to be wrong. Sometimes very innocent things have a specific meaning or connotation in certain languages. Be it innuendos or euphemisms.
Using 3/5 in connection with Black people would mean basically nothing in Germany, but would perk up ears in the USA. On the other hand, 18 and 88 are not that well known in the USA as anything particular, but in Germany you can't easily have them on your license plate, especially if you're from Hamburg (HH).
So you could quite correctly translate things, but they still get a different connotation depending on culture and language.
Perhaps that could be resolved by a disclaimer. Something like, “The following lyrics were generated by an AI and thus may be mistranslated.” It wouldn’t be perfect, but it might help.
Currently it’s an opt-in tool, and I don’t think it is likely OpenAI or Spotify blatantly steal voices. The fact that the tech exists enables that though, a podcast is a perfect training tool for it. But you can’t really uncreate it.
It’s also the sort of thing that unions have been fighting. It improves the technology and makes it an easier sell for any studios or producers to use it elsewhere, like to replace the need to pay a celebrity to come in and record radio station call outs, and long term this specifically takes away jobs from people who translate and dub audio.
IMO it’s good it’s opt-in but ultimately anti-human.
100% chance they already stole voices and sold them to either data harvesting or to sell and train ai models and never passed that money along.
It would help with accessibility, and it might help protect some lesser spoken languages because those people can grow an audience as well.
The tech will develop regardless and people will abuse it for other means, at least this one feels like a positive use as opposed to say, a company making its own podcast series with a stolen voice.
If the creator can choose to generate other languages for their own voice, it feels ok
In the short term, AI is only trained on popular languages like Spanish. It will not help less common ones.
Anyone can copy their voices without permission currently. Seems more like a useful service as long as the terms and conditions don’t include anything about signing your rights away by using it.
Oh sure it has that provision that it becomes property of Spotify and they can use it however they like.
If it does a good job and people get paid fairly then this seems like a great thing to me.
I hate how many ads they push for podcasts and singles on the premium tier. Full screen. IDGAF, I just wanna listen to my music. Bracing for a wave of new duo ads, podcasts about a woman who sat on a fork or some BS like that, and artists I dislike. Now with AI translations :|
You pay for premium and they’re still serving you ads?
Every day I feel better about never having used Spotify.
There is a recommended for you section on the main page, but you can ignore it. They aren’t inserting ads into the listening part.
Nope. I don’t support blatantly public facing AI’s that take creative jobs away from people. I don’t care if it’s opt-in. I don’t care if the podcast creator themselves activates it. Exploiting the technology will only make it normalized, meaning we’ll care less about allowing humans to be creative in the future.
It seems easy to take this position as a native English speaker, but what if you aren’t proficient in English, perhaps only in a smaller regional language that doesn’t have the same nearly infinite pool of content? This is a potential game changer for that, allowing you to listen to thousands of podcasts you never could before. No jobs were lost because there was never anyone doing the translations in the first place. When viewed this way, it’s an accessibility feature.
What?
Ronald would like me to tell you that Seamus told him that Dean was told by Parvati that Hagrid’s looking for you.
I agree but it’s inevitable.
I have a strong feeling the terms of usage for this opt-in will include something along the lines of “we can use your voice for our future projects” and then in a few years they will just create podcasts using podcasters’ voices without their true consent and make a ton off their backs while increasing their competition.
That is of course the danger… as it is it’s pretty benign, allowing more people to consume podcasts in their own language. But the terms need to be clear.
And I am certain the terms will be clear and concise, definitely less than 50 pages and no vague and contradicting statements all over.
“A partnership with OpenAI”. I stopped reading. Probably shouldn’t but god damn.
Sounds like a terrible idea
Why?
Does this mean I can listen to my podcasts in Klingon? Time to get Duolingo out again
I have mixed feelings about this. I have many English podcasts that I would love to recommend to my non-English-speaking friends, so I feel very excited about this idea. But again, I feel the podcasters are being abused in some way with this.
Are the podcasters getting paid for these translated versions? If so, and at the same rate, then I don’t personally see an issue. If not, then yes, it’s exploitation.
I saw nothing in the article about whether the podcasters will be getting any pay or anything of the sort for this, but so long as they’re getting paid for opting in (assuming it’s opt-in) when this becomes available for everyone, I don’t mind this as much. This is a use of AI I can get behind, at least if the podcasters get paid while using it.
Henry Zebrowski in Spanish is going to be something else.
All my years learning English wasted.
/s
Very very very terrible idea.
If the podcast creator consents, what’s the problem? I don’t understand why anti-AI sentiment is so prevalent among some people.
It’s between “it might take away jobs” and “Spotify might use podcasters’ voices without consent”. I’m more on the latter, but that’s as if Spotify would end up being the “only” podcast streaming platform.
Very cool tech that could potentially do a lot of good.
However, we’re talking about AI and big platforms here, so usual questions apply:
Spotify is moving slowly and carefully _for now_,
Is that so? As far as I know, the last few years they’ve been turning formerly open podcasts (you know, using the official podcast standard, xml feeds and all) into Spotify exclusives. So that you can only access them with an account (profiling), and have to listen to ads or pay for premium.
You’re giving them too much credit / good faith, imho.
Haha, I absolutely agree. They’re a platform company, and well… platforms gonna platform.
I’ve just been trying to keep my powder dry when it comes to AI discussions on Lemmy. There are a lot of users on Lemmy that are unconditionally pro-AI, so I don’t wanna make too many assertions beyond my core criticisms.
Good
Mixed feelings myself.
I think this is a GREAT application for AI.
But I worry that the creators will get screwed (monetarily) for the use of this. I could see this coming in a number of forms, including losing the rights to versions of their shows that were translated into languages other than their own.
This will probably work great with comedy podcasts lol
I’m about to hear a lot more of José “hongos mágicos” Rogan outside of the internet now.
I don’t see anything wrong with that as long as it stays opt in only.
just use spotube
cooopsspace@infosec.pub 1 year ago
Honestly, as long as the person whose voice it is gives full permission it’s probably one great use for AI.
That being said, you could just hire people who actually know the language to translate.
argo_yamato@lemm.ee 1 year ago
I am for hiring people who know the language and the target audience. Mainly to avoid something literally translated that either doesn’t make sense or ends up being offensive by accident.
0xD@infosec.pub 1 year ago
You will never ever in any case be able to stop technology from progressing. Instead of fearing the loss of jobs, how about making sure that we can properly handle and integrate AI into our society with everyone benefitting from it?
Stop the defeatist attitude, get politically active and help kick conservatives and fascists into the ditch where they belong.
Vorticity@lemmy.world 1 year ago
As the other person said, we’re not going to be able to avoid this kind of change and I don’t think we should want to. There are more podcasts to translate than can possibly be done without AI.
A better use of translators, in my opinion, is as editors. Listen to the AI result while reading the English transcript to fix the types of problems that you mention.
TvanBuuren@feddit.nl 1 year ago
Just throwing this in here because it reminded me of it.
Image
DogMuffins@discuss.tchncs.de 1 year ago
If it was feasible to do that we would’ve been doing it already.
AI makes it cost-effective to translate audio for an audience of just a few people.
In cases where it has been cost effective to pay a translator in the past I expect it will continue to be so. I’m aware that AI generated audio is pretty good, but translations are often pretty poor.
csolisr@communities.azkware.net 1 year ago
It can be both at the same time - getting a professional voice actor to translate the script, then apply AI magic to have the voices match the original as exactly as possible.
BarrierWithAshes@kbin.social 1 year ago
Yeah this must be opt-in only.
T3rr4T3rr0r@kbin.social 1 year ago
If it isn't, the amount of backlash will be insane.