More like… Degenerative AI *ba dum tsss
“Model collapse” threatens to kill progress on generative AIs
Submitted 1 month ago by Stern@lemmy.world to technology@lemmy.world
https://bigthink.com/the-future/ai-model-collapse/
Comments
EgoNo4@lemmy.world 1 month ago
merde@sh.itjust.works 1 month ago
deGenerative AI ☞ !degenerate@lemmynsfw.com
EgoNo4@lemmy.world 1 month ago
No idea this existed.
Also… JFC WHAT THE SHIT?
CarbonatedPastaSauce@lemmy.world 1 month ago
Model collapse is just a euphemism for “we ran out of stuff to steal”
Snowclone@lemmy.world 1 month ago
It’s more “we are so focused on stealing and eating content, we’re accidentally eating the content we or other AI made,” which is basically like incest for AI, and they’re all inbred to the point they don’t even know people have more than two thumb-shaped fingers anymore.
rottingleaf@lemmy.world 1 month ago
News like this makes me want to live to see the time when our world is interesting again. Real AI research, something new instead of the Web we have, something new instead of the governments we have. It’s just that I’m scared of what’s between now and then. Parasites die hard.
jimmy90@lemmy.world 1 month ago
or “we’ve hit a limit on what our new toy can do and here’s our excuse why it won’t get any better and AGI will never happen”
RmDebArc_5@sh.itjust.works 1 month ago
This sounds like AI is literally biting its own tail
AbidanYre@lemmy.world 1 month ago
ChatGPT, what is an ouroboros?
homesweethomeMrL@lemmy.world 1 month ago
Of course! An ChatGPT is an ouroboros, ChatGPT what is an ouroboros.
casmael@lemm.ee 1 month ago
…………………. Good?
emiellr@lemm.ee 1 month ago
Tbh I’m a bit lost on the purpose of this
aggelalex@lemmy.world 1 month ago
So AI:
- Scraped the entire internet without consent
- Trained on it
- Polluted it with AI generated rubbish
- Trained on that rubbish without consent
- Are now in need of lobotomy
rickdg@lemmy.world 1 month ago
Old news? Seems to be a subject of several papers for some time now. Synthetic data has been used successfully already for very specific domains.
SomeGuy69@lemmy.world 1 month ago
Yup, old news and wrong news. Also so many people who hate AI but don’t understand how it works. Pretty disappointing for a technology community.
thejml@lemm.ee 1 month ago
Ah, the Hapsburg of AI!
Telorand@reddthat.com 1 month ago
Oh, the artificial humanity!
PapaStevesy@lemmy.world 1 month ago
I like to think of it like Mad Cow or Kuru: you can’t eat your own species’s brains or you could get a super lethal, contagious prion disease.
CarbonatedPastaSauce@lemmy.world 1 month ago
Prion diseases aren’t contagious.
ZILtoid1991@lemmy.world 1 month ago
If only the generated output also looked more and more like how inbred humans do.
Like insane rambling from LLMs, and the humans generated by AI had various developmental disorders and the Habsburg jaw.
gravitas_deficiency@sh.itjust.works 1 month ago
Uh, good.
As an engineer who cares a LOT about engineering ethics, it is absolutely fucking infuriating watching the absolute firehose of shit that comes out of LLMs and public-consumption ML audio, image, and video systems, juxtaposed with the outright refusal of companies and engineers who work there to accept ANY accountability or culpability for the systems THEY FUCKING MADE.
I understand the nuances of NNs. I understand that they’re much more stochastic than deterministic. So, you know, maybe it wasn’t a great idea to just tell the general public (which runs a WIDE gamut of intelligence and comprehension ability - not to mention, morality) “have at it”. The fact that ML usage and deployment in terms of information generating/kinda-sorta-but-not-really-aggregating “AI oracles” isn’t regulated on the same level as what you’d see in biotech or aerospace is insane to me. It’s a refusal to admit that these systems fundamentally change the entire premise of how “free speech” is generated, and that bad actors (either unrepentantly profit driven, or outright malicious) can and are taking disproportionate advantage of these systems.
I get it - I am a staunch opponent of censorship, and a software engineer. But the flippant deployment of literally society-altering technology alongside the outright refusal to accept any responsibility, accountability, or culpability for what that technology does to our society is unconscionable and infuriating to me. I am aware of the potential that ML has - it’s absolutely enormous, and could absolutely change a HUGE number of fields for the better in incredible ways. But that’s not what it’s being used for, and it’s because the field is essentially unregulated right now.
ohellidk@sh.itjust.works 1 month ago
Cool, let’s try to ruin it faster!
draughtcyclist@lemmy.world 1 month ago
I’ve been assuming this was going to happen since it’s been haphazardly implemented across the web. Are people just now realizing it?
DeathbringerThoctar@lemmy.world 1 month ago
People are just now acknowledging it. Execs tend to have a disdain for the minutiae. They’re like kids that only want to do the exciting bits. As a result things get fucked because they don’t really understand what they’re doing. As Muskrat would say “move fast and break things.” It’s a terrible mindset.
pixxelkick@lemmy.world 1 month ago
“Move Fast and Break Things” is Zuckerberg’s/Facebook’s motto, not Musk’s, just to note.
pyre@lemmy.world 1 month ago
oh no are we gonna have to appreciate the art of human beings? ew. what if they want compensation‽
Lettuceeatlettuce@lemmy.ml 1 month ago
Good.
Hugin@lemmy.world 1 month ago
The solution for this is usually counter training. Granted, my experience is on the opposite end: training AI vision systems to ID real objects.
So you train up your detector AI on hand-tagged images. When it gets good, you use it to train a generator AI until the generator is good at fooling the detector.
Then you train the detector on new tagged real data and the new AI-generated data. Once it’s good at detection again, you train the generator AI on the new detector.
Repeat several times and you usually get a solid detector, and a good generator as a side effect.
The thing is you need new real human tagged data for each new generation. None of the companies want to generate new human tagged data sets as it’s expensive.
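A rough toy sketch of that alternating loop, in Python. Everything here is assumed for illustration and is not from any real vision pipeline: the 1-D Gaussian "data", the nearest-centroid "detector", the one-parameter "generator", and all the function names are hypothetical. The only point it tries to show is the shape of the loop, including the fresh batch of real tagged data that each round needs.

```python
import random
from statistics import mean

REAL_MEAN = 5.0  # assumed "true" data distribution, chosen only for this toy


def fresh_real_batch(rng, n=200):
    """Stand-in for a new hand-tagged batch of real data (the expensive part)."""
    return [rng.gauss(REAL_MEAN, 1.0) for _ in range(n)]


def generate(rng, mu, n=200):
    """Toy generator: its only learnable parameter is the mean mu."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]


def train_detector(real, fake):
    """Toy detector: remember the centroid of each labeled class."""
    return mean(real), mean(fake)


def looks_real(detector, x):
    """Classify x as real if it is at least as close to the real centroid."""
    real_c, fake_c = detector
    return abs(x - real_c) <= abs(x - fake_c)


def train_generator(rng, mu, detector, lr=0.2, max_steps=200):
    """Nudge mu toward samples the detector accepts, until half of them fool it."""
    for _ in range(max_steps):
        batch = generate(rng, mu)
        fooled = [x for x in batch if looks_real(detector, x)]
        if 2 * len(fooled) >= len(batch):
            break  # generator is now "good at fooling the detector"
        if fooled:
            mu += lr * (mean(fooled) - mu)
    return mu


def counter_train(rounds=8, seed=0):
    """Alternate detector and generator training; return mu after each round."""
    rng = random.Random(seed)
    mu = 0.0  # generator starts far from the real data
    history = [mu]
    for _ in range(rounds):
        real = fresh_real_batch(rng)           # new tagged real data every round
        fake = generate(rng, mu)               # current generator output
        detector = train_detector(real, fake)  # retrain detector on both
        mu = train_generator(rng, mu, detector)
        history.append(mu)
    return history


if __name__ == "__main__":
    for round_no, mu in enumerate(counter_train()):
        print(round_no, round(mu, 3))
```

In this toy the generator drifts toward the real data only because `fresh_real_batch` keeps supplying new real samples for the detector to anchor on, which is the costly step.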
TheReturnOfPEB@reddthat.com 1 month ago
have we tried feeding them actual human beings yet ?
n3m37h@sh.itjust.works 1 month ago
Billionaires are the smartest, give them the most knowledge first!
SkyNTP@lemmy.ml 1 month ago
I think anyone familiar with the laws of thermodynamics could have predicted this outcome.
mint_tamas@lemmy.world 1 month ago
Explain?
Draconic_NEO@lemmy.world 1 month ago
Second law of thermodynamics:
II. The total entropy of an isolated system never decreases over time. Entropy can never be negative.
NotInTheFace@lemmy.world 1 month ago
Looks like that artist drawing self-portraits as his Alzheimer’s got worse and worse.
NocturnalMorning@lemmy.world 1 month ago
It’s basically AI Alzheimer’s
homesweethomeMrL@lemmy.world 1 month ago
AIzheimers?
levzzz@lemmy.world 1 month ago
Fake news, just like that one time Nightshade “killed” Stable Diffusion (literally had no effect). Flux came out not long ago and it’s better than ever.
Sabata11792@ani.social 1 month ago
At this point the synthetic data is good enough to intentionally be used for training LLMs.
Honytawk@lemmy.zip 1 month ago
Yeah, just filter out the bad generated images and feed the good ones again, until the model learns how to produce only good ones.
PrivacyDingus@lemmy.world 1 month ago
this headline truly is threatening me with a good time
pastermil@sh.itjust.works 1 month ago
More like degenerative AIs
nullPointer@programming.dev 1 month ago
when all your information conflicts with itself, you really have no information at all.
tee9000@lemmy.world 1 month ago
Kind of like how true thoughts and opinions on complex topics get boiled down to digestible concepts for others to understand, who then perpetuate those concepts without understanding them, and the meaning degrades and we don’t think anymore, just repeat stuff in social media comments.
Side note… this article sucks and seems like it was AI-generated. Repetitive and no author credit? Just says it was originally posted elsewhere.
Generative AI isn’t in danger of being killed as this clickbait title suggests… just hindered.
FarFarAway@lemmy.world 1 month ago
There’s a link to the other article in this article. Says Kristin Houser wrote it… although you may have a point about the rest.
tee9000@lemmy.world 1 month ago
ty
General_Effort@lemmy.world 1 month ago
hindered.
I doubt that.
NutWrench@lemmy.world 1 month ago
Anyone who has made copies of videotapes knows what happens to the quality of each successive copy. You’re not making a “treasure trove.” You’re making trash.
BrightCandle@lemmy.world 1 month ago
Having now flooded the internet with bad AI content, not surprisingly it’s now eating itself. Numerous projects that aren’t AI are suffering too as the quality of text declines.
Jomega@lemmy.world 1 month ago
Good riddance.
todd_bonzalez@lemm.ee 1 month ago
Our wetware neural networks probably aren’t supposed to engage with synthetic content like this either. In a few years we’re gonna learn that overexposure to AI generated content creates some sort of neurological problem in people, like a real-world “nerve attenuation syndrome” (Johnny Mnemonic).
TheHarpyEagle@pawb.social 1 month ago
I’ve read some snippets of AI written books and it really does feel like my brain is short circuiting
njordomir@lemmy.world 1 month ago
It’s like a human centipede where only the first person is a human and everyone else is an AI. It’s all shit, but it gets a bit worse every step.
celsiustimeline@lemmy.dbzer0.com 1 month ago
If mainstream blogs are writing about it, what makes you think that AI companies haven’t thoroughly dissected the problem and are already working on filtering out AI fingerprints from the training data set? If they can make a sophisticated LLM, chances are they can find methods to XOR out generated content.
mac@lemm.ee 1 month ago
is it not relatively trivial to pre-vet content before they train on it? at least with AI-generated text it should be.
RvTV95XBeo@sh.itjust.works 1 month ago
The problem is these AI companies currently exist on the business model of not paying for information, and that generally includes not wanting to pay content curators.
Google is probably the only one in a position to potentially outsource by making everyone solve a “does this hand look normal to you” CAPTCHA
They can try and train AI to detect AI, but that’s also difficult.
FMT99@lemmy.world 1 month ago
So it’s not a problem with AI. It’s just a problem for some mayfly companies that try to profit from the latest trend?
General_Effort@lemmy.world 1 month ago
It depends on what you are looking for. Identifying AI generated data is generally hard, though it can be done in specific cases. There is no mathematical difference between the 1s and 0s that encode AI generated data and any other data. Which is why these model collapse ideas are just fantasy. There is nothing magical about any data that makes it “poisonous” to AI. The kernel of truth behind these ideas is not likely to matter in practice.
barsquid@lemmy.world 1 month ago
It is their own fault for poisoning the internet with their slop.
db2@lemmy.world 1 month ago
In case anyone doesn’t get what’s happening, imagine feeding an animal nothing but its own shit.
Stern@lemmy.world 1 month ago
I use the “Sistermother and me are gonna have a baby!” example personally, but I am an awful human so
BassTurd@lemmy.world 1 month ago
Not shit, but isn’t that what brought about mad cow disease? Farmers were feeding cattle brain matter that had infected prions. Idk if it was cows eating cow brains or other animals though.
leftzero@lemmynsfw.com 1 month ago
Photocopy of a photocopy is my go-to metaphor for model collapse.
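For anyone who wants to see the photocopy effect in numbers, here is a minimal Python sketch. It is a toy under loud assumptions: the "model" is just a Gaussian fitted to its training data, each generation trains only on the previous generation's output, and the function name and parameters are made up for illustration. It says nothing about how real image or language models behave, especially when fresh human data is mixed in.

```python
import random
from statistics import mean, pstdev


def photocopy_chain(generations=100, n=10, seed=0):
    """Refit a Gaussian to its own samples repeatedly; return the spread per generation."""
    rng = random.Random(seed)
    # Generation 0: "real" data drawn from a standard normal distribution.
    data = [rng.gauss(0.0, 1.0) for _ in range(n)]
    spreads = [pstdev(data)]
    for _ in range(generations):
        # "Train" on the current data: estimate mean and standard deviation.
        mu, sigma = mean(data), pstdev(data)
        # Next generation sees only the fitted model's output, never the original.
        data = [rng.gauss(mu, sigma) for _ in range(n)]
        spreads.append(pstdev(data))
    return spreads


if __name__ == "__main__":
    spreads = photocopy_chain()
    for gen in (0, 10, 50, 100):
        print(gen, spreads[gen])
```

With a small sample per generation the estimated spread tends to shrink from one copy to the next, so the later generations cover less and less of the original variety.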
Cock_Inspecting_Asexual@lemmy.world 1 month ago
DUDE ITS SO FUCKING ANNOYING TRYNNA USE GOOGLE IMAGES ANYMORE–
ALL IT GIVES ME IS AI ART. IM SO FUCKING SICK AND TIRED OF IT.