The internet is full of AI-generated text now, which is poison for training models. But it’s good at pretending.
This misconception shows up again and again. It’s wishful thinking from people who want to think AI researchers are idiots and AIs are going to kill themselves.
These models aren’t trained on “the internet”. They don’t just thoughtlessly rip everything that’s ever been posted every time they want to make an updated bot. The vast bulk of training data was scraped years ago, predating the current tide of generative muck, and additions are carefully curated to avoid the exact thing you’re talking about. A scrape of the 2018 internet is plenty, and will remain so for years and years.
FinishingDutch@lemmy.world 10 months ago
The thing that really annoys me is that the people who are most enamoured with ChatGPT also seem to be the ones least capable of judging its accuracy and actual output quality.
I write for a living, at a newspaper. So naturally, some of the people in our company - sales people - wanted to test it. And they were delighted with the stuff it wrote. Which was terrible to read, factually incorrect, repetitive and just not something we’d put in the paper. But they loved it. Because they aren’t writers and don’t know how to write an engaging article with proper sources.
I tested it as well. Wanted to form my own opinion and read up on the limitations, how to write good prompts, etc. So I could give it a fair chance.
I had it write a basic 500-word article about things to see in our city, with information about the tourist info office. That’s something a first-year intern can do in his second week with us.
Basically, it ended up ‘inventing’ two museums that don’t exist, it listed info for a museum on the other side of the country, it listed an ‘Olympic stadium’ (we never hosted the Olympics) and it gave a completely wrong address for the tourist info, even though it should have it.
It was factually incorrect in just about every sentence. But it all sounded plausible enough and was written with such confidence that anyone not from this city might assume it to be true.
I don’t want that fucking thing anywhere NEAR my newspaper. The sales people are pretty much monkeys with ChatGPT typewriters, churning out drivel instead of Shakespeare.
LWD@lemm.ee 10 months ago
Sounds like the Gell-Mann Amnesia Effect. Except instead of a newspaper, you’re reading something not generated by humans.
Like the newspaper, though, I would argue that generative AI is being presented as if it knows everything about everything already, or at least collective inertia implies it does.