Beyond the copyright and energy issues, AI does serious damage to your ability to do actual hard research. And I'm not just talking about "AI brain."
Let's say you're looking to solve a programming problem. If you use a search engine and look up the question or a string of keywords, what do you usually do? You look through each link that comes up and judge books by their covers (to an extent). "Do these look like reputable sites? Have I heard of any of them before?" You click through a bunch of them and read them. Now you evaluate their contents. "Have I already tried this info? Oh, this answer is from 15 years ago, it might be outdated." Then you pare your links down to a smaller number and try the solution each one provides, one at a time.
Now let's say you use an AI to do the same thing. You pray to the Oracle, and the Oracle responds with a single answer. It's a total soup of its training data. You can't tell where specifically it got any of this info. You just have to trust it on faith. You try it; maybe it works, maybe it doesn't. If it doesn't, you have to write a new prayer and try again.
Even running a local model means you can't discern the source material from the output. This isn't Garbage In, Garbage Out, but Stew In, Soup Out. You can feed an AI a corpus of perfectly useful information, but it will churn everything into a single liquidy mass at the end. And because the process is destructive, you can't un-soup the output. You've robbed yourself of the ability to learn from the input, and put all your faith in the Oracle.
eldebryn@lemmy.world 7 months ago
Out of legit curiosity, how many models do you know trained exclusively on public domain data, which are actually useful?
lime@feddit.nu 7 months ago
anything trained on Common Corpus. which, oddly, is harder to find than the actual training data.
eldebryn@lemmy.world 7 months ago
I mean this respectfully, but that wasn’t an actual answer.
lime@feddit.nu 7 months ago
no, it sort of reinforced your point.