I made this point recently in a much more verbose form, but I want to restate it briefly here. If you combine the vulnerability this article is talking about with the fact that large AI companies are most certainly stealing all the data they can and ignoring our demands not to do so, the result is clear: we have the opportunity to decisively poison future LLMs created by companies that refuse to follow the law or common decency with regard to privacy and ownership of the things we create with our own hands.
Whether we are talking about social media, personal websites… whatever: if what you are creating is connected to the internet, AI companies will steal it, so take advantage of that and add a little poison in as a thank-you for stealing your labor :)
ceenote@lemmy.world 2 weeks ago
So, like with Godwin’s law, the probability of an LLM being poisoned as it harvests enough data to become useful approaches 1.
Gullible@sh.itjust.works 2 weeks ago
I mean, if they didn’t piss in the pool, they’d have a lower chance of encountering piss. Godwin’s law is more benign and incidental. This is someone maliciously handing out extra Hitlers in a game of Secret Hitler and then feeling shocked at the breakdown in the game.
saltesc@lemmy.world 2 weeks ago
Yeah, but they don’t have the money to introduce quality governance into this, so the brain trust of Reddit it is. Which explains why LLMs have gotten all weirdly socially combative too, as if two neckbeards having at it, Google skill vs Google skill, were a rich source of A+++ knowledge and social behaviour.
UnderpantsWeevil@lemmy.world 2 weeks ago
Hey now, if you hand everyone a “Hitler” card in Secret Hitler, it plays very strangely but in the end everyone wins.
Clent@lemmy.dbzer0.com 2 weeks ago
The problem is the harvesting.
In previous incarnations of this process they used curated data because of hardware limitations.
Now that hardware has improved, they found that if they throw enough random data into it, complex patterns emerge.
The complexity also has a lot of people believing it’s some form of emergent intelligence.
Research shows that either there is no emergent intelligence or these models are incredibly brittle, as in this case. Not to mention they end up spouting nonsense.
These things will remain toys until they get back to purposeful data inputs. But curation is expensive, harvesting is cheap.
julietOscarEcho@sh.itjust.works 2 weeks ago
Isn’t “intelligence” so ill-defined that we can’t prove it either way? All we have is models doing better on benchmarks and everyone shrieking “look, emergent intelligence”.
I disagree a bit on “toys”. Machine summarization and translation are really quite powerful, but yeah, that’s a ways short of the claims that are being made.