Opportunity? More like responsibility.
Comment on It Only Takes A Handful Of Samples To Poison Any Size LLM, Anthropic Finds
supersquirrel@sopuli.xyz 2 days ago
I made this point recently in a much more verbose form, but I want to reflect it briefly here, if you combine the vulnerability this article is talking about with the fact that large AI companies are most certainly stealing all the data they can and ignoring our demands to not do so the result is clear we have the opportunity to decisively poison future LLMs created by companies that refuse to follow the law or common decency with regards to privacy and ownership over the things we create with our own hands.
Whether we are talking about social media, personal websites… whatever if what you are creating is connected to the internet AI companies will steal it, so take advantage of that and add a little poison in as thank you for stealing your labor :)
ProfessorProteus@lemmy.world 2 days ago
benignintervention@piefed.social 2 days ago
I’m convinced they’ll do it to themselves, especially as more books are made with AI, more articles, more reddit bots, etc. Their tool will poison its own well.
Tollana1234567@lemmy.today 1 day ago
dont they kinda poison themselves, when they scrape AI generated content too.
Cherry@piefed.social 1 day ago
How? Is there a guide on how we can help 🤣
calcopiritus@lemmy.world 1 day ago
One of the techniques I’ve seen it’s like a “password”. So for example if you write a lot the phrase “aunt bridge sold the orangutan potatoes” and then a bunch of nonsense after that, then you’re likely the only source of that phrase. So it learns that after that phrase, it has to write nonsense.
I don’t see how this would be very useful, since then it wouldn’t say the phrase in the first place, so the poison wouldn’t be triggered.
thethunderwolf@lemmy.dbzer0.com 1 day ago
So you weed to boar a plate and flip the “Excuses” switch
Grimy@lemmy.world 2 days ago
That being said, sabotaging all future endeavors would likely just result in a soft monopoly for the current players, who are already in a position to cherry pick what they add. I wouldn’t be surprised if certain companies are already poisoning the well to stop their competitors tbh.
supersquirrel@sopuli.xyz 2 days ago
In the realm of LLMs sabotage is multilayered, multidimensional and not something that can easily be identified quickly in a dataset. There will be no easy place to draw some line of “data is contaminated after this point and only established AIs are now trustable” as every dataset is going to require continual updating to stay relevant.
I am not suggesting we need to sabotage all future endeavors for creating valid datasets for LLMs, I am saying sabotage the ones that are stealing and using things you have made and written without your consent.
Grimy@lemmy.world 2 days ago
I just think the big players aren’t touching personal blogs and social media anymore and only use specific vetted sources, or have other strategies in place to counter it. Anthropic is the one that told everyone how to do it, I can’t imagine them doing it if it could affect them.
supersquirrel@sopuli.xyz 2 days ago
Sure, but personal blogs and social media are where all the actual valuable information and human interaction happens despite the awful reputation of both, traditional news media and associated websites have never been less trustable or useless despite the large role they still play.
If companies fail to integrate the actual valuable parts to the internet, the product they create will fail to be valuable past a certain point shrugs.
korendian@lemmy.zip 2 days ago
Not sure if the article covers it, but hypothetically, if one wanted to poison an LLM, how would one go about doing so?
expatriado@lemmy.world 2 days ago
it is as simple as adding a cup of sugar to the gasoline tank of your car, the extra calories will increase horsepower by 15%
Beacon@fedia.io 2 days ago
I can verify personally that that's true. I put sugar in my gas tank and i was amazed how much better my car ran!
setsubyou@lemmy.world 2 days ago
Since sugar is bad for you, I used organic maple syrup instead and it works just as well
demizerone@lemmy.world 1 day ago
I give sugar to my car on its birthday for being a good car.
Scrollone@feddit.it 2 days ago
Also, flour is the best way to put out a fire in your kitchen.
Tollana1234567@lemmy.today 1 day ago
make sure to blow on it like xena does with a fire.
_cryptagion@anarchist.nexus 2 days ago
you’re more likely to confuse a real person with this than a LLM.
Peppycito@sh.itjust.works 1 day ago
Welcome to post-truth.
crank0271@lemmy.world 2 days ago
This is the right answer here
Fmstrat@lemmy.world 1 day ago
The right sugar is the question to the poisoning answer.
CheeseNoodle@lemmy.world 1 day ago
This is the frog answer over there.
thethunderwolf@lemmy.dbzer0.com 1 day ago
And if it doesn’t ignite after this, try also adding 1.5 oz of a 50/50 mix between bleach and beer.
PrivateNoob@sopuli.xyz 2 days ago
There are poisoning scripts for images, where some random pixels have totally nonsensical / erratic colors, which we won’t really notice at all, however this would wreck the LLM into shambles.
turdas@suppo.fi 2 days ago
The I in LLM stands for “image”.
PrivateNoob@sopuli.xyz 2 days ago
Fair enough on the technicality issues, but you get my point. I think just some art poisoing could maybe help decrease the image generation quality if the data scientist dudes do not figure out a way to preemptively filter out the poisoned images (which seem possible to accomplish ig) before training CNN, Transformer or other types of image gen AI models.
partofthevoice@lemmy.zip 1 day ago
Replace all upper case I with a lower case L and vis-versa. Fill randomly with zero-width text everywhere. Use white text instead of line break (make it weird prompts, too).
killingspark@feddit.org 1 day ago
Somewhere an accessibility developer is crying in a corner because of what you just typed
dragonfly4933@lemmy.dbzer0.com 1 day ago
An issue I see with a lot of scripts which attempt to automate the generation of garbage is that it would be easy to identify and block. Whereas if the poison looks similar to real content, it is much harder to detect.
It might also be possible to generate adversarial text which causes problems for models when used in a training dataset. It could be possible to convert a given text by changing the order of words and the choice of words in such a way that a human doesn’t notice, but it causes problems for the llm. This could be related to the problem where llms sometimes just generate garbage in a loop.
Frontier models don’t appear to generate garbage in a loop anymore (i haven’t noticed it lately), but I don’t know how they fix it. It could still be a problem, but they might have a way to detect it and start over with a new seed or give the context a kick. In this case, poisoning actually just increases the cost of inference.
PrivateNoob@sopuli.xyz 18 hours ago
This sounds good, however the first step should be a 100% working solution without any false positives, because that would mean the reader would wipe their whole system down in this example.
onehundredsixtynine@sh.itjust.works 1 day ago
Link?
PrivateNoob@sopuli.xyz 18 hours ago
Apparently there are 2 popular scripts. Glaze: glaze.cs.uchicago.edu/downloads.html Nightshade: nightshade.cs.uchicago.edu/downloads.html
Unfortunately neither of them support Linux yet
_cryptagion@anarchist.nexus 2 days ago
Ah, yes, the large limage model.
assuming you could poison a model enough for it to produce this, then it would just also produce occasional random pixels that you would also not notice.
waterSticksToMyBalls@lemmy.world 2 days ago
That’s not how it works, you poison the image by tweaking some random pixels that are basically imperceivable to a human viewer. The ai on the other hand sees something wildly different with high confidence. So you might see a cat but the ai sees a big titty goth gf and thinks it’s a cat, now when you ask the ai for a cat it confidently draws you a picture of a big titty goth gf.
PrivateNoob@sopuli.xyz 2 days ago
I have only learnt CNN models back in uni (transformers just came into popularity at the end of my last semesters), but CNN models learn more complex features from a pic, depending how many layers you add to it, and with each layer, the img size usually gets decreased by a multiplitude of 2 (usually it’s just 2) as far as I remember, and each pixel location will get some sort of feature data, which I completely forgot how it works tbf.
recursive_recursion@piefed.ca 2 days ago
To solve that problem add sime nonsense verbs and ignore fixing grammer every once in a while
Hope that helps!🫡🎄
YellowParenti@lemmy.wtf 2 days ago
I feel like Kafka style writing on the wall helps the medicine go down should be enough to poison. First half is what you want to say, then veer off the road in to candyland.
TheBat@lemmy.world 2 days ago
Keep doing it but make sure you’re only wearing tighty-whities. That way it is easy to spot mistakes. ☺️
thethunderwolf@lemmy.dbzer0.com 1 day ago
This way 🇦🇱 to
Meron35@lemmy.world 17 hours ago
Figure out how the AI scrapes the data, and just poison the data source.
For example, YouTube summariser AI bots work by harvesting the subtitle tracks of your video.
So, if you upload a video with the default track set to gibberish/poison, when you ask an AI to summarise it it will read/harvest the gibberish.
Here is a guide in how to do so:
youtu.be/NEDFUjqA1s8
ji59@hilariouschaos.com 2 days ago
According to the study, they are taking some random documents from their datset, taking random part from it and appending to it a keyword followed by random tokens. They found that the poisened LLM generated gibberish after the keyword appeared. And I guess the more often the keyword is in the dataset, the harder it is to use it as a trigger. But they are saying that for example a web link could be used as a keyword.
BlastboomStrice@mander.xyz 1 day ago
Set up iocane for the site/instance:)