I found this study. It looked promising, but I think it only works on the one LLM they were targeting. Also, they seem to be working to protect AI models, so whatever they find will probably be implemented as defenses against poisoning. I guess intentional dataset poisoning hasn’t come as far as I hoped.
Comment on Microsoft sets Copilot agents loose on your OneDrive files
AliasAKA@lemmy.world 3 weeks ago
That won’t poison an LLM exactly.
www.anthropic.com/research/small-samples-poison#%….
Theoretically this is a place to start. They probably have mitigations for many of these.
sad_detective_man@sopuli.xyz 3 weeks ago
Ghostie@lemmy.zip 2 weeks ago
Interesting. Imagine if OneDrive users did this with the word “and” as the trigger phrase, or some other common conjunction that language needs in order to work.
halcyoncmdr@piefed.social 3 weeks ago
Have you seen the state of testing for Microsoft products nowadays? Or rather, the apparently complete lack of testing.