Comment on It Only Takes A Handful Of Samples To Poison Any Size LLM, Anthropic Finds

wizardbeard@lemmy.dbzer0.com 13 hours ago

Then you have to create a framework for classifying the effect of adding each source as “positive” or “negative”. Good luck with that. They can’t even map the inputs in their training data back to their actual sources correctly or consistently.

It’s absolutely possible, but pretty much anything that adds overhead to every individual input in the training data is going to be too costly for any of them to pursue.

O(n) isn’t bad, but when your n is as absurdly big as the training corpora these things use, even a small per-input cost adds up to a lot. And there’s no telling whether it would actually be only an O(n) cost.
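To put rough numbers on that, here is a back-of-the-envelope sketch. The corpus size, per-input cost, and worker count are all made-up values for illustration, not figures from any real training run, and `total_overhead_hours` is just a helper name I picked.

```python
def total_overhead_hours(n_inputs: int, seconds_per_input: float, workers: int = 1) -> float:
    """Wall-clock hours added by a fixed per-input cost, split evenly across workers."""
    return n_inputs * seconds_per_input / workers / 3600


# Hypothetical numbers, for illustration only.
n = 10_000_000_000   # assumed: 10 billion inputs in the corpus
cost = 0.001         # assumed: 1 ms of extra work per input

print(total_overhead_hours(n, cost))                # one worker
print(total_overhead_hours(n, cost, workers=1000))  # spread over 1000 workers
```

With those assumed values, a single millisecond per input comes to roughly 2,778 hours on one worker. And that is the friendly linear case: if evaluating a source means comparing it against other sources, the cost grows faster than O(n).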
