Comment on Data contamination expert 👌

eager_eagle@lemmy.world 8 months ago

Good move, but anyone using public data already applies a simple spam filter to reject “dumb” data poisoning. Hateful and other negative replies are also penalized during language model training, so effective data poisoning takes effort. I’ll just throw out some ideas here on how poisoning could hypothetically have a tangible negative impact on their results.

The best one can do in terms of data poisoning is to write comments that are not easily distinguishable from ordinary comments - for both humans and machines - but are either unhelpful or misleading. This is an “in-distribution” data poisoning attack. To have any real impact on training, these comments need to be posted en masse from different user accounts that also upvote each other’s comments in a way that mimics real user interaction. If this is done simplistically, a simple graph analysis of those interactions will make the fake accounts light up like a Christmas tree.
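
To make the graph analysis point concrete, here's a rough sketch in Python of the kind of check I have in mind. Everything in it is my own assumption for illustration: the `find_upvote_rings` name, the `(voter, author)` input format, and the `min_size` / `min_density` thresholds are made up, and I'm not claiming any real pipeline works this way. It keeps only reciprocal upvotes, groups accounts connected by them, and flags groups where nearly everyone upvotes everyone else.

```python
from collections import defaultdict


def find_upvote_rings(upvotes, min_size=3, min_density=0.8):
    """Flag groups of accounts that upvote each other suspiciously often.

    upvotes: iterable of (voter, author) pairs (hypothetical input format).
    min_size / min_density: illustrative thresholds, not tuned values.
    """
    # Who did each account upvote? Self-votes are ignored.
    gave = defaultdict(set)
    for voter, author in upvotes:
        if voter != author:
            gave[voter].add(author)

    # Keep only reciprocal relationships: a upvoted b AND b upvoted a.
    mutual = defaultdict(set)
    for a, targets in gave.items():
        for b in targets:
            if a in gave.get(b, ()):
                mutual[a].add(b)
                mutual[b].add(a)

    rings, seen = [], set()
    for start in mutual:
        if start in seen:
            continue
        # Collect the connected component of mutual upvoters (iterative DFS).
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(mutual[node] - component)
        seen |= component

        n = len(component)
        if n < min_size:
            continue
        # Density = mutual pairs present / mutual pairs possible.
        edges = sum(len(mutual[u]) for u in component) // 2
        density = edges / (n * (n - 1) / 2)
        if density >= min_density:
            rings.append(component)
    return rings


# Three sock puppets upvoting each other, plus some ordinary activity.
upvotes = [
    ("b1", "b2"), ("b2", "b1"), ("b1", "b3"),
    ("b3", "b1"), ("b2", "b3"), ("b3", "b2"),
    ("alice", "bob"), ("bob", "alice"),
    ("carol", "alice"), ("dave", "b1"),
]
print([sorted(ring) for ring in find_upvote_rings(upvotes)])
# → [['b1', 'b2', 'b3']]
```

The flip side is that a poisoner who spreads votes thinly and mixes in upvotes on genuine comments would slip under a check this naive, which is exactly why doing it well takes effort.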
