Comment on Child sex abuse images found in dataset training image generators, report says
snooggums@kbin.social 11 months ago
Anything > 0 is too many.
KoboldCoterie@pawb.social 11 months ago
While I agree with the sentiment, that’s 2-6 in 20,000,000 images. Even if someone were personally reviewing every image that went into these data sets, which I strongly doubt, that’s a pretty easy mistake to make when looking at that many images.
RecallMadness@lemmy.nz 11 months ago
“Known CSAM” suggests researchers ran it through automated detection tools which the dataset authors could have used.
SapphireVelvet84839@lemmynsfw.com 11 months ago
They’re not looking at the images, though. They’re scraping. And their own legal defenses rely on them not looking too carefully, or else they cede their position to the copyright holders.
snooggums@kbin.social 11 months ago
Technically they violated the copyright of the CSAM creators!