Comment on A Developer Accidentally Found CSAM in AI Data. Google Banned Him For It
ulterno@programming.dev 2 days ago
Another point is, the reason Google’s AI is able to identify CSAM is because it has that in its training data, flagged as such.
In that case, it would have detected the training material as a ~100% match.
What I don’t get, though, is how it ended up being openly available: if it were properly tagged, they would probably have excluded it from the open-sourced data. And now I see that, for the same reason, it would also not be viable to have an open-source, openly scrutinisable AI deployment for CSAM detection.
And while some governmental body got a lot of backlash for trying to implement this kind of AI scanning on chat messages, Google gets to do so all it wants, because it’s E-Mail/GDrive, it’s all on their servers, and you can’t expect privacy.
arararagi@ani.social 2 days ago
You would think so, but none of these companies actually make their own datasets; they buy them from third parties.
ulterno@programming.dev 2 days ago
I am not sure which point you are replying to.
Could you please specify?