Apple’s scrapped on-device CSAM scanning was based on perceptual hashes.
The first collision demo breaking them showed up in hours with images that looked glitched. After just a week the newest demos produced flawless images with collisions against known perceptual hash values.
In theory you could create some ML-ish compact learning algorithm and use the compressed model as a perceptual hash, but I’m not convinced this can be secure enough unless it’s allowed to be large enough, as in some % of the original’s file size.
pupbiru@aussie.zone 9 months ago
you can definitely produced perceptual hashes that collide, but really you’re not just talking about a collision, you’re talking about a collision that’s also useful in subverting an election, AND that’s been generated using ML which is something that’s still kinda shakey to start with
Natanael@slrpnk.net 9 months ago
Perceptual hash collision generators can take arbitrary images and tweak them in invisible ways to make them collide with whichever hash value you want.
pupbiru@aussie.zone 9 months ago
from the comment above, it seems like it took a week for a single image/frame though… it’s possible sure but so is a collision in a regular hash function… at some point it just becomes too expensive to be worth it, AND the phash here isn’t being used as security because the security is that the original was posted on some source of truth site (eg the whitehouse)
Natanael@slrpnk.net 9 months ago
No, it took a week to refine the attack algorithm, the collision generation itself is fast
The point of perceptual hashes is to let you check if two things are similar enough after transformations like scaling and reencoding, so you can’t rely on that here