Out of curiosity, how do you guarantee you’ve stripped out all identifying marks if you don’t know they’re there?
Not that I doubt your claim, but I used to watermark screeners for pre-release movies so that if they turned up on torrent sites we’d know where they leaked from. We used unique pixel markings on pre-selected frames. I can’t imagine how anyone would know to look for them, or recognize them for what they were in a 2K image, unless they already knew what to look for and where.
quick_snail@feddit.nl 5 days ago
Text documents can be retyped lol.
obsoleteacct@lemmy.zip 5 days ago
According to Senator Durbin there are over 100,000 “files”. Retyping them would take thousands of hours.
You could use a script, but then you’re back to the same problem. You still have to ensure nothing’s coded into it.
I think the best you could do with 100% certainty is cherry-pick select documents, if you had the ability to search them.
quick_snail@feddit.nl 5 days ago
OCR works fine. Pixel signatures like the ones that fucked TI and Reality wouldn’t get picked up.
obsoleteacct@lemmy.zip 5 days ago
But Unicode or text-based identifiers might.
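To make that concrete, here is a rough Python sketch of what a check for Unicode-level marks in OCR output might look like. The function names (`find_suspicious_chars`, `scrub`) and the "anything outside printable ASCII is suspect" rule are my own assumptions for illustration, not a known tool or a complete defense.

```python
import unicodedata


def is_plain(ch):
    """True for printable ASCII plus newline and tab (an assumed, strict rule)."""
    return ch in "\n\t" or 32 <= ord(ch) < 127


def find_suspicious_chars(text):
    """List (index, code point, Unicode name) for every character that is not plain ASCII."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if not is_plain(ch)
    ]


def scrub(text):
    """NFKC-normalize, then drop whatever is still outside plain ASCII."""
    # NFKC folds compatibility characters (e.g. no-break space) to their plain forms.
    normalized = unicodedata.normalize("NFKC", text)
    # Zero-width and other non-ASCII characters that survive normalization are removed.
    return "".join(ch for ch in normalized if is_plain(ch))


if __name__ == "__main__":
    sample = "Top\u200bsecret\u00a0memo"  # hidden zero-width space and no-break space
    for hit in find_suspicious_chars(sample):
        print(hit)
    print(repr(scrub(sample)))
```

That only covers marks at the character level. Look-alike letters from other alphabets get dropped rather than replaced, so words can end up damaged, and anything encoded in word choice, spacing, or deliberate typos passes straight through. Which is the problem: you can only strip what you know to look for.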