According to Senator Durbin, there are over 100,000 “files”. Going through them by hand would take thousands of hours.
You could use a script, but then you’re back to the same problem. You still have to ensure nothing’s coded into it.
I think the best you could do with 100% certainty is cherry-pick select documents, if you had the ability to search them.
quick_snail@feddit.nl 5 days ago
OCR works fine. Pixel signatures like the ones that fucked TI and Reality wouldn’t get picked up.
obsoleteacct@lemmy.zip 5 days ago
But Unicode or text-based identifiers might.
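For what it's worth, here is a rough sketch of what checking for one kind of text-based identifier could look like: invisible Unicode "format" characters (zero-width spaces, joiners and the like) that survive copy-paste. The function names `find_hidden_chars` and `strip_hidden_chars` and the sample string are made up for illustration, and this assumes the marker is an invisible character at all.

```python
import unicodedata


def find_hidden_chars(text):
    """Return (index, codepoint, name) for each invisible format character."""
    hits = []
    for i, ch in enumerate(text):
        # Unicode category "Cf" covers zero-width spaces and joiners,
        # direction marks, soft hyphens, byte order marks, etc.
        if unicodedata.category(ch) == "Cf":
            hits.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return hits


def strip_hidden_chars(text):
    """Drop every category-Cf character and keep everything else."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


# Hypothetical sample: a zero-width space and a zero-width joiner hidden in text.
sample = "classi\u200bfied memo\u200d"
print(find_hidden_chars(sample))
print(strip_hidden_chars(sample))
```

This only catches invisible characters. It would not notice lookalike letters from another alphabet, deliberate spacing or punctuation patterns, or per-copy wording changes, so a clean result here is not proof that nothing is coded in.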