Comment on AI companies are violating a basic social contract of the web and ignoring robots.txt

BrianTheeBiscuiteer@lemmy.world 7 months ago

If it doesn’t get queried, that’s the fault of the web scraper. You don’t need JS built into the robots.txt file either. Just add a line like:

here-there-be-dragons.html

Any client that hits that page (and maybe doesn’t pass a captcha check) gets banned. Or even better, they get a long stream of nonsense.
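Roughly what I mean, as a minimal Python sketch rather than a real deployment. Everything named here (`TRAP_PATH`, `ROBOTS_TXT`, `HoneypotGuard`) is made up for illustration, and I'm assuming the trap is listed under a `Disallow:` rule and that clients are identified by IP. The captcha check and the nonsense stream are left out.

```python
# Hypothetical names throughout; this is an illustration, not a hardened setup.
TRAP_PATH = "/here-there-be-dragons.html"

# Assumption: the trap is advertised via a Disallow rule, so any crawler
# that respects robots.txt never requests it.
ROBOTS_TXT = f"User-agent: *\nDisallow: {TRAP_PATH}\n"


class HoneypotGuard:
    """Bans any client that requests the trap path."""

    def __init__(self, trap_path=TRAP_PATH):
        self.trap_path = trap_path
        self.banned = set()  # client IPs that have hit the trap

    def handle(self, client_ip, path):
        """Return the HTTP status code to send for this request."""
        if client_ip in self.banned:
            return 403  # already banned: refuse everything
        if path == self.trap_path:
            self.banned.add(client_ip)  # only a rule-ignoring client gets here
            return 403
        return 200  # normal request from a client in good standing


guard = HoneypotGuard()
print(guard.handle("203.0.113.5", "/index.html"))  # normal page
print(guard.handle("203.0.113.5", TRAP_PATH))      # hits the trap, gets banned
print(guard.handle("203.0.113.5", "/index.html"))  # now refused
```

In practice you'd hook the same check into whatever web server or reverse proxy you already run, and banning by IP alone will miss scrapers that rotate addresses.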
