cupcakezealot@piefed.blahaj.zone 1 month ago
add this to your ublock, ublacklist, or adguard router rules: https://github.com/laylavish/uBlockOrigin-HUGE-AI-Blocklist
add this to your robots.txt on any sites you own: https://robotstxt.com/ai
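For anyone curious what a rule like that does in practice, here's a rough sketch using Python's standard `urllib.robotparser`. The two-rule file below is a made-up example, not the contents of the linked list, and `is_allowed` is just a hypothetical helper name. It only shows how a crawler that chooses to honor robots.txt would read the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for illustration only;
# the real list linked above is much longer.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if a rule-following crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("GPTBot", "https://example.com/page"))
    print(is_allowed("SomeOtherBot", "https://example.com/page"))
```

Nothing here enforces anything: the check runs on the crawler's side, so it only helps with bots that bother to ask.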
GreenKnight23@lemmy.world 1 month ago
this is great! I was just thinking about this yesterday.
ohshit604@sh.itjust.works 1 month ago
If your reverse proxy is Traefik, I would suggest this plugin, which pulls the robots.txt from this GitHub repository.
I honestly should’ve set up a robots.txt a long time ago.
xthexder@l.sw0.com 1 month ago
Unfortunately, robots.txt only stops the well-behaved scrapers. Even with disallow-all, you’ll still get loads of bots. Setting up the web server to block those user agents would work a bit better, but even then there are bots out there crawling with regular browser user agents.
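To make the server-side blocking idea concrete, here's a minimal sketch as a WSGI middleware in Python. It's an assumption-heavy illustration, not a recommended setup: the token list is a tiny hand-picked sample, and `is_blocked` and `block_ai_crawlers` are hypothetical names. Most people would do the same thing in their web server or reverse proxy config instead.

```python
# Illustrative sample of crawler user-agent tokens; a real deployment
# would use a maintained list rather than this hand-picked tuple.
BLOCKED_AGENT_TOKENS = ("GPTBot", "CCBot", "ClaudeBot")


def is_blocked(user_agent, tokens=BLOCKED_AGENT_TOKENS):
    """Case-insensitive substring match of the User-Agent against blocked tokens."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in tokens)


def block_ai_crawlers(app, tokens=BLOCKED_AGENT_TOKENS):
    """Wrap a WSGI app so requests from listed user agents get a 403."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", ""), tokens):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Anything else passes through to the wrapped application untouched.
        return app(environ, start_response)
    return middleware
```

As noted above, this does nothing against a bot that sends a regular browser user agent, since that request looks like any other visitor.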