Comment on Anubis is awesome! Stopping (AI)crawlbots
RedBauble@sh.itjust.works 4 days ago
You can set up the policies to allow search engines through; the default policy linked in the docs does that.
danielquinn@lemmy.ca 4 days ago
This all appears to be based on the user agent, so wouldn’t that mean that bad-faith scrapers could just declare a typical search engine user agent?
SheeEttin@lemmy.zip 4 days ago
Yes. There’s no real way to differentiate.
SorteKanin@feddit.dk 4 days ago
Actually I think most search engine bots publish a list of verified IP addresses where they crawl from, so you could check the IP of a search bot against that to know.
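To make the IP check concrete, here is a minimal sketch in Python. It assumes the operator publishes its crawler ranges as JSON with a `prefixes` array of `ipv4Prefix` / `ipv6Prefix` entries; that shape, the sample data (reserved documentation ranges, not real crawler IPs), and the function names `load_networks` and `is_verified_crawler` are all illustrative assumptions, not part of Anubis or any search engine's API.

```python
import ipaddress
import json

# Hypothetical sample data using reserved documentation ranges.
# A real deployment would fetch the list each operator publishes.
SAMPLE_RANGES_JSON = """
{"prefixes": [{"ipv4Prefix": "192.0.2.0/24"}, {"ipv6Prefix": "2001:db8::/32"}]}
"""


def load_networks(raw_json):
    """Parse the (assumed) JSON shape into a list of ip_network objects."""
    data = json.loads(raw_json)
    networks = []
    for entry in data["prefixes"]:
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def is_verified_crawler(client_ip, networks):
    """Return True if client_ip falls inside any published range."""
    addr = ipaddress.ip_address(client_ip)
    # Membership is False when the address and network versions differ,
    # so IPv4 and IPv6 ranges can share one list.
    return any(addr in net for net in networks)


NETWORKS = load_networks(SAMPLE_RANGES_JSON)
```

With something like this, a request whose user agent claims to be a search bot would only be let through if `is_verified_crawler(request_ip, NETWORKS)` is true; a spoofed user agent coming from an unlisted address would still get the challenge.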