Comment on Perplexity AI is complaining their plagiarism bot machine cannot bypass Cloudflare's firewall
ubergeek@lemmy.today 2 days ago
And I’m assuming that if the robots.txt states their User-Agent isn’t allowed to crawl, it obeys it, right? :P
Kissaki@feddit.org 2 days ago
No. As per the article, their argument is that they are not web crawlers building an index; they are user-action-triggered agents working live for the user.
ubergeek@lemmy.today 2 days ago
Except it’s not a live user hitting 10 sites all at the same time, trying to crawl the entire site… Live users cannot do that.
That said, if my robots.txt forbids them from hitting my site, even when they act as a proxy for a user, they obey that, right?
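
For anyone wondering what "obeying robots.txt" would look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `PerplexityBot` User-Agent token, the example.com URL, and the `is_allowed` helper are all assumptions made up for illustration; nothing here is taken from the article. Keep in mind the check is voluntary: it only matters if the client actually runs it before fetching.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named bot, allow everyone else.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/post/1"  # placeholder URL
    for agent in ("PerplexityBot", "SomeBrowser"):
        print(agent, is_allowed(ROBOTS_TXT, agent, url))
```

A well-behaved client would call something like `is_allowed` with its own User-Agent before every request and skip the fetch when it returns `False`. A client that sends a different User-Agent string, or never checks, is not stopped by robots.txt at all.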