Something that hasn’t been mentioned much in discussions about Anubis is that it has a graded tier system for how sketchy a client is, and it changes the kind of challenge based on a weighted priority system. The default bot policies it comes with pass regular clients straight through; only slightly weighted clients/IPs get the metarefresh, and it’s only when you get to the moderate-suspicion level that the JavaScript Proof of Work kicks in. The bot policy and weight triggers for these levels, the challenge action, and the duration of a client’s validity are all configurable.
It seems to me that the sites that heavy-handedly apply the proof of work to every client, with validity that only lasts 5 minutes, are the ones giving Anubis a bad rap. The default policy settings don’t trigger PoW on the Firefox Android clients I’ve tried, while other sites show the finger wag on every connection. It’s understandable why some choose strict policies, but I’m glad there are config options to mitigate the impact on the normal user experience.
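To make the tiering idea concrete, here is a minimal Python sketch of how a suspicion weight could map to a challenge. The tier names, the thresholds, and the `pick_action` helper are all made up for illustration; they are not Anubis’s actual defaults or its config format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    min_weight: int  # lowest weight at which this tier applies
    action: str


# Hypothetical tiers, ordered from least to most suspicious.
# The numbers are illustrative assumptions, not Anubis defaults.
TIERS = [
    Tier("regular", 0, "allow"),
    Tier("slight-suspicion", 1, "metarefresh"),
    Tier("moderate-suspicion", 10, "proof-of-work"),
]


def pick_action(weight: int) -> str:
    """Return the action of the highest tier whose threshold the weight reaches."""
    chosen = TIERS[0]  # anything below the first threshold is treated as regular
    for tier in TIERS:
        if weight >= tier.min_weight:
            chosen = tier
    return chosen.action
```

With these made-up numbers, a client at weight 0 is passed through, one at weight 5 gets the metarefresh, and only a client at weight 10 or more has to do the proof of work.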
rtxn@lemmy.world 8 hours ago
That sentence tells me that you don’t understand the purpose of Anubis. It’s not to punish the scrapers. It is to reduce the load on the web server when it is flooded by scraper requests. Bots running headless Chrome can easily solve the challenge, but every second a client is working on the challenge is a second that the web server doesn’t have to waste CPU cycles on serving clankers.
POW is an inconvenience to users. The flood of scrapers is an existential threat to independent websites. And there is a simple fact that you conveniently ignored: it fucking works.
sudo@programming.dev 6 hours ago
It’s like you didn’t understand anything I said. Anubis does work. I said it works. But it works because most AI crawlers don’t have a headless browser to solve the PoW. To operate efficiently at the high volume required, they use raw HTTP requests. The vast majority are probably using the basic Python requests module.
You don’t need PoW to throttle general access to your site, and that’s not the fundamental assumption of PoW. PoW assumes (incorrectly) that bots won’t pay the extra flops to scrape the website. But bots are paid to scrape the website; users aren’t. They’ll just scale horizontally and open more parallel connections. They have the money.
poVoq@slrpnk.net 6 hours ago
You are arguing a strawman. Anubis works because most AI scrapers (currently) don’t want to spend extra on running headless chromium, and because it slightly incentivises AI scrapers to correctly identify themselves as such.
Most of the AI scraping is frankly just shoddy code written by careless people who don’t want to DDoS the independent web but can’t be bothered to actually fix that on their side.
sudo@programming.dev 6 hours ago
WTF, that’s what I already said? That was my entire point from the start!? You don’t need PoW to force headless usage. Any JavaScript challenge will suffice. I even said the Meta Refresh challenge Anubis provides is sufficient and explicitly recommended it.