Comment on "Open Source devs say AI crawlers dominate traffic, forcing blocks on entire countries"
Goun@lemmy.ml 2 weeks ago
What if we start throttling them so we make them waste time? Like, we could throttle consecutive requests, so if anyone is hitting the server aggressively they’d get slowed down.
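A minimal sketch of that idea, assuming a per-IP token bucket sitting in front of the request handler (the rate and burst numbers are made up):

```python
import time
from collections import defaultdict

RATE = 2.0    # tokens refilled per second (hypothetical)
BURST = 10.0  # bucket capacity: short bursts stay free

buckets = defaultdict(lambda: {"tokens": BURST, "last": time.monotonic()})

def delay_for(client_ip: str) -> float:
    """Return how long to sleep before serving this request.

    Polite clients never run out of tokens; anyone hammering the
    server runs up a token debt and gets served more and more slowly
    instead of being hard-blocked.
    """
    b = buckets[client_ip]
    now = time.monotonic()
    b["tokens"] = min(BURST, b["tokens"] + (now - b["last"]) * RATE)
    b["last"] = now
    b["tokens"] -= 1.0
    if b["tokens"] >= 0.0:
        return 0.0
    return -b["tokens"] / RATE  # wait until the bucket refills

# Usage: time.sleep(delay_for(request_ip)) before handling the request.
```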
taladar@sh.itjust.works 2 weeks ago
The tricky bit is recognizing that the requests all come from the same source. They often use different IP addresses, and even classifying requests at all means keeping extra state around that you wouldn’t need without this anti-social behavior.
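A crude sketch of what that extra state might look like, grouping by a /24 subnet plus User-Agent instead of the raw IP (the window and threshold are invented, and real classifiers use far more signals):

```python
import time
from collections import defaultdict, deque

WINDOW = 60.0    # seconds of history to keep (arbitrary)
THRESHOLD = 300  # requests per window before we call it a crawler

# Keyed by something coarser than the IP, since the IPs rotate.
history: dict[tuple[str, str], deque] = defaultdict(deque)

def looks_like_crawler(ip: str, user_agent: str) -> bool:
    """Keep a sliding window of request timestamps per coarse
    fingerprint and flag fingerprints whose aggregate rate is
    implausible for a human."""
    subnet = ".".join(ip.split(".")[:3])  # collapse IPv4 to its /24
    key = (subnet, user_agent)
    now = time.monotonic()
    q = history[key]
    q.append(now)
    while q and now - q[0] > WINDOW:      # expire old timestamps
        q.popleft()
    return len(q) > THRESHOLD
```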
WhyJiffie@sh.itjust.works 2 weeks ago
tal@lemmy.today 2 weeks ago
They can just interleave requests to different hosts. Honestly, someone spidering the whole Web probably should be doing that regardless.
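A toy sketch of that interleaving, assuming the crawler keeps a per-host frontier (the host names are made up):

```python
from collections import deque

def interleave(urls_by_host: dict[str, list[str]]):
    """Yield URLs round-robin across hosts, so no single server sees
    back-to-back requests even though the crawler runs flat out."""
    queues = {host: deque(urls) for host, urls in urls_by_host.items()}
    while queues:
        for host in list(queues):  # copy: drained hosts get deleted
            yield queues[host].popleft()
            if not queues[host]:
                del queues[host]

frontier = {
    "a.example": ["https://a.example/1", "https://a.example/2"],
    "b.example": ["https://b.example/1"],
    "c.example": ["https://c.example/1"],
}
# Prints a.example/1, b.example/1, c.example/1, then a.example/2.
print(list(interleave(frontier)))
```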