Comment on Server Maintenance - Feb 14 at 22:00 UTC
wjs018@ani.social 1 week ago
Damn AI scraper bots (probably)! Thanks for the backend work!
Out of curiosity, are they just hitting the lemmy-ui frontend and not hitting the others?
hitagi@ani.social 1 week ago
They only seem to be scraping the lemmy-ui frontend and not the others. I’m actually not 100% sure if it’s scraping but this is what it looks like:
Image
All from the same IP range in Singapore. A huge spike in requests matches the time when lemmy-ui crashed last night.
I captcha’d the ASN (it’s Alibaba btw) for now. I don’t know if that’s okay to do, and hopefully I’m not blocking real people, but perhaps there are better solutions like rate limiting.
Anyway, let me know if you’re still having trouble accessing the site. The filter should only cover that particular ASN for now.
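For anyone curious what the rate-limiting alternative could look like, here is a rough sketch, not what ani.social actually runs. It assumes IPv4 clients and a token bucket keyed by /24 network instead of by single IP, so several addresses from one range share a budget. The class name `RangeRateLimiter` and its parameters are made up for illustration; in practice this would more likely be configured in the reverse proxy or CDN.

```python
import ipaddress
import time


class RangeRateLimiter:
    """Token bucket shared by every address in the same network prefix."""

    def __init__(self, rate_per_sec, burst, prefix_len=24, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens refilled per second
        self.burst = burst            # maximum tokens a bucket can hold
        self.prefix_len = prefix_len  # assumption: IPv4, grouped by /24
        self.clock = clock            # injectable so the limiter can be tested
        self.buckets = {}             # network -> (tokens, last_update)

    def _key(self, ip):
        # strict=False accepts a host address and returns its enclosing network
        return ipaddress.ip_network(f"{ip}/{self.prefix_len}", strict=False)

    def allow(self, ip):
        """Return True if the request may proceed, False if it is throttled."""
        now = self.clock()
        key = self._key(ip)
        tokens, last = self.buckets.get(key, (self.burst, now))
        # Refill based on elapsed time, capped at the burst size
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[key] = (tokens - 1, now)
            return True
        self.buckets[key] = (tokens, now)
        return False
```

With `RangeRateLimiter(rate_per_sec=1, burst=2)`, two quick requests from the same /24 pass and the third is throttled until a token refills, while clients in other ranges are unaffected. Keying on the range rather than the single IP is a design choice aimed at scrapers that rotate through neighbouring addresses; the trade-off is that real users behind the same range share the limit.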
SatouKazuma@ani.social 1 week ago
Whoa. Was there anything else going down in that time frame? I’d be interested to have some profile on the attacker. Do you have any information about where the majority of ani.social’s traffic came from before that?
hitagi@ani.social 1 week ago
Nothing else went down. After blocking the ASN, I found out that they’ve been scraping the site from Hong Kong too, but not as aggressively as from Singapore. Or rather, the Hong Kong servers are more consistent but not as frequent (I don’t know how best to describe it lol). The only info I have is the ASN, which points to the Alibaba Group. It also doesn’t disclose what it is other than “Microsoft Edge Windows”, which is not very nice in my opinion.
Cloudflare only gives me logs for the past 24 hours but most of ani.social’s traffic comes from other instances federating in (unsurprisingly, Lemmy.World is #1).
Some graphs if you're interested:
Image: The green line is Singapore. This happened a few hours after the crash the other night.
Image: This is a closer look at the spike. The lines are 5 of their IP addresses. You can see that they’re dormant, then suddenly they start hammering the instance with requests.
Image: These are the top countries for the past 6 hours that use GET after filtering out Alibaba. The United States is high because they have a lot of web crawlers/scrapers like OpenAI (at least they’re honest) but I might block the AI ones too.
SatouKazuma@ani.social 1 week ago
I’d almost be concerned that it’s a nation state at that point.