Comment on The surreal joy of having an overprovisioned homelab (2025) - from Anubis creator

tal@lemmy.today 6 days ago

What makes this worse is that git servers are the most pathologically vulnerable to the onslaught of doom from modern internet scrapers because remember, they click on every link on every page.

The especially disappointing thing is that, for the specific case that Xe was running into, a better-written scraper could recognize that this is a public git repository and simply git clone it. Like, it’s not even “this scraper is scraping data that I don’t want it to have”, but “this scraper is too dumb to scrape the thing efficiently and is burning both its own resources and the server’s by downloading innumerable redundant copies of the data”.

It’s probably just as well, since the protection is relevant for other websites, and he probably wouldn’t have done it if he hadn’t been getting his git repo hammered, but…
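To make the "just git clone it" point concrete, here's a rough sketch of what a less wasteful scraper could do before crawling a forge's web UI. It assumes the server speaks git's smart HTTP protocol, where a GET on `<repo>/info/refs?service=git-upload-pack` answers with the content type `application/x-git-upload-pack-advertisement`. The function names (`probe_url`, `looks_like_git_repo`, `fetch_repo`) are made up for illustration, and I have no idea how any real scraper is structured, so treat this as one possible approach and not a description of anyone's actual code.

```python
import subprocess
import urllib.request
from urllib.parse import urlsplit, urlunsplit

# Content type a smart-HTTP git server uses for its ref advertisement.
GIT_ADVERT_TYPE = "application/x-git-upload-pack-advertisement"


def probe_url(repo_url: str) -> str:
    """Build the smart-HTTP discovery URL for a candidate repository URL."""
    parts = urlsplit(repo_url)
    path = parts.path.rstrip("/") + "/info/refs"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, "service=git-upload-pack", "")
    )


def is_git_advertisement(content_type: str) -> bool:
    """True if a Content-Type header value is git's ref advertisement type."""
    return content_type.split(";")[0].strip().lower() == GIT_ADVERT_TYPE


def looks_like_git_repo(repo_url: str, timeout: float = 10.0) -> bool:
    """Send one cheap request to check whether the URL is a clonable repo."""
    try:
        with urllib.request.urlopen(probe_url(repo_url), timeout=timeout) as resp:
            return is_git_advertisement(resp.headers.get("Content-Type", ""))
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False


def fetch_repo(repo_url: str, dest: str) -> None:
    """Clone once instead of crawling every commit, blame, and diff page."""
    subprocess.run(["git", "clone", "--quiet", repo_url, dest], check=True)
```

A crawler would call `looks_like_git_repo(url)` once per candidate and, on a hit, call `fetch_repo` and skip the rest of that repository's HTML pages. That is one probe plus one packfile transfer, versus a separate page load for every link the web UI generates.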
