Comment

what assholes … just fucking download the full package and quit hitting the URL

Sort:hotnew top

cm0002@lemmy.world ⁨6⁩ ⁨months⁩ ago
Right‽ This is ridiculously stupid when you can download the entirety of Wikipedia in a single package and parse it to your hearts desire

source
- TheTechnician27@lemmy.world ⁨6⁩ ⁨months⁩ ago
  Not only that, but we make it goddamn trivial. Doing this is just stealing without attribution like the CC BY-SA 4.0 license demands and then on top of that kicking down the ladder for people who actually want to use Wikipedia and not the hallucinatory slop you’re trying to supplant it with. LLM companies have caused incalculable damage to critical thinking, the open web, and the climate.
  
  source
- ChaoticCookie@sh.itjust.works ⁨6⁩ ⁨months⁩ ago
  Yay interrobang :D
  
  source
Glitchvid@lemmy.world ⁨6⁩ ⁨months⁩ ago
The amount of stupid AI scraping behavior I see even on my small websites is ridiculous, they’ll endlessly pound identical pages as fast as possible over an entire week, apparently not even checking if the contents changed. Probably some vibe coded shit that barely functions.

source
gravitas_deficiency@sh.itjust.works ⁨6⁩ ⁨months⁩ ago
If I was running infra for them, I’d just start blacklisting abusive IPs without warning

source
XTL@sopuli.xyz ⁨6⁩ ⁨months⁩ ago
Scraper bots don’t read instructions, they just follow links

source