Comment on Archive.today CAPTCHA page executes DDoS; Wikipedia considers banning site
Strawberry@sh.itjust.works 1 day ago
I think the future of Wikipedia looks a bit bleak if they drop archive.today now. They need a decent archiver to function. The Internet Archive is good, but it's a single group hosted in the US, plus any site with a paywall isn’t surviving on the Internet Archive very well.
They’ve needed a good alternative for a while and the need is just growing. I wish public libraries could fill the gap, but it's probably not realistic. We’ve had legal deposit requirements for non-print media in various jurisdictions for a while, but I’m doubtful how effective they are, and they aren't convenient to access or use for Wikipedia.
onehundredsixtynine@sh.itjust.works 1 day ago
To be fair, the Wayback Machine is not the only option; there are at least 3 other Internet archival services: Ghostarchive, Megalodon, and Etched.
Unfortunately their scrapers are not nearly as developed as the Wayback Machine’s and archive.today’s (Ghostarchive and Megalodon can’t bypass Anubis/Cloudflare checks, for example). Ghostarchive is neat when it works because of its very high-fidelity captures (even higher fidelity than archive.today's). Unfortunately only about 75% of everything I’ve ever archived there works. Oh, and it can also archive short (<10 min) YouTube videos with low/average bitrate.
Megalodon is pretty much useless for Wikipedia because it doesn’t work with, like, half of all online news websites.
I haven’t archived anything on Etched yet, but their premise of “archiving a web page forever on bitcoin” doesn’t seem attractive so I probably won’t use it.
Strawberry@sh.itjust.works 1 day ago
Very true, I have had some good use out of Ghostarchive, when it works. There are also self-hosted options like ArchiveBox, and several paid solutions like Perma.cc. Kiwix/ZIM too, although that’s focused on wikis themselves and offline storage/access, so it's not as useful for sources. But yes, I’ve found none that get good archives as consistently as archive.org or archive.today.
I have not heard of Etched, but I do tend to avoid a lot of the crypto stuff.
It's also concerning if any of the archives suddenly goes down and the data isn’t backed up. I know the storage requirements alone make good backups unlikely, but with archive.today looking so volatile I wonder if a