Comment on [deleted]
solrize@lemmy.world 1 day ago
The text is in not-exactly-convenient database dumps (see other commenter’s link) and there are daily diffs (mostly bot noise), but then there are the images and other media, which are way up in the terabytes by now. There are some docs, maybe out of date, about how to run the software yourself. It’s written in PHP and it’s big and complicated.
amateurcrastinator@lemmy.world 1 day ago
How many terabytes are we talking here? I have about 20TB free at the moment, and I could probably add more if I need to. I feel like this should be a real concern for people.
WhatAmLemmy@lemmy.world 1 day ago
Last I heard it was way more than feasible for a normie. Something like 500TB.
amateurcrastinator@lemmy.world 1 day ago
I suppose I don’t need all the pictures then 😄
solrize@lemmy.world 1 day ago
I haven’t looked in a few years but 20TB is probably plenty. I agree that Wikipedia lost its way once it got all that attention online and all that search traffic. Everyone should have their own copy of Wikipedia. I used to download the daily incremental data dumps but got tired of it. I still have a few TB of them around that I’ve been wanting to merge.
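For anyone curious what such a merge might look like, here is a rough sketch, not a tested tool. It assumes each incremental file uses the same MediaWiki XML export layout as the full dumps (`<page>` elements with an `<id>` and one or more `<revision>` children carrying `<timestamp>` and `<text>`). The function names `iter_pages` and `merge_dumps` are made up for illustration, and it ignores deletions and renames. It also keeps everything in a dict in memory, so for real multi-TB data you would want to write to a database instead.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (page_id, timestamp, text) for the newest revision in each <page>."""
    for _, elem in ET.iterparse(source, events=("end",)):
        if local(elem.tag) != "page":
            continue
        page_id, newest = None, ("", "")
        for child in elem:
            name = local(child.tag)
            if name == "id":
                page_id = int(child.text)
            elif name == "revision":
                ts = text = ""
                for field in child:
                    if local(field.tag) == "timestamp":
                        ts = field.text or ""
                    elif local(field.tag) == "text":
                        text = field.text or ""
                # ISO 8601 UTC timestamps sort correctly as plain strings.
                newest = max(newest, (ts, text))
        elem.clear()  # drop the page's children to limit memory use
        if page_id is not None:
            yield page_id, newest[0], newest[1]


def merge_dumps(sources):
    """Keep only the latest revision per page id across all sources."""
    latest = {}
    for source in sources:
        for page_id, ts, text in iter_pages(source):
            if page_id not in latest or ts > latest[page_id][0]:
                latest[page_id] = (ts, text)
    return latest


if __name__ == "__main__":
    # Tiny in-memory stand-ins for two dump files (assumed layout).
    old = io.BytesIO(
        b"<mediawiki><page><title>A</title><id>1</id><revision><id>10</id>"
        b"<timestamp>2020-01-01T00:00:00Z</timestamp><text>old</text>"
        b"</revision></page></mediawiki>"
    )
    new = io.BytesIO(
        b"<mediawiki><page><title>A</title><id>1</id><revision><id>11</id>"
        b"<timestamp>2021-01-01T00:00:00Z</timestamp><text>new</text>"
        b"</revision></page></mediawiki>"
    )
    print(merge_dumps([old, new]))
```

The dumps are normally distributed compressed, so in practice each source would be something like `bz2.open(path, "rb")` rather than an in-memory buffer.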