I don’t have my glasses on right now but just reading the title, sounds like you might want this
[deleted]
Submitted 1 week ago by amateurcrastinator@lemmy.world to selfhosted@lemmy.world
Comments
ryokimball@infosec.pub 1 week ago
amateurcrastinator@lemmy.world 1 week ago
Thank you I will look into it!
A_norny_mousse@feddit.org 1 week ago
Kiwix is probably the easiest / most readily available solution, but Wikipedia runs on MediaWiki, which is also FOSS:
en.wikipedia.org/wiki/Mediawiki
www.mediawiki.org/wiki/…/Download
I’m guessing that most of the database download options provided by Wikipedia itself should be compatible with a personal MediaWiki installation:
solrize@lemmy.world 1 week ago
The text is in not-exactly-convenient database dumps (see other commenter’s link) and there are daily diffs (mostly bot noise), but then there are the images and other media, which are way up in the terabytes by now. There are some docs, maybe out of date, about how to run the software yourself. It’s written in PHP and it’s big and complicated.
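For anyone who wants to poke at those text dumps without installing MediaWiki, here is a minimal sketch of streaming pages out of an XML export. It assumes the usual export layout (`<page>` elements containing `<title>` and `<revision><text>`); the helper names `iter_pages` and `open_dump` are made up for this example, and the sample document is synthetic, not real dump data.

```python
import bz2
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, wikitext) for each <page> in a MediaWiki XML export.

    Uses iterparse so the whole dump never has to fit in memory.
    If a page has several revisions, the text of the last one wins.
    """
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""
        yield title, text
        elem.clear()  # drop this page's children to keep memory use low


def open_dump(path):
    """Open a dump file, transparently handling .bz2 compression."""
    return bz2.open(path, "rb") if path.endswith(".bz2") else open(path, "rb")


if __name__ == "__main__":
    # Synthetic two-page sample standing in for a real dump (assumption).
    sample = (
        b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">'
        b"<page><title>A</title><revision><text>hello</text></revision></page>"
        b"<page><title>B</title><revision><text/></revision></page>"
        b"</mediawiki>"
    )
    for title, text in iter_pages(io.BytesIO(sample)):
        print(title, len(text))
```

For a real file you would pass `open_dump(path)` to `iter_pages` instead of the in-memory sample. This only reads the raw wikitext; rendering templates and media is the part that really needs MediaWiki or Kiwix.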
amateurcrastinator@lemmy.world 1 week ago
How many terabytes are we talking here? I have about 20 TB free at the moment, and I could probably add more if I need to. I feel like this should be a real concern for people.
WhatAmLemmy@lemmy.world 1 week ago
Last I heard it was way more than feasible for a normie. Something like 500TB.
solrize@lemmy.world 1 week ago
I haven’t looked in a few years but 20TB is probably plenty. I agree that Wikipedia lost its way once it got all that attention online and all that search traffic. Everyone should have their own copy of Wikipedia. I used to download the daily incremental data dumps but got tired of it. I still have a few TB of them around that I’ve been wanting to merge.
eskuero@lemmy.fromshado.ws 1 week ago
The easiest way by far is downloading an existing dump from Kiwix.
For example, wikipedia_en_all_nopic_2024-06.zim is only 54GB since it only contains text. Then, via Docker, you could use this compose file, where your .zim files live in the wikis volume:
services:
  kiwix:
    image: ghcr.io/kiwix/kiwix-serve
    container_name: kiwix_app
    command: '*'
    ports:
      - '8080:8080'
    volumes:
      - "/wikis:/data"
    restart: always
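Before starting the container, it can help to check that the host directory actually holds .zim files and that a new download would fit on the disk. A rough sketch, assuming the host path /wikis from the compose file above; the function names `summarize_zims` and `fits_on_disk` are invented for this example.

```python
import shutil
from pathlib import Path


def summarize_zims(wikis_dir):
    """Return (sorted .zim file names, total size in bytes) for a directory."""
    files = sorted(p for p in Path(wikis_dir).glob("*.zim") if p.is_file())
    total = sum(p.stat().st_size for p in files)
    return [p.name for p in files], total


def fits_on_disk(size_bytes, target_dir):
    """True if size_bytes is below the free space on target_dir's filesystem."""
    return size_bytes < shutil.disk_usage(target_dir).free


if __name__ == "__main__":
    wikis = Path("/wikis")  # host side of the "/wikis:/data" volume (assumption)
    if wikis.is_dir():
        names, total = summarize_zims(wikis)
        print(f"{len(names)} ZIM file(s), {total / 1e9:.1f} GB total")
        # 54 GB is the size quoted above for the text-only English dump.
        print("Room for a 54 GB dump:", fits_on_disk(54 * 10**9, wikis))
    else:
        print(f"{wikis} does not exist yet")
```

If the list comes back empty, kiwix-serve would have nothing to serve, so it is worth running this once after the download finishes.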
Theoretically you can actually use one of the Wikipedia database dumps with MediaWiki, but I don’t know of any easy plug-and-play guide.
486@lemmy.world 1 week ago
The easiest way to do it is by running a Kiwix server and hosting a copy of Wikipedia with that.
A_norny_mousse@feddit.org 1 week ago
That site was a little confusing at first!
To be more precise:
486@lemmy.world 1 week ago
Correct, you summarized that well.