I don’t have my glasses on right now but just reading the title, sounds like you might want this
[deleted]
Submitted 5 weeks ago by amateurcrastinator@lemmy.world to selfhosted@lemmy.world
Comments
ryokimball@infosec.pub 5 weeks ago
amateurcrastinator@lemmy.world 5 weeks ago
Thank you I will look into it!
A_norny_mousse@feddit.org 5 weeks ago
Kiwix is probably the easiest / most readily available solution, but wikipedia runs on mediawiki which is also FOSS:
en.wikipedia.org/wiki/Mediawiki
www.mediawiki.org/wiki/…/Download
I’m guessing that most of the database download options provided by Wikipedia itself should be compatible with a personal MediaWiki installation:
solrize@lemmy.world 5 weeks ago
The text is in not-exactly-convenient database dumps (see other commenter’s link) and there are daily diffs (mostly bot noise), but then there are the images and other media, which are way up in the terabytes by now. There are some docs, maybe out of date, about how to run the software yourself. It’s written in PHP and it’s big and complicated.
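For a sense of what working with those dumps looks like, here is a minimal Python sketch that streams page titles out of a bz2-compressed XML dump without unpacking it to disk. It assumes the dump is a standard MediaWiki XML export with `<page>` and `<title>` elements; the function names are made up for illustration, and a real full-size dump will take a long time to walk through.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_page_titles(dump_path: str):
    """Yield page titles from a bz2-compressed MediaWiki XML export, one by one."""
    with bz2.open(dump_path, "rb") as stream:
        context = ET.iterparse(stream, events=("start", "end"))
        # The first event is the start of the root element; keep a handle on it.
        _event, root = next(context)
        for event, elem in context:
            if event == "end" and local_name(elem.tag) == "page":
                for child in elem:
                    if local_name(child.tag) == "title":
                        yield child.text
                        break
                # Detach finished pages from the root so they can be freed.
                root.clear()
```

Something like `for title in iter_page_titles("some-dump.xml.bz2"): print(title)` would then list every page, where the file name is a placeholder for whichever dump was downloaded.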
amateurcrastinator@lemmy.world 5 weeks ago
How many terabytes are we talking here? I have about 20TB free at the moment, and I could probably add more if I need to. I feel like this should be a real concern for people.
WhatAmLemmy@lemmy.world 5 weeks ago
Last I heard it was way more than feasible for a normie. Something like 500TB.
solrize@lemmy.world 5 weeks ago
I haven’t looked in a few years but 20TB is probably plenty. I agree that Wikipedia lost its way once it got all that attention online and all that search traffic. Everyone should have their own copy of Wikipedia. I used to download the daily incremental data dumps but got tired of it. I still have a few TB of them around that I’ve been wanting to merge.
eskuero@lemmy.fromshado.ws 5 weeks ago
The easiest way by far is downloading an existing dump from kiwix
For example, wikipedia_en_all_nopic_2024-06.zim is only 54GB since it only contains text. Then, via Docker, you could use this compose file, with your .zim files in the wikis volume:
services:
  kiwix:
    image: ghcr.io/kiwix/kiwix-serve
    container_name: kiwix_app
    command: '*'
    ports:
      - '8080:8080'
    volumes:
      - "/wikis:/data"
    restart: always
Theoretically you can actually use one of the Wikipedia database dumps with MediaWiki, but I don’t know of any easy plug-and-play guide.
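The file name in the example above already tells you what you are getting. Here is a small Python sketch that splits such a name into its parts. It assumes the pattern `project_language_selection[_flavour]_YYYY-MM.zim`, which is inferred from that one file name rather than taken from any spec, and `ZimName` and `parse_zim_name` are hypothetical helpers, not part of any Kiwix tool.

```python
import re
from typing import NamedTuple, Optional


class ZimName(NamedTuple):
    project: str            # e.g. "wikipedia"
    language: str           # e.g. "en"
    selection: str          # e.g. "all"
    flavour: Optional[str]  # e.g. "nopic" (text only); None if absent
    period: str             # e.g. "2024-06"


# Assumed naming pattern: project_language_selection[_flavour]_YYYY-MM.zim
_ZIM_NAME = re.compile(
    r"^(?P<project>[^_]+)_(?P<language>[^_]+)_(?P<selection>[^_]+)"
    r"(?:_(?P<flavour>[^_]+))?_(?P<period>\d{4}-\d{2})\.zim$"
)


def parse_zim_name(filename: str) -> Optional[ZimName]:
    """Return the parts of a .zim file name, or None if it does not fit the pattern."""
    match = _ZIM_NAME.match(filename)
    if match is None:
        return None
    return ZimName(**match.groupdict())


print(parse_zim_name("wikipedia_en_all_nopic_2024-06.zim"))
```

That makes it easy to, say, filter a folder of downloads for the `nopic` ones before pointing the container at them.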
486@lemmy.world 5 weeks ago
The easiest way to do it is by running a Kiwix server and hosting a copy of Wikipedia with that.
A_norny_mousse@feddit.org 5 weeks ago
That site was a little confusing at first!
To be more precise:
486@lemmy.world 5 weeks ago
Correct, you summarized that well.