Does anyone know if an LLM has been trained on something like Sci-Hub?
Comment on Lemmy's List: Downloadable AI, Databases - Critical Knowledge Backup
nix@merv.news 1 year ago
Is it possible to download an archive of Sci-Hub?
Siddhartha-Aurelius@kbin.social 1 year ago
PeachMan@lemmy.world 1 year ago
Sci-Hub is ENORMOUS, about 100TB. If you want to help preserve it, you can torrent and seed one of their many 100GB chunks.
elias_griffin@lemmy.world 1 year ago
What a fantastic resource, this is exactly what is needed. I also found out about The Standard Template Construct Library:
“Learn about how to access large corpus of high-quality scholarly texts using Python and use them in AI apps”
BolexForSoup@kbin.social 1 year ago
Super cool, never knew about this. I've probably got 1-2 TB I can spare for the effort.