Comment on No JS, No CSS, No HTML: online "clubs" celebrate plainer websites
AlteredEgo@lemmy.ml 3 days ago
That is just stupid. How about slightly more complex Markdown?
What I really want is a P2P archive of all the relevant news articles of the last few decades, stored as markdown like in Firefox's "reader view". And some super-advanced, LLM-powered text compression, so you can easily store a copy of 20% of them on your PC and share it P2P.
Much of the information on the internet could vanish within months if we face some global economic crisis.
rottingleaf@lemmy.world 3 days ago
Nothing needs to be that advanced; zstd is good enough.
The idea is cool. Pure P2P exchange would be the fallback, with something like BitTorrent trackers as the main mechanism for yielding nodes per space (suppose there's more than one such archive you'd want to replicate) and per partition (if an archive is too big, partitioning might make sense, but then some of what I wrote further down should be reconsidered).
The problem with torrents and similar systems is that people only store what's interesting to them.
If you have to store one humongous archive, search it efficiently, and avoid losing pieces, then, I think, you need a partitioned, roughly equal distribution of it over the nodes.
The key space (suppose keys are hashes of blocks of the whole archive) is partitioned by prefix, so that a node stores an equal number of blocks from every prefix. Within each partition, a node should first store the values closest to its own identifier (a bit like in Kademlia). OK, I'm thinking the first sentence of this paragraph might even be unneeded.
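To make that concrete, here's a toy sketch of the idea, not any real protocol: assign each block to a partition by hash prefix, and rank candidate nodes by Kademlia-style XOR distance to the block's key. All names and the prefix width are made up for illustration.

```python
import hashlib

PREFIX_BITS = 8  # hypothetical: 256 partitions of the key space

def block_key(block: bytes) -> int:
    # Key = SHA-256 hash of the block, as a 256-bit integer.
    return int.from_bytes(hashlib.sha256(block).digest(), "big")

def partition(key: int) -> int:
    # Partition = top PREFIX_BITS bits of the key.
    return key >> (256 - PREFIX_BITS)

def closest_nodes(key: int, node_ids: list[int], k: int = 3) -> list[int]:
    # Kademlia-style closeness: smaller XOR distance means closer,
    # so these nodes should store the block first.
    return sorted(node_ids, key=lambda nid: nid ^ key)[:k]

key = block_key(b"some archived article text")
print(partition(key))                          # which partition holds it
print(closest_nodes(key, [0x1234, 0xABCD, 0x0F0F], k=2))
```

With equal-sized partitions, "store an equal number of blocks from every prefix" just means each node keeps roughly the same count of blocks per partition value.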
The data itself should probably be in some supercool format where you don't need the whole archive to decompress the small part you need: just the beginning with the dictionary, plus the interval in question.
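A rough sketch of that property, using the stdlib's zlib with a preset dictionary (a toy stand-in for something like zstd's trained dictionaries): each block is compressed independently against a shared dictionary, so a node holding just the dictionary and one block can decompress that block alone.

```python
import zlib

# Hypothetical shared dictionary; a real one would be trained on the corpus.
DICT = b"the of and a to in is that it news article government said "

def compress_block(block: bytes) -> bytes:
    # Compress one block independently, against the shared dictionary.
    c = zlib.compressobj(level=9, zdict=DICT)
    return c.compress(block) + c.flush()

def decompress_block(data: bytes) -> bytes:
    # Needs only the dictionary and this one block, not the whole archive.
    d = zlib.decompressobj(zdict=DICT)
    return d.decompress(data) + d.flush()

blocks = [b"the government said that the news is good",
          b"a new article in the news said it is fine"]
stored = [compress_block(b) for b in blocks]
# Any single block round-trips without the others:
assert decompress_block(stored[1]) == blocks[1]
```

The trade-off is the usual one: per-block compression loses cross-block redundancy, which is exactly what the shared dictionary is meant to win back.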
There should also be, as a separate piece of functionality, keyword search inside intervals, so that a search yields the intervals where a given keyword is encountered. Nodes would index the contiguous intervals they can decompress and respond to search requests for those keywords. Ideally it should be possible to decompress a single block given only the dictionary. I suppose I should do my reading on compression algorithms and formats.
The search function could probably also return Google-like context around each hit, depending on the space needed.
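A minimal sketch of both ideas together, assuming intervals are just id-to-text mappings (everything here is invented for illustration): an inverted index from keyword to interval ids, with a snippet of surrounding context in each result.

```python
from collections import defaultdict

def build_index(intervals: dict[int, str]) -> dict[str, list[int]]:
    # Inverted index: keyword -> sorted list of interval ids containing it.
    index = defaultdict(set)
    for iid, text in intervals.items():
        for word in text.lower().split():
            index[word].add(iid)
    return {word: sorted(ids) for word, ids in index.items()}

def search(index, intervals, keyword: str, ctx: int = 20):
    # Yield (interval id, snippet with ctx characters of context).
    keyword = keyword.lower()
    for iid in index.get(keyword, []):
        text = intervals[iid]
        pos = text.lower().find(keyword)
        yield iid, text[max(0, pos - ctx): pos + len(keyword) + ctx]

intervals = {0: "The archive stores compressed news articles",
             1: "Search yields the intervals where a keyword appears"}
idx = build_index(intervals)
for iid, snippet in search(idx, intervals, "intervals"):
    print(iid, snippet)
```

In the real system the index itself would be sharded across nodes along with the intervals it covers, and returning context is what costs the extra space: the node has to decompress the interval to produce the snippet.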
There would also need to be some way to reward contribution, that is, to pay a node owner for storing and serving blocks.