Comment on Is my domain "burnt" when hosting my first Fediverse technology?
sugar_in_your_tea@sh.itjust.works 2 days ago
Yeah, I really don’t get the “everything stores a copy of everything” model. It should instead work like a cache, where the OG instance is the source of truth, and instances just keep a cache of that data. Instances should be able to refresh data, or have no cache at all.
I get the desire to not lose data, but that comes with a huge storage cost. If we want redundancy, we should have dedicated caches instead of everything having a copy.
But hey, the Fediverse exists and I’m too lazy to build something better, so here I am.
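To make the cache idea above concrete, here is a rough sketch of a pull-through cache where the origin instance stays the source of truth. This is not how Lemmy works today; `PullThroughCache`, `fetch_from_origin`, and the TTL value are all made-up names and numbers for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class PullThroughCache:
    """Hypothetical local cache; the origin instance remains the source of truth."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], str],  # assumed: fetches a post body by id
        ttl_seconds: float = 300.0,               # assumed freshness window
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # post_id -> (body, fetched_at)

    def get(self, post_id: str) -> str:
        """Return the cached copy if still fresh, otherwise pull from the origin."""
        now = self._clock()
        entry = self._entries.get(post_id)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]
        body = self._fetch(post_id)
        if self._ttl > 0:  # ttl_seconds=0 means "no cache at all"
            self._entries[post_id] = (body, now)
        return body

    def refresh(self, post_id: str) -> str:
        """Drop the local copy and re-fetch from the origin."""
        self._entries.pop(post_id, None)
        return self.get(post_id)
```

With `ttl_seconds=0` nothing is ever stored locally, so an instance that wants no cache just proxies every read to the origin.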
squaresinger@lemmy.world 2 days ago
Yeah, with the current “everything replicates everything” model Lemmy is close to the workable limit of users.
Currently, there are roughly 50k monthly active users on Lemmy, and the hosting cost is approaching unsustainability for hobby instances with a decent amount of users. Monetizing Lemmy is close to impossible with donations being the only real revenue stream, so there’s pretty much no business case for anything but a hobby instance.
If for some reason even just 1% of Reddit users were to migrate to Lemmy (that would be ~10 million monthly active users), Lemmy would instantly crumble under that load no matter how many new instances were added, since every instance stores everything.
Moderation would also all but collapse since each instance needs to moderate everything as well, due to legal reasons. (If someone posts something illegal on a remote instance and it gets replicated to your instance and you don’t delete it, you are legally liable for it since it’s stored on your server.)
Max_P@lemmy.max-p.me 1 day ago
Technically it wasn’t really designed with megainstances in mind that swallow the entire fediverse.
My instance has no problem whatsoever keeping up and storage is well under control. But we’re few here subscribed to a subset of available communities so my instance isn’t 90% filled with content I don’t care about and will never look at. Also reduces the moderation burden because it’s slow enough I can actually mostly see everything that comes through.
Lemmy itself is also pretty inefficient in that regard; you can very much make software that pulls instead and backfills its local cache as needed.
Even my Reddit subscriptions would be pretty easy on my instance.
squaresinger@lemmy.world 1 day ago
You need really small instances for that to do something. The issue here is not only mega instances, but more crucially mega communities.
If people on your instance subscribe to the top 50 communities you already have more than 50% of the whole lemmy traffic on your instance. And 50 subscriptions isn’t all that much for even a single user.
And mega communities are kinda the whole point of any reddit-like service. The really cool thing about reddit is that no matter how obscure the topic, there’s a subreddit for it with experts in the field. Lemmy is still lacking that for most topics, but that would be where a real Reddit alternative would want to end up.
If you have a look at reddit, they have over 1000 subreddits with over a million subscribers each. Every single one of these subreddits has around 200x the traffic of all of Lemmy combined. So if Lemmy were to grow to Reddit levels and a single user subscribes to a single community like that, your whole instance is cooked.
sugar_in_your_tea@sh.itjust.works 2 days ago
Yup, and this has been my chief concern since I came to Lemmy 2-3 years ago and read the implementation details. And it’s not something that can be patched in easily or I’d work on it, it’s a fundamental design choice.
I began working on a distributed alternative, but quickly ran into issues in the design phase that Plebbit is currently running into: moderation is a tough nut to crack. I have ideas on how to mostly solve it, but between a full-time job and young kids, it just hasn’t been a priority.
I hope someone with more time than me can tackle it, especially since I’m not 100% confident in my own solution.
squaresinger@lemmy.world 2 days ago
It’s a worthy and huge endeavor.
I think the only somewhat sustainable way to get around the moderation problem is to get rid of storing a copy of everything.
That way you don’t have incriminating data on your server and then you just mark every external community as “out of bounds of my moderation, there be dragons, go at your own risk” and call it a day.
If a Lemmy/ActivityPub alternative was designed from the ground up, a decent option would be to limit federation to single sign-on and private messages. In that case, when you visit a remote community, the client goes directly to that remote community to fetch data from there.
Basically like a set of separate forums with a federated login.
That would solve the “everything copies everything” issue and the “everyone has to moderate everything” issue as well. If someone posts illegal crap on a remote instance, that data stays on that remote instance and you aren’t responsible for it. And the users can decide for themselves which communities on which instances fit what they want to look at.
That would mean that if an instance goes down their communities do as well, but that’s (at least to me) less of an issue than the current state. It’s not like these zombie communities work fine right now. With the source instance being down, federation is gone and thus posting on these instances means there’s only a fraction of the audience left.
sugar_in_your_tea@sh.itjust.works 1 day ago
But then you completely lose content when someone disables their account, and most people don’t want to host anyway.
My approach is a P2P service that’s similar to that, but instead of storing your stuff, you store some amount of other peoples’ stuff, such that there are multiple copies of any piece of content distributed randomly across the globe.
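One possible way to pick who holds the copies, purely as an illustration: rank peers by a hash of the content id plus the peer id and take the top few (this is rendezvous hashing, which I'm swapping in for "randomly" since it gives a random-looking but reproducible spread). `pick_replica_holders`, the peer names, and the default of 3 copies are all assumptions, not part of any existing design.

```python
import hashlib
from typing import List


def pick_replica_holders(
    content_id: str, author: str, peers: List[str], copies: int = 3
) -> List[str]:
    """Choose which peers store a piece of content (hypothetical scheme).

    The author is excluded, matching the idea that you store other
    people's stuff rather than your own.
    """
    candidates = [peer for peer in peers if peer != author]

    def score(peer: str) -> str:
        # Every node computes the same score, so all nodes agree on the
        # holders without needing a central coordinator.
        return hashlib.sha256(f"{content_id}:{peer}".encode("utf-8")).hexdigest()

    return sorted(candidates, key=score)[:copies]
```

Because the ranking depends on the content id, different posts land on different sets of peers, so no single peer ends up with everything.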
However, moderation gets tricky here. I think a transitive trust system can work well. Basically, Alice trusts Bob to some degree, Bob trusts Carol to some degree, so Alice trusts Carol to some lesser degree. If content falls below some trust level, Alice doesn’t see it or store it. Bob and Carol don’t even need to be people, there can be bots to detect things like CSAM and other illegal content.
The net result is that everyone’s experience is tailored to them, which hopefully makes things like shilling, trolling, and astroturfing less prevalent for those who curate their trust network more effectively. And this curation doesn’t need to be manual, it can be automatic based on how you react to content. In other words, everyone is a moderator, and you trust people who moderate similarly to you.
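A minimal sketch of how the transitive trust could be computed, assuming trust is a number between 0 and 1 and that trust along a chain is the product of each hop (so it can only shrink with distance). The multiplication rule, the threshold, and the names `transitive_trust` and `is_visible` are my assumptions for illustration, not a settled design.

```python
import heapq
from typing import Dict

# graph[a][b] = how much a directly trusts b, in the range 0..1 (assumed scale)
TrustGraph = Dict[str, Dict[str, float]]


def transitive_trust(graph: TrustGraph, source: str) -> Dict[str, float]:
    """Best trust from `source` to everyone reachable, taking the strongest path."""
    best: Dict[str, float] = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap via negated trust
    while heap:
        neg_trust, node = heapq.heappop(heap)
        trust = -neg_trust
        if trust < best.get(node, 0.0):
            continue  # stale heap entry, a stronger path was already found
        for peer, weight in graph.get(node, {}).items():
            candidate = trust * weight  # trust decays multiplicatively per hop
            if candidate > best.get(peer, 0.0):
                best[peer] = candidate
                heapq.heappush(heap, (-candidate, peer))
    return best


def is_visible(graph: TrustGraph, viewer: str, author: str, threshold: float = 0.5) -> bool:
    """Content is shown (and stored) only if the author clears the viewer's threshold."""
    return transitive_trust(graph, viewer).get(author, 0.0) >= threshold
```

For example, if Alice trusts Bob at 0.9 and Bob trusts Carol at 0.8, Alice's derived trust in Carol is about 0.72, and anyone Alice has no path to gets 0. A CSAM-detection bot would just be another node in the graph that people choose to trust.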
The intent here is to solve a bunch of different problems:
There are certainly issues, such as:
And some interesting side effects:
Given the downsides, I’m not completely convinced it’s worth it, hence the hesitation. But anything that requires users to have a publicly facing server is DOA, so this seems like the most approachable option.