Comment on Delusions of a Protocol
chilicheeselies@lemmy.world 1 day ago
Yeah, I guess you could look at it that way. Each instance would have to scale horizontally to handle the load, which is wasteful.
I could be wrong, but my understanding is that ActivityPub is just a REST API contract that one can implement in order to communicate with the rest of the “network”. It’s simple, but it’s such massive overhead to do this all via HTTP. Pushing all your instance’s events to a dedicated stream and letting the other instances read it can be more performant and handle the load better. The downside, though: who controls the streams?
IRC is the OG of federation; I am sure we could learn something from it and have federated networks that are in turn federated with each other. I dunno, just thinking out loud here.
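To make the "dedicated stream" idea above concrete, here is a minimal in-memory sketch, assuming a single append-only log that instances publish to and that each reader pulls from at its own offset. `EventStream` and `Reader` are hypothetical names for illustration, not part of ActivityPub or any real broker, and the sketch says nothing about who would operate such a stream.

```python
from dataclasses import dataclass, field


@dataclass
class EventStream:
    """Toy append-only log, a stand-in for something like a Kafka topic."""
    events: list = field(default_factory=list)

    def publish(self, event: dict) -> int:
        # Append the event and return its offset in the log.
        self.events.append(event)
        return len(self.events) - 1

    def read_from(self, offset: int, limit: int = 100) -> list:
        # Return up to `limit` events starting at `offset`.
        return self.events[offset:offset + limit]


@dataclass
class Reader:
    """A remote instance that tracks how far it has read."""
    name: str
    offset: int = 0

    def poll(self, stream: EventStream, limit: int = 100) -> list:
        batch = stream.read_from(self.offset, limit)
        self.offset += len(batch)  # advance only past what was actually read
        return batch


stream = EventStream()
for n in range(3):
    stream.publish({"type": "post", "id": n})

slow_reader = Reader("instance-a")
fast_reader = Reader("instance-b")
first_batch = slow_reader.poll(stream, limit=2)
second_batch = slow_reader.poll(stream, limit=2)
everything = fast_reader.poll(stream)
```

Each reader fetches events in batches at its own pace, so the publisher does one write per event no matter how many instances are reading.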
INeedMana@piefed.zip 1 day ago
But is that a limitation of AP?
As far as I understand, one could split a fediverse instance into three parts: data, backend and UI.
The data is not shared 1:1. Each instance gets a copy of the activity and from that creates its own copy. Hence the same post will have a different ID on different instances.
The problem we are talking about is the capability of the backend to process incoming copies. Meaning, as I understand it, the part that serves the local data to the UI should not be the problem.
What if there was a queue at the front, and a scalable ingestion worker was split off from the backend? Those workers would only apply the incoming actions to the data. Probably with per-community(?) FIFO topics/partitions, so we can process data in parallel and not worry about an updoot for a post that does not exist yet.
Those would still be fairly easy to deploy and be vertically scalable, right?
Or is there some bottleneck in the protocol itself?
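The queue-plus-workers idea above can be sketched roughly as follows. This is a toy, in-memory version under stated assumptions: `IngestQueue`, `partition_for` and `drain` are hypothetical names, the activity dicts are invented for illustration, and FIFO only preserves arrival order, so it assumes the sending instance emits a post before the votes on it.

```python
import zlib
from collections import deque

NUM_PARTITIONS = 4  # assumed value, purely illustrative


def partition_for(community: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stable hash, so every activity for one community lands in one partition.
    return zlib.crc32(community.encode("utf-8")) % num_partitions


class IngestQueue:
    """Front queue: one FIFO per partition."""

    def __init__(self, num_partitions: int = NUM_PARTITIONS):
        self.partitions = [deque() for _ in range(num_partitions)]

    def enqueue(self, activity: dict) -> None:
        index = partition_for(activity["community"], len(self.partitions))
        self.partitions[index].append(activity)


def drain(partition: deque, scores: dict) -> list:
    """Worker loop for one partition; returns votes it could not apply."""
    deferred = []
    while partition:
        activity = partition.popleft()
        if activity["type"] == "create":
            scores[activity["post"]] = 0
        elif activity["post"] in scores:
            scores[activity["post"]] += 1
        else:
            deferred.append(activity)  # vote for a post we have not seen
    return deferred


queue = IngestQueue()
for activity in [
    {"type": "create", "community": "linux", "post": "p1"},
    {"type": "vote", "community": "linux", "post": "p1"},
    {"type": "create", "community": "python", "post": "p2"},
    {"type": "vote", "community": "python", "post": "p2"},
    {"type": "vote", "community": "python", "post": "missing"},
]:
    queue.enqueue(activity)

scores = {}
deferred = []
# In a real deployment each partition would get its own worker process.
for partition in queue.partitions:
    deferred.extend(drain(partition, scores))
```

Because a community always maps to the same partition, workers on different partitions can run in parallel without reordering the activities of any single community.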
chilicheeselies@lemmy.world 1 day ago
So let’s say there are 100 instances. My instance needs to issue API requests to each instance to sync with the network. They in turn need to issue 100 requests to me to sync (and to each other). What about when there are 100k instances? The request count grows quadratically, roughly N² for N instances.
From the looks of AT, it’s fairly linear because it’s really just operating on a set of giant event streams (like Kafka).
To me, ActivityPub being based on REST APIs was always a problem. On the upside it makes it approachable, but it’s not really the right tech imo. Use something without the overhead of HTTP headers and whatnot.
poVoq@slrpnk.net 21 hours ago
This falsely assumes that everything gets federated to everyone, which isn’t the case for ActivityPub. You only get what you actually subscribe to with it.
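The two positions can be compared with some back-of-the-envelope arithmetic. This is a simplified model, not a measurement: it counts one delivery per (origin, destination) pair per sync round, and the function names and the sample subscription map are made up for illustration.

```python
def full_mesh_deliveries(instance_count: int) -> int:
    # Worst case: every instance delivers to every other instance.
    return instance_count * (instance_count - 1)


def subscription_deliveries(subscribers_by_origin: dict) -> int:
    # Subscription model: an origin only delivers to remote instances
    # that have at least one subscriber to its content.
    return sum(
        len(remotes - {origin})
        for origin, remotes in subscribers_by_origin.items()
    )


small_mesh = full_mesh_deliveries(100)
large_mesh = full_mesh_deliveries(100_000)

# Hypothetical network: "a" is followed from "b" and "c", "b" only from "a",
# and nobody outside "c" subscribes to "c".
sample = {"a": {"b", "c"}, "b": {"a"}, "c": set()}
subscribed = subscription_deliveries(sample)
mesh_for_sample = full_mesh_deliveries(len(sample))
```

In this model the full mesh grows quadratically with the number of instances, while the subscription count depends on who actually follows whom and can be far smaller.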
INeedMana@piefed.zip 1 day ago
Wouldn’t that mean that this stream will have to scale horizontally?
chilicheeselies@lemmy.world 1 day ago
Yes, eventually, just like the instances do once enough users are hitting them. It’s a matter of how much all servers in the network need to scale, but also the nature of the protocol itself. Streaming binary data is more performant than individual HTTP API requests, for instance. Event streams are the way to go in a decentralized network for sure.