Comment on [Announcement] Ani.Social-Lemmy.World Federation Issues
wjs018@ani.social 6 months ago
Alright, I have been doing some poking around the grafana dashboard and noticed that about 20k activities/hour (~ 6 per second) seems to be the limit that ani.social can process coming in from lemmy.world. Whenever the activity peaks on world go over that (generally EU afternoon/NA morning), we start to lag a bit. Then, after the peak has subsided, we catch up.
All this really seems like it is putting a pretty hard limit on how big the fediverse could actually grow without federation becoming completely impossible. I was reading up on efforts that reddthat has undertaken to improve federation from world (since they are in AUS). Their EU-based proxy seems to have worked well, but even with batching like this, federation is always going to be a lot of bandwidth and message passing between servers that just might not scale past a certain point. Anyway, I am off topic.
In any case, the lag seems like it will be coming and going with a bit of regularity, kind of like fediverse tides.
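To make the "tides" picture concrete, here is a minimal sketch of how a backlog builds and drains around a fixed processing ceiling. The ceiling is the roughly 20k activities/hour read off the dashboard; the hourly incoming figures are invented for illustration and are not real lemmy.world numbers.

```python
# Ceiling taken from the dashboard observation above (~20k activities/hour).
CAPACITY_PER_HOUR = 20_000


def simulate_backlog(incoming_per_hour, capacity=CAPACITY_PER_HOUR):
    """Return the number of unprocessed activities at the end of each hour."""
    backlog = 0
    history = []
    for incoming in incoming_per_hour:
        # Leftover work plus new arrivals share this hour's capacity;
        # the backlog can never drop below zero.
        backlog = max(0, backlog + incoming - capacity)
        history.append(backlog)
    return history


# Hypothetical daily peak: quiet, busy, then quiet again.
incoming = [15_000, 18_000, 24_000, 26_000, 22_000, 17_000, 14_000, 12_000]
print(simulate_backlog(incoming))
# [0, 0, 4000, 10000, 12000, 9000, 3000, 0]
```

The backlog keeps growing for as long as arrivals exceed the ceiling, peaks when they drop back under it, and then takes a few quieter hours to clear.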
MentalEdge@ani.social 6 months ago
The latency limit is caused by the activity queue that was introduced in v0.19.
Servers can only talk as fast as the round-trip time allows, because Lemmy instances now keep track that each event actually does get federated, and in the right order.
That last point means each event only gets sent once acknowledgement of the last one is received, creating a hard limit for how many events can be communicated, depending on ping. A mere two per second with a latency of 500ms.
This serial process will obviously need to be parallelized. But that’s difficult.
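The arithmetic behind that limit can be sketched in a few lines. This is a simplified model, not Lemmy's actual code: it assumes the round trip is the only cost (no processing time on either side), and the `in_flight` parameter is a hypothetical knob standing in for whatever form parallel sending might take.

```python
def max_events_per_second(round_trip_ms, in_flight=1):
    """Upper bound on events/second when every send waits for an acknowledgement.

    in_flight=1 models the strictly serial queue described above.
    Larger values model a hypothetical sender that allows several
    unacknowledged events at once.
    """
    if round_trip_ms <= 0:
        raise ValueError("round_trip_ms must be positive")
    if in_flight < 1:
        raise ValueError("in_flight must be at least 1")
    return in_flight * 1000 / round_trip_ms


print(max_events_per_second(500))               # 2.0, the figure quoted above
print(max_events_per_second(500, in_flight=4))  # 8.0 with four events in flight
```

Under the same model, sustaining 20k activities/hour serially needs a round trip of at most 3600 * 1000 / 20000 = 180 ms per event. Raising `in_flight` lifts the ceiling on paper, but it says nothing about how to keep events in the right order, which is the hard part.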