Can you explain what you think “backfill” means in the context of the linked post?
Comment on Bluesky has started honoring takedown requests from Turkish government
73ms@sopuli.xyz 3 days ago
Your “example of self hosting” is not an example of self hosting the relay, just an appview which is still fully dependent on other Bluesky services like the relay. It’s pretty unlikely that the relay would be at all practical to host on an RPi5. But even if it were, the problem still remains that the network is set up in a way where self-hosting it only results in you creating your own separate bubble, not meaningfully participating in the official one.
I also doubt anyone has self-hosted relays long-term, since right now there’s very little purpose to that and the resource requirements are massive, with the disk space required growing at a fast pace.
aeshna_cyanea@lemm.ee 3 days ago
73ms@sopuli.xyz 3 days ago
I have zero need to play games with you. Make your case if you have one.
sp3ctr4l@lemmy.dbzer0.com 3 days ago
Backfill means that the AppView has to request and download and then be able to present… the entire history of all posts from everyone on BlueSky.
If you are familiar with crypto, it’s like how you have to download either the entire blockchain, or nowadays, a trimmed down/compressed version of it… before you can interact with it.
If you are familiar with any kind of database like a forum or something… when migrating, you have to actually import a copy of all the preexisting users, posts, forum structure, etc… if you want the new forum to actually contain what the old forum did.
When this rando is setting up his own AppView… he is asking the BlueSky Relays to give his AppView all the older posts, before the AppView is caught up, and can then begin to function in realtime with the rest of the network.
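To make the "catch up first, then go realtime" idea concrete, here is a minimal Python sketch. It assumes every event carries an increasing sequence number. The function name `backfill_then_follow` and the `(seq, record)` tuple shape are hypothetical and for illustration only, not the actual Bluesky API.

```python
from typing import Iterable, Iterator, Tuple

Event = Tuple[int, str]  # (sequence number, record payload); hypothetical shape


def backfill_then_follow(history: Iterable[Event], live: Iterable[Event]) -> Iterator[Event]:
    """Replay stored history first, then continue with the live stream.

    Live events that were already covered by the backfill are skipped,
    so the consumer sees each sequence number once.
    """
    last_seq = -1
    # Phase 1: backfill, i.e. import everything that already exists.
    for seq, record in history:
        last_seq = seq
        yield seq, record
    # Phase 2: follow the live stream, ignoring anything already imported.
    for seq, record in live:
        if seq <= last_seq:
            continue
        last_seq = seq
        yield seq, record


old_posts = [(1, "first post"), (2, "second post"), (3, "third post")]
new_posts = [(3, "third post"), (4, "fourth post")]  # overlaps with the backfill
caught_up = list(backfill_then_follow(old_posts, new_posts))
```

The expensive part in practice is phase 1, since it covers the whole history, while phase 2 only has to keep pace with new posts.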
aeshna_cyanea@lemm.ee 2 days ago
yeah that was my bad I might be a little stupid.
I got confused between the relay crawling pdses to store their records for retransmission vs an appview storing records for serving them to clients. Both involve processing similar (v large) amounts of data, but the latter is actually more expensive because you’re also transforming the data to make it useful to clients.
They have said that in production, their appview takes up about 30 servers while the relay takes only one. Apologies for the rude tone of the post bsky.app/profile/why.bsky.team/…/3lku2o3n6es22
Natanael@infosec.pub 3 days ago
The whole architecture is built around content addressing, letting every account hosting server (PDS) talk to multiple relays, and allowing mirroring.
The whole point is to NOT create bubbles.
People already run their own PDS servers and participate with the official bluesky network, and can talk to users there, because their self hosted PDS syncs to the bluesky relay.
If you run your own relay and appview it STILL works, and you can talk without bubbles, if you still link your PDS to the bluesky relay to make yourself visible to their users, and if you set your appview / relay to retrieve content from the bluesky relay then you see content from bluesky users too.
Self hosted relays do exist, they’re just not open to the public (mostly used for archival / development currently)
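As a rough illustration of why content addressing makes mirroring workable: if a record's identifier is derived from its content, a copy fetched from any relay or mirror can be checked against that identifier. The sketch below uses SHA-256 over canonical JSON as a stand-in. That encoding is an assumption for illustration and not the exact format the network uses, and `content_id` and `verify` are hypothetical names.

```python
import hashlib
import json


def content_id(record: dict) -> str:
    """Derive an identifier from the record's content (stand-in scheme)."""
    # Sort keys and drop whitespace so equal records always serialize identically.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(record: dict, expected_id: str) -> bool:
    """Check that a copy from any source matches the identifier we expected."""
    return content_id(record) == expected_id


original = {"author": "did:example:alice", "text": "hello"}
post_id = content_id(original)

copy_from_mirror = {"text": "hello", "author": "did:example:alice"}
tampered_copy = {"author": "did:example:alice", "text": "hello!"}
```

Here `verify(copy_from_mirror, post_id)` succeeds regardless of which server supplied the copy, while `verify(tampered_copy, post_id)` fails.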
sp3ctr4l@lemmy.dbzer0.com 3 days ago
Blorgbob exists, and people use their own blorgbobs, but also people are not allowed to use blorgbobs, and they are only in archives or experimental development.
… Please tell me you understand you have just said completely self contradictory nonsense.
Leaving the actual truth or falsity of your claim aside… what you have just stated is a logically impossible paradox.
Natanael@infosec.pub 3 days ago
Choosing to not understand the architecture is your failure, not mine
sp3ctr4l@lemmy.dbzer0.com 3 days ago
Ok, noted, you are a fanatic who does not understand that the statement of yours I replied to literally is a logically impossible paradox.
Take a few deep breaths and … maybe try to reformulate your words.
73ms@sopuli.xyz 3 days ago
PDS is not very significant, it’s just a tiny piece of the puzzle and doesn’t really prove anything about the architecture. See this for more on what I’m getting at: neuromatch.social/@jonny/113365406995624763
Natanael@infosec.pub 3 days ago
That post is very misguided.
First of all, he’s saying “you SHOULD make your PDS invisible to the bluesky servers because otherwise what’s the point”, but that’s exactly equivalent to saying “our community wants its own Mastodon server - that means we MUST defederate Mastodon.social or what’s the point?”
That’s nonsense. Don’t enforce silos on people.
Also, which relays to support is not chosen by users; it’s chosen by the services the users choose. The PDS chooses which relays to sync to, and the appview does too, just like feed generators and moderation labelers do.
Also moderation labelers can be shared.
Hosting a PDS is very cheap, it’s just storage and bandwidth for the posts multiplied by the number of relays you directly sync to. With a few users on each that’s nothing. It’s in the range of free tier VPS hosting, RPi grade.
Deduplicating is probably the most trivial part. There’s already code for handling duplicate events in streams. But more practically speaking, there are algorithms like set reconciliation which can make it significantly more bandwidth efficient to subscribe to multiple relays even when they have overlapping content.
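A minimal sketch of the simple case, dropping duplicate events when reading from several relays with overlapping content. It is plain seen-set deduplication, not set reconciliation, so it saves no bandwidth. The event fields `repo` and `rev` and the function name `merge_relays` are assumptions for illustration.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple

Event = Dict[str, str]  # hypothetical shape: {"repo": ..., "rev": ..., "text": ...}


def merge_relays(*streams: Iterable[Event]) -> Iterator[Event]:
    """Yield each event once, even if several relays carry it."""
    seen: Set[Tuple[str, str]] = set()
    for stream in streams:
        for event in stream:
            # Assumption: (repo, rev) uniquely identifies an event.
            key = (event["repo"], event["rev"])
            if key in seen:
                continue
            seen.add(key)
            yield event


relay_a = [
    {"repo": "did:example:alice", "rev": "1", "text": "hi"},
    {"repo": "did:example:bob", "rev": "1", "text": "yo"},
]
relay_b = [
    {"repo": "did:example:bob", "rev": "1", "text": "yo"},   # duplicate of relay_a
    {"repo": "did:example:carol", "rev": "1", "text": "hey"},
]
merged = list(merge_relays(relay_a, relay_b))
```

The `seen` set grows without bound here. A long-running consumer would need to expire old keys or track a per-repo high-water mark instead.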
73ms@sopuli.xyz 3 days ago
I don’t think you got the point tbh. It isn’t about wanting to separate but about how dependent you are on Bluesky Corp. in every other scenario (and how hard it would be to deal with the situation if they decide to go rogue).