You can scrape Lemmy instances for training data without even running an instance.
jeffhykin@lemm.ee 10 months ago
I think we can actually give Facebook the bad end of the bargain IF and ONLY IF we have a licence/copyright protection.
You know how powerful copyleft was for open source? I think we can do the same for Lemmy servers. We can agree that the data on a particular server cannot be used for training LLMs, advertisements, marketing profiles, etc., and make it legally binding.
Even if we don’t federate with them, they can still harvest the data, so we should add these protections regardless. Maybe there is already something like this and I’m just unaware of it.
AustralianSimon@lemmy.world 10 months ago
jeffhykin@lemm.ee 10 months ago
Yeah, sorry if I’m not great at communicating it but that’s exactly what I’m trying to point out when I said:
Even if we don’t federate with them, Meta can still harvest the data so we should add these protections regardless.
AustralianSimon@lemmy.world 10 months ago
That’s the thing, anything public is fair game. This is why Reddit is ruining their API.
jeffhykin@lemm.ee 10 months ago
It’s not fair game for for-profit businesses training LLMs. That’s part of why Reddit made the move: so that companies would need to pay Reddit for access to the data in order to train models on it legally.
Masimatutu@lemm.ee 10 months ago
What does lemmy.world being the biggest have to do with any of this?
jeffhykin@lemm.ee 10 months ago
As opposed to a Facebook-controlled server being the top search result for Lemmy.
Masimatutu@lemm.ee 10 months ago
I think this is the wrong take. If we want Lemmy to be truly community-controlled, we need many small servers, as opposed to the current situation of one server controlling half the userbase. Also, which server is Facebook-controlled? Lemmy.world is in the minority by federating with Threads.
pennomi@lemmy.world 10 months ago
Yep, on a public forum like this we lose very little privacy by federating with them. What we do stand to lose is comment and post quality, but that’s trivial to fix by simply blocking Threads on a personal level.