Comment on AI’s Unpaid Debt: How LLM Scrapers Destroy the Social Contract of Open Source
yoasif@fedia.io 6 days ago
You can't "train" on code you haven't copied. That is kind of obvious, right? So did they have the right to copy and then reproduce the work without attribution?
melfie@lemy.lol 6 days ago
Yeah, I guess this is a bit of a gray area. With the GPL, you only have rights to the code if it was distributed to you. If GPL code has only been distributed to select people, and none of them has distributed it to the general public, but GitHub still trained its models on the private repo, then that would technically violate the license. This would be a niche scenario, though, since the intent is normally public distribution.