Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates

FatCrab@lemmy.one 2 months ago

No, this is mostly incorrect, sorry. The commercial aspect of the reproduction is not relevant to whether it is an infringement; it is simply a factor in damages and in the Fair Use defense (an affirmative defense that presupposes infringement).

What you are getting at when it applies to this particular type of AI is effectively whether it would be a fair use, presupposing there is copying amounting to copyright infringement. And what I am saying is that, ignoring certain stupid behavior like torrenting a shit ton of text to keep a local store of training data, there is no copying happening as a matter of necessity. There may be copying as a matter of stupidity, but it isn’t necessary to the way the technology works.

Now, I know, you’re raging and swearing right now because you think that downloading the data into cache constitutes an unlawful copying, but it presumably does not if it is accessed like any other content on the internet. Intent is not a part of what makes that a lawful or unlawful copying, and once a lawful distribution is made, principles of exhaustion begin to kick in and we start getting into really nuanced areas of IP law that I don’t feel like delving into with my thumbs. Ultimately, the point is that it isn’t “basic copyright law.” But if intent is determinative of whether there is copying in the first place, how does that jibe with an actor not making copies for themselves but rather accessing retained data in a third party’s cache after they grab the data for noncommercial purposes? Also, how does that make sense if the model is being trained for purely research purposes? And then perhaps that model is leveraged commercially after development? Your analysis, assuming it’s correct arguendo, leaves far too many outstanding substantive issues to be the ruling approach.
