Comment on AI industry horrified to face largest copyright class action ever certified

Jason2357@lemmy.ca 2 days ago

Take scraping. Companies like Clearview will tell you that scraping is legal under copyright law. They’ll tell you that training a model with scraped data is also not a copyright infringement. They’re right.

I love Cory’s writing, but while he does a masterful job of defending scraping, and makes a good argument that in most cases it’s laws other than copyright that should be the battleground, he does, kinda, trip over the main point.

The main point is that training models on creative works, and then selling access to the derivative “creative” works those models output, very much falls within the domain of copyright - on one side or the other of a grey line we usually call “fair use” that hasn’t really been tested in the courts.

Let’s take two absurd extremes to make the point. Say I train an LLM directly on Marvel movies, and then sell movies (or maybe movie scripts) that are almost identical to existing Marvel movies (maybe with a few key names and features altered). I don’t think anyone would argue that this is not a derivative work, or that it falls under “fair use.” However, if I used literature to train my LLM to be able to read, and used that to read street signs for my self-driving car, well, yeah, that might be something you could argue is “fair use” to sell. It’s not producing copy-cat literature.

I agree with Cory that scraping, per se, is absolutely fine, and so is re-distributing the results in ways that are in the public interest or fall under “fair use”, but it’s hard to justify the slop machines as not a copyright problem.

In the end, yeah, fuck both sides anyway. Copyright was extended too far and used for far too much, and the AI companies are absolute thieves. I have no illusions: this type of court case won’t do anything more than shift wealth from one robber baron to another, and it won’t help artists and authors.
