Your interpretation of copyright law would be helped by reading this piece from an EFF lawyer who has actually litigated copyright cases in the past:
Comment on AI Music Generator Suno Admits It Was Trained on ‘Essentially All Music Files on the Internet’
Hildegarde@lemmy.world 5 months ago
One of the four fair use factors is the portion of the copyrighted work that was taken. For a finding of fair use under this factor, the infringing work must only take the amount of copyrighted material needed for the infringing work’s purpose.
If they ripped every single file they have access to, there’s no way it would be found to be fair use under this factor. If they argued they were using a curated list of only the works they needed to develop their model, it could be fair use, but admitting to taking every possible work in its entirety is a surefire way to fail a fair use defense.
kromem@lemmy.world 5 months ago
Vlyn@lemmy.zip 5 months ago
But that’s not how model training works; it doesn’t simply copy and paste entire songs into its training data. It more or less “listens” to each song and analyzes it, and when you ask it to create a rock song, for example, it just has an algorithm behind it that has learned what a song like that would sound like.
But you can’t just ask it to generate Bohemian Rhapsody from its data. It would probably get very close depending on the training, but it would never be 100% the same (unless the model was only trained on this one song).
Just like you can listen to rock songs and then make your own, that’s totally valid. The problem here is of course automation and scale, but saying it’s not fair use is dubious.
Hildegarde@lemmy.world 5 months ago
Fair use is a legal doctrine relating to derivative works based on copyrighted works. An AI model’s fair use determination would be judged by the same standards as all derivative works.
It doesn’t matter how they used the copyrighted works. This factor is about scale not intent.
There are four factors, and no single factor is determinative. But admitting their model uses as much training data as possible makes their model less likely to be fair use.
Vlyn@lemmy.zip 5 months ago
If I as a human listened to every single song of a band from start to finish, then produced a similar song in the same vein (lyrics / music genre), it would be fair use.
So why would it stop being fair use if an AI does the same thing? The only difference is that the AI can listen to every song of this band and a million other bands, combining them.
Hildegarde@lemmy.world 5 months ago
Because fair use is an affirmative defense to copyright infringement. To use a fair use defense you have to admit your work is infringing, but argue that the infringement is justifiable.
Trying to defend the AI with fair use requires you to admit the AI itself is infringing, but argue that the infringement is justifiable, and under the doctrine of fair use, it almost certainly is not.
Only humans can hold copyrights. Your example would be a non-infringing work because it lacks direct copying. An AI doing the same would make an uncopyrightable work, with the AI itself being infringing if you tried the fair use defense.