Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates
FatCrab@lemmy.one 2 months ago
ML techniques have been very useful in compression, yes, but it’s sort of nuts to say that a data structure is storing particularized creative works in a particularly identifiable manner when it encodes only relationships between values (sometimes overly so for certain regions of its latent space/embedding space/semantic space/whatever you want to call it right now) rather than the value sequences themselves, as it would if it were storing contiguous copyright-protected works.
GiveMemes@jlai.lu 2 months ago
Except that, again, as is literally written in the comment you’re directly replying to, it has been shown that AI can reproduce copyrightable works word for word, showing that it objectively and necessarily is storing particular creative works in a particularly identifiable manner, whether or not that manner is yet known to humans.
Hackworth@lemmy.world 2 months ago
It’s called learning, and I wish people did more of it.
sugar_in_your_tea@sh.itjust.works 2 months ago
You don’t learn by memorizing and reproducing works, you learn by understanding the concepts in various works and producing new works that are combinations of the ideas in those other works. AI doesn’t understand, and it has been shown to be able to reproduce works, so I think it’s fair to say that it’s doing a lot of “memorizing” and therefore plagiarism.
Hackworth@lemmy.world 2 months ago
Calling what attention transformers do memorization is wildly inaccurate.
GiveMemes@jlai.lu 2 months ago
Learning is not being able to reproduce a news article word for word.
FatCrab@lemmy.one 2 months ago
No, it isn’t storing that information in that sequence. What is happening is that it is overly encoding those particular sequential relationships along some arbitrary but tightly mapped semantic concepts represented by dimensions in a massive vector space. It is storing copies of the information in the way that inadvertent copying of music might be based on “memorized” music listened to by the infringing artist in the past.
GiveMemes@jlai.lu 2 months ago
Not what I said. I used the exact language the above commenter used because it was specific and accurate.
FatCrab@lemmy.one 2 months ago
Yes, inadvertent copying is still copying, but it would be copying in the output and is not evidence of copying happening in the creation of the model. That was why I used the music example, because it is rather probative of where there could be grounds for copyright infringement related to these model architectures. This may not seem like an important distinction, but it has significant consequences for who is ultimately liable and how.