Comment on Grisham, Martin join authors suing OpenAI: “There is nothing fair about this”

livus@kbin.social ⁨1⁩ ⁨year⁩ ago

@DogMuffins

As I understand it, it's not about the output, it's about the input.

Same basic principle as why universities don't simply give students a pirated copy of the entire textbook.

Akisamb@programming.dev ⁨1⁩ ⁨year⁩ ago
Same principle as why Google can index pretty much all books in existence. They were sued over this and won. The same thing will happen here.

As long as these models are not providing the copyrighted material to their users they should be safe.

- livus@kbin.social ⁨1⁩ ⁨year⁩ ago
  @Akisamb yes I think you're right.
  
  That one was really interesting too. Google has put a few more limits on its full text since those days, but I don't know how much of that was a result of the suit.
  
  Authors from my country tried to bring a group lawsuit against Google, because here, if your books are in public libraries, you get yearly compensation based on how many copies are in circulation.
  
  But Google and America are both a lot richer and more powerful than a handful of authors from New Zealand, so I don't know what they thought they could achieve.
  
hypelightfly@kbin.social ⁨1⁩ ⁨year⁩ ago
Universities giving away pirated textbooks is output.

- livus@kbin.social ⁨1⁩ ⁨year⁩ ago
  @hypelightfly It's input into the students' brains.
  
DogMuffins@discuss.tchncs.de ⁨1⁩ ⁨year⁩ ago
Hmm… if that were true then why would they be expecting $150k per book?

My understanding is that in copyright infringement, the liability is the amount of income you have deprived the author of. If you’ve only copied one book and not produced derivative works, then the loss is the value of the book.

- livus@kbin.social ⁨1⁩ ⁨year⁩ ago
  @DogMuffins
  
  This is why it's such an interesting case. We think of the LLM as one entity, but another way of looking at it is that it's a lot of iterations.
  
  The pirated book text files that are doing the rounds with the AI seem to be multiple iterations as well.
  
  One $150 textbook x 1,000 students = $150k.
  
  Even with public libraries, they pay volume licenses for ebooks based on how many "copies" can be lent simultaneously per year. When digital files are being lent, the library isn't considered to own just one copy.
  
  - DogMuffins@discuss.tchncs.de ⁨1⁩ ⁨year⁩ ago
    Sorry mate this is just daft.
    
    Did you read the article? Most of it is about derivative works.
    
    LLMs fuel AI tools that “can spit out derivative works: material that is based on, mimics, summarizes, or paraphrases” their works, allegedly turning their works into “engines of” authors’ “own destruction” by harming the book market for them.
    
    They’re trying to claim that they have been financially harmed by the unauthorised use of their work.
    
    Even if LLMs are separate iterations you could train multiple LLMs with one copy of the book - the library is not loaning multiple copies simultaneously.
    
    livus@kbin.social ⁨1⁩ ⁨year⁩ ago
    @DogMuffins
    
    Correct me if I'm wrong, but I was under the impression they don't yet have a derivative work that's close enough to claim plagiarism. Without that, they don't have a leg to stand on.
    
    This is the only part I think could potentially hold water:
    
    “Authors should have the right to decide when their works are used to ‘train’ AI. If they choose to opt in, they should be appropriately compensated.”
    
    As for libraries:
    
    "the library is not loaning multiple copies simultaneously."
    
    I'm not sure why you're saying that.
    
    With Wheelers' and other big ebook platforms the business model completely depends on the concept of multiple copies. This isn't my opinion, it's just a fact about how they bill libraries.
    
    Even with physical copies, libraries sometimes buy several (and in my country publishers get special compensation based on how many copies are in public libraries per year).
    