Comment on Judge Rules Training AI on Authors' Books Is Legal But Pirating Them Is Not

LoreleiSankTheShip@lemmy.ml ⁨9⁩ ⁨months⁩ ago

As long as they don’t use exactly the same words in the book, yeah, as I understand it.

vane@lemmy.world ⁨9⁩ ⁨months⁩ ago
How would they not use the same words as the book? That’s not how LLMs work. They use exactly the same words if the probabilities align. It’s proved by this study: arxiv.org/abs/2505.12546

- nednobbins@lemmy.zip ⁨9⁩ ⁨months⁩ ago
  I’d say there are two issues with it.
  
  First, it’s a very new article with only 3 citations. The authors seem like serious researchers, but the paper itself is still in the “hot off the presses” stage and wouldn’t qualify as “proven” yet.
  
  It also doesn’t exactly say that books are copies. It says that in some models, it’s possible to extract some portions of some texts. They cite “1984” and “Harry Potter” as two books that can be extracted almost entirely, under some circumstances. They also find that, in general, extraction rates are below 1%.
  
  - vane@lemmy.world ⁨9⁩ ⁨months⁩ ago
    Yeah, but it’s just a start toward reversing the process and proving that there is no AI. We’ve only just started with generating text; I bet people will figure out how to reverse the process by using some sort of Rosetta stone. It’s just probabilities, after all.
    
    nednobbins@lemmy.zip ⁨9⁩ ⁨months⁩ ago
    That’s possible but it’s not what the authors found.
    
    They spend a fair amount of the conclusion emphasizing how exploratory and ambiguous their findings are. The researchers themselves are very careful to point out that this is not a smoking gun.
    
- SufferingSteve@feddit.nu ⁨9⁩ ⁨months⁩ ago
  The “if” is working overtime in your statement
  