I’m not pirating. I’m building my model.
Judge backs AI firm over use of copyrighted books
Submitted 7 hours ago by Davriellelouna@lemmy.world to technology@lemmy.world
https://www.bbc.com/news/articles/c77vr00enzyo
Comments
gedaliyah@lemmy.world 7 hours ago
QuadratureSurfer@lemmy.world 5 hours ago
To anyone reading this comment without reading through the article: this ruling doesn’t mean that it’s okay to pirate for building a model. Anthropic will still need to go through trial for that:
But he rejected Anthropic’s request to dismiss the case, ruling the firm would have to stand trial over its use of pirated copies to build its library of material.
Artisian@lemmy.world 2 hours ago
I also read through the judgement, and I think it’s better for Anthropic than you describe. He distinguishes three issues:
A) Use any written material they get their hands on to train the model (and the resulting model doesn’t just reproduce the works).
B) Buy a single copy of a print book, scan it, and retain the digital copy for a company library (for all sorts of future purposes).
C) Pirate a book and retain that copy for a company library (for all sorts of future purposes).
A and B were fair use by summary judgement, meaning this judge thinks it’s clear-cut in Anthropic’s favor. C will go to trial.
the_q@lemmy.zip 6 hours ago
An 80-year-old judge on their best day couldn’t be trusted to make an informed decision. This guy was either bought or confused into his decision. Old people gotta go.
Grimy@lemmy.world 7 hours ago
80% of the book market is owned by 5 publishing houses.
They want to create a monopoly around AI and kill open source. The copyright industry is not our friend. This is a win, not a loss.
OmegaMouse@pawb.social 7 hours ago
What, how is this a win? Three authors lost a lawsuit to an AI firm using their works.
ShittyBeatlesFCPres@lemmy.world 46 minutes ago
It would harm the A.I. industry if Anthropic loses the next part of the trial on whether they pirated books. From what I’ve read, Anthropic and Meta are suspected of getting a lot off torrent sites and the like.
It’s possible they all did some piracy in their mad dash to find training material but Amazon and Google have bookstores and Google even has a book text search engine, Google Scholar, and probably everything else already in its data centers. So, not sure why they’d have to resort to piracy.
Grimy@lemmy.world 3 hours ago
The lawsuit would not have benefitted their fellow authors, but their publishing houses and the big AI companies.
sentient_loom@sh.itjust.works 7 hours ago
How exactly does this benefit “us” ?
gaylord_fartmaster@lemmy.world 5 hours ago
Because books are used to train both commercial and open source language models?
hendrik@palaver.p3x.de 7 hours ago
Keep in mind this isn't about open-weight vs other AI models at all. This is about how training data can be collected and used.
bob_omb_battlefield@sh.itjust.works 6 hours ago
If you aren’t allowed to freely use data for training without a license, then the fear is that only large companies will own enough works or be able to afford licenses to train models.
Grimy@lemmy.world 3 hours ago
Because of the vast amount of data needed, there will be no competitive, viable open source solution if half the data is kept in a walled garden.
This is about open weights vs closed weights.
SonOfAntenora@lemmy.world 6 hours ago
Cool then, try to do some torrenting out there and don’t hide it. Tell us how it goes.
MyOpinion@lemmy.today 3 hours ago
I hate AI with a fire that keeps me warm at night. That is all.
AbouBenAdhem@lemmy.world 6 hours ago
IMO the focus should have always been on the potential for AI to produce copyright-violating output, not on the method of training.
Artisian@lemmy.world 2 hours ago
Plaintiffs made that argument and the judge shot it down pretty hard. That kind of competition isn’t what copyright protects against. Would love to hear your thoughts on the ruling (it’s linked by Reuters).
SculptusPoe@lemmy.world 6 hours ago
If you try to sell “the new adventures of Doctor Strange, Steven Strange and Magic Man,” existing copyright laws are sufficient and will stop it. Really, training should be regulated by the same laws as reading. If they can get the material through legitimate means it should be fine.
devfuuu@lemmy.world [bot] 6 hours ago
That “freely” there really does a lot of hard work.
hendrik@palaver.p3x.de 7 hours ago
Previous discussion: https://programming.dev/post/32802895
BlameTheAntifa@lemmy.world 21 minutes ago
Image: Anakin: “Judge backs AI firm over use of copyrighted books” Padme: “But they’ll be held accountable when they reproduce parts of those works or compete with the work they were trained on, right?” Anakin: “…” Padme: “Right?”