Judge backs AI firm over use of copyrighted books
Submitted 2 weeks ago by Davriellelouna@lemmy.world to technology@lemmy.world
https://www.bbc.com/news/articles/c77vr00enzyo
Comments
the_q@lemmy.zip 2 weeks ago
An 80-year-old judge on their best day couldn’t be trusted to make an informed decision. This guy was either bought or confused into his decision. Old people gotta go.
AwesomeLowlander@sh.itjust.works 2 weeks ago
Funny, there’s a lot of people on lemmy itself (especially around dbzer0) who would agree with the judge wholeheartedly.
Grimy@lemmy.world 2 weeks ago
80% of the book market is owned by 5 publishing houses.
They want to create a monopoly around AI and kill open source. The copyright industry is not our friend. This is a win, not a loss.
OmegaMouse@pawb.social 2 weeks ago
What, how is this a win? Three authors lost a lawsuit to an AI firm using their works.
Grimy@lemmy.world 2 weeks ago
Winning the lawsuit would not have benefitted their fellow authors, only their publishing houses and the big AI companies.
ShittyBeatlesFCPres@lemmy.world 2 weeks ago
It would harm the A.I. industry if Anthropic loses the next part of the trial, on whether they pirated books. From what I’ve read, Anthropic and Meta are suspected of getting a lot of their training material off torrent sites and the like.
It’s possible they all did some piracy in their mad dash to find training material, but Amazon and Google have bookstores, and Google even has a book text search engine, Google Books, and probably has everything else already in its data centers. So I’m not sure why they’d have to resort to piracy.
sentient_loom@sh.itjust.works 2 weeks ago
How exactly does this benefit “us” ?
gaylord_fartmaster@lemmy.world 2 weeks ago
Because books are used to train both commercial and open source language models?
hendrik@palaver.p3x.de 2 weeks ago
Keep in mind this isn't about open-weight vs other AI models at all. This is about how training data can be collected and used.
bob_omb_battlefield@sh.itjust.works 2 weeks ago
If you aren’t allowed to freely use data for training without a license, then the fear is that only large companies will own enough works or be able to afford licenses to train models.
Grimy@lemmy.world 2 weeks ago
Because of the vast amount of data needed, there will be no competitively viable open source solution if half the data is kept in a walled garden.
This is about open weights vs closed weights.
SonOfAntenora@lemmy.world 2 weeks ago
Cool, then go do some torrenting out there and don’t hide it. Tell us how it goes.
AbouBenAdhem@lemmy.world 2 weeks ago
IMO the focus should have always been on the potential for AI to produce copyright-violating output, not on the method of training.
SculptusPoe@lemmy.world 2 weeks ago
If you try to sell “The New Adventures of Doctor Strange: Stephen Strange and Magic Man,” existing copyright laws are sufficient and will stop it. Really, training should be regulated by the same laws as reading. If they can get the material through legitimate means, it should be fine.
devfuuu@lemmy.world [bot] 2 weeks ago
That “freely” there is doing a lot of work.
Imgonnatrythis@sh.itjust.works 2 weeks ago
I have a freely accessible document under a CC license that states it is not to be used for commercial purposes. This is commercial use, yet your policy would allow that document to be used, since it is accessible. This kind of policy discourages me from sharing my works easily, as others profit from my efforts and my works are more likely to be attributed to a corporate beast I want nothing to do with than to me.
I’m all for copyright reform and simpler copyright law, but these companies need to be held to standard copyright rules and not just made-up modifications. I’m convinced a perfectly decent LLM could be built without violating copyrights.
I’d also be ok sharing works with a not for profit open source LLM and I think others might as well.
kate@lemmy.uhhoh.com 2 weeks ago
As it already is:
Copies of copyrighted works cannot be regarded as “stolen property” for the purposes of a prosecution under the National Stolen Property Act of 1934.
Artisian@lemmy.world 2 weeks ago
Plaintiffs made that argument, and the judge shot it down pretty hard: competition isn’t what copyright protects against. I would love to hear your thoughts on the ruling (it’s linked by Reuters).
Cort@lemmy.world 2 weeks ago
Orcs and dwarves (with a v) are creations of Tolkien; if the fantasy stories include them, it’s a violation of copyright, the same as including Mickey Mouse.
My argument would have been to ask the AI for the bass line to Queen & David Bowie’s “Under Pressure,” then point to that as a reproduction of copyrighted material. But then again, AI companies probably have better lawyers than Vanilla Ice.
BlameTheAntifa@lemmy.world 2 weeks ago
MyOpinion@lemmy.today 2 weeks ago
I hate AI with a fire that keeps me warm at night. That is all.
Fingolfinz@lemmy.world 2 weeks ago
Pirate everything!
hendrik@palaver.p3x.de 2 weeks ago
Previous discussion: https://programming.dev/post/32802895
gedaliyah@lemmy.world 2 weeks ago
I’m not pirating. I’m building my model.
QuadratureSurfer@lemmy.world 2 weeks ago
To anyone reading this comment without reading through the article: this ruling doesn’t mean that it’s okay to pirate for building a model. Anthropic will still need to go to trial for that.
Artisian@lemmy.world 2 weeks ago
I also read through the judgement, and I think it’s better for Anthropic than you describe. He distinguishes three issues: A) Use any written material they get their hands on to train the model (and the resulting model doesn’t just reproduce the works).
B) Buy a single copy of a print book, scan it, and retain the digital copy for a company library (for all sorts of future purposes).
C) Pirate a book and retain that copy for a company library (for all sorts of future purposes).
A and B were found fair use on summary judgement, meaning this judge thinks it’s clear cut in Anthropic’s favor. C will go to trial.