Microsoft, OpenAI sued for copyright infringement by nonfiction book authors in class action claim
The new copyright infringement lawsuit against Microsoft and OpenAI comes a week after The New York Times filed a similar complaint in New York.
I wish the protections placed on corporate control of cultural and intellectual assets were placed on the average person's privacy instead.
Like, I really don't care that someone's publicly available books and movies from the last century are analysed and used to create tools, but I do care that, without people's actual knowledge, an intense surveillance apparatus is being built to collect every minute piece of data about their lives and the lives of those around them, to be sold without ethical oversight or consent.
IP is bull, but privacy is a real concern. No one is going to use an extra copy of a NY Times article to hurt someone, but surveillance is used by authoritarians to oppress and harass innocent people.
bassomitron@lemmy.world 10 months ago
I’m not a huge fan of Microsoft or even OpenAI by any means, but all these lawsuits just seem so… greedy?
It isn't like ChatGPT is just spewing out the entirety of their works in a single chat. In that context, I fail to see how snippets of said work returned in a Google summary are any different from ChatGPT or any other LLM doing the same.
Should OpenAI and other LLM creators use ethically sourced data in the future? Absolutely. But to me, rich chumps like George R. R. Martin complaining that their data was stolen without their knowledge and profited off of just feels a little ironic.
Welcome to the rest of the 6+ billion people on the Internet who’ve been spied on, data mined, and profited off of by large corps for the last two decades. Maybe regulators should’ve put tougher laws and regulations in place long ago to protect all of us against this sort of shit. It’s not like deep learning models are anything new.
patatahooligan@lemmy.world 10 months ago
You are misrepresenting the issue. The issue here is not whether a tool merely happens to be usable for copyright infringement in the hands of a malicious entity. The issue here is whether LLM outputs are just derivative works of their training data. You can't compare this to tools like pencils and PCs, which are much more general purpose and which are not built on stolen copyrighted works. Notice also how AI companies bring up "fair use" in their arguments. This means they are not arguing that they aren't using copyrighted works without permission, nor that the output of the LLM contains no copyrighted part of its training data (they can't argue that, because you can't trace the flow of data through an LLM), but rather that their use of the works is novel enough to be an exception. And that is a really shaky argument when their services are actually not novel at all. In fact, they are designing services that are as close as possible to the services provided by the original work creators.
bassomitron@lemmy.world 10 months ago
I disagree, and if I'm misrepresenting the issue, then I feel like you are equally misrepresenting it. LLMs can do far more than simply write stories. They can write stories, but that is just one capability among many.
I’m not a lawyer or legal expert, I’m just giving a layman’s opinion on a topic. I hope Sam Altman and his merry band get nailed to the wall, I really do. It’s going to be a clusterfuck of endless legal battles for the foreseeable future, especially now that OpenAI isn’t even pretending to be nonprofit anymore.
grue@lemmy.world 10 months ago
If I want to be able to argue that having any copyleft stuff in the training dataset makes all the output copyleft – and I do – then I necessarily have to also side with the rich chumps as a matter of consistency. It’s not ideal, but it can’t be helped. ¯\_(ツ)_/¯
General_Effort@lemmy.world 10 months ago
Wait. I first thought this was sarcasm. Is this sarcasm?
wewbull@feddit.uk 10 months ago
In your mind are the publishers the rich chumps, or Microsoft?
For copyleft to work, copyright needs to be strong.
LWD@lemm.ee 10 months ago
I welcome a lawsuit from any content creator who has enough money to put into it. That protects all content creators, especially the ones who can't afford lawyers, from being exploited by giant corporations.
Does anybody think, for a moment, that the average person who creates art as a side job, who lives paycheck to paycheck, should be the one to fight massive plagiaristic megacorporations like OpenAI? That the battle between those who create and those who take should be fought on the most uneven grounds possible?
Womble@lemmy.world 10 months ago
It's wild to me how so many people seem to have got it into their heads that cheering for the IP laws that corporations fought so hard for is somehow left wing and sticking up for the little guy.
General_Effort@lemmy.world 10 months ago
Sure. Trickle-down FTW.
CosmoNova@lemmy.world 10 months ago
I hear those kinds of arguments a lot, though usually from the exact same people who claimed nobody would be convicted of fraud for NFT and crypto scams when those were at their peak. The days of the wild west internet are long over.
Theft in the digital space is a very real thing in the eyes of the law, especially when it comes to copyright infringement. It's wild to me how many people seem to think Microsoft will just get a freebie here because they helped pioneer a new technology for personal gain. Copyright holders have a very real case here, and I'd argue even a strong one.
Even using user data (which they legally own) for machine learning could get them into trouble in some parts of the developed world, because users 10 years ago couldn't have anticipated it would be used that way and so never gave full consent for it.
LWD@lemm.ee 10 months ago
Bit odd how openly hostile to consent all the fans of OpenAI and other mega-corporations are.
General_Effort@lemmy.world 10 months ago
Where, for example?
FreeFacts@sopuli.xyz 10 months ago
Just because it was available on the public internet doesn't mean it was available legally. Google has a way to remove it from their index when asked, while it seems that OpenAI has no way to do so (or no will to do so).
LWD@lemm.ee 10 months ago
The SFWA has actually talked about this: when they made their books more accessible, they became easier to scrape.