A New York Times copyright lawsuit could kill OpenAI::A list of authors and entertainers are also suing the tech company for damages that could total in the billions.
I doubt it. It would likely kill any non-giant-tech-backed AI companies, though.
Submitted 10 months ago by L4s@lemmy.world [bot] to technology@lemmy.world
If OpenAI owns a Copyright on the output of their LLMs, then I side with the NYT.
If the output is public domain (that is, you or I could use it commercially without OpenAI’s permission), then I side with OpenAI.
Sort of like how a spell checker works. The dictionary is Copyrighted, the spell check software is Copyrighted, but using it on your document doesn’t grant the spell check vendor any Copyright over it.
I think this strikes a reasonable balance between creators’ IP rights, AI companies’ interest in expansion, and the public interest in having these tools at our disposal. So, in my scheme, either creators get a royalty, or the LLM company doesn’t get to Copyright the outputs. I could even see different AI companies going down different paths and offering different kinds of service based on that distinction.
I want people to take my code if they share their changes (gpl). Taking and not giving back is just free labor.
I think it currently resides with the one doing the generation and not openAI itself. Officially it is a bit unclear.
Hopefully, all gens become copyleft, if only because AIs tend to repeat themselves. Specific faces will pop up quite often in image gen, for example.
If LLMs like ChatGPT are allowed to produce non-copyrighted work after being trained on copyrighted work, you can effectively use them to launder copyright, which would be equivalent to abolishing it at the limit.
A much more elegant and equitable solution would be to just abolish copyright outright. It’s the natural direction of a country that chooses to invest in LLMs anyways.
The NYT has a market cap of about $8B. MSFT has a market cap of about $3T. MSFT could take a controlling interest in The Times for the change it finds in the couch cushions. I’m betting a good chunk of the c-suites of the interested parties have higher personal net worths than the NYT has in market cap.
I have mixed feelings about how generative models are built and used. I have mixed feelings about IP laws. I think there needs to be a distinction between academic research and for-profit applications. I don’t know how to bring the laws into alignment on all of those things.
But I do know that the interested parties who are developing generative models for commercial use, in addition to making their models available for academics and non-commercial applications, could well afford to properly compensate companies for their training data.
The danger of the rich and evil simply buying out their critics is a genuine risk. After all, it’s what happened to Gawker when Peter Thiel decided he personally didn’t like them, neutering their entire network.
Regarding OpenAI the corporation, they pulled an incredibly successful bait and switch, pretending first to gather data for educational purposes, and then switching to being a for-profit as soon as it benefited them. In a better world or even a slightly more functional American democracy, their continued existence would be deemed inexcusable.
Or Musk when he decided he didn’t like what people were saying on Twitter.
I completely agree. I don’t want them to buy out the NYT, and I would rather move back to the laws that prevented over-consolidation of the media. I think that Sinclair and the consolidated talk radio networks represent a very real source of danger to democracy. I think we should legally restrict the number of markets a particular broadcast company can be in, and I also believe that we can and should come up with an argument that’s the equivalent of the Fairness Doctrine that doesn’t rest on something as physical and mundane as the public airwaves.
Oh no, how terrible. What ever will we do without Shenanigans Inc. 🙄
YES! AI is cool I guess, but the massive AI circlejerk is so irritating though.
If OpenAI can infringe upon all the copyrighted material on the net then the internet can use everything of theirs all for free too.
This would raise the cost of entry for making a model and nothing more. OpenAI will buy the data if they have to, and so will Google. The money will only go to the owners of The New York Times and its shareholders; none of the journalists who will be let go in the coming years will see a dime.
We must keep the entry into the AI game as low as possible or the only two players will be Microsoft and Google. And as our economy becomes increasingly AI driven, this will cement them owning it.
Pragmatism or slavery, these are the two options.
LWD@lemm.ee is deleting his comments and reposting the same comment to dodge replies. Link to the last thread.
He’s not arguing for OpenAI, but for the rest of us. AI is a public technology, but we’re on the verge of losing our ability to participate due to things like this lawsuit and the megacorps’ attempts at regulatory capture. Which they might just get. Their campaign against AI is a lot like their attempts to destroy encryption. Support open-source development; it’s our only chance. Their AI will never work for us. John Carmack put it best.
Fuck "Open"AI, and fuck Microsoft. Pragmatism or slavery.
I was wondering when the OpenAI evangelists were going to show up. The last time you talked about OpenAI, you played devil’s advocate for it.
And this time… You’re playing devil’s advocate for them again.
Did you delete your last comment that people replied to so you could repost it without the replies? Link to the last thread.
Oh no. Anyways.
inshallah
We hold ourselves back for no reason. This stuff doesn’t matter, AI is the future and however we get there is totally fine with me.
AI without proper regulation could be the downfall of humanity. Many pros, but the cons may outweigh them. Opinion.
AI development will not be hamstrung by regulations. If governments want to “regulate” (aka kill) AI, then AI development in their jurisdiction will move elsewhere.
Don’t threaten me with a good time!
Oh great more Lemmy anti technology circlejerking
Is there a possible way that both the NYT and OpenAI could lose?
The problem with copyright is that everything is automatically copyrighted. The copyright symbol is purely symbolic at this point. Both sides are technically right, even though the courts have ruled that anything an AI outputs is actually in the public domain.
Works involving the use of AI are copyrightable. Also, the Copyright Office’s guidance isn’t law. Their guidance reflects only the office’s interpretation based on its experience; it isn’t binding on the courts or other parties. Guidance from the office is not a substitute for legal advice, and it does not create any rights or obligations for anyone. They are the lowest rung on the ladder for deciding what the law means.
I never thought that the AI-driven apocalypse could be impeded by a simple lawsuit. And, yet, here we are.
One has to wonder why in Star Trek the Federation did not simply sue the Borg.
Hahahahahahaha hahahahahahaha omg, thank you for the very real, actual laugh-out-loud moment.
Now I’m envisioning Picard on one side, the Borg Queen on the other, and, what, Q as judge, looking older by the minute, just hating life.
Well, that comes down to the particular venue. Who’s going to rule? The Kardassians??
This is the best summary I could come up with:
Late last year, the New York Times sued OpenAI and Microsoft, alleging that the companies are stealing its copyrighted content to train their large language models and then profiting off of it.
Meanwhile, the Senate Judiciary Subcommittee on Privacy, Technology, and Law held a hearing in which news executives implored lawmakers to force AI companies to pay publishers for using their content.
In its rebuttal, OpenAI said that regurgitation is a “rare bug” that the company is “working to drive to zero.” It also claims that the Times “intentionally manipulated prompts” to get this to happen and “cherry-picked their examples from many attempts.”
A growing list of authors and entertainers have been filing lawsuits since ChatGPT made its splashy debut in the fall of 2022, accusing these companies of copying their works in order to train their models.
Developers have sued OpenAI and Microsoft for allegedly stealing software code, while Getty Images is embroiled in a lawsuit against Stability AI, the makers of image-generating model Stable Diffusion, over its copyrighted photos.
In that 2013 decision, Judge Chin said its technology “advances the progress of the arts and sciences, while maintaining respectful consideration for the rights of authors and other creative individuals, and without adversely impacting the rights of copyright holders.” And a 2023 economics study of the effects of Google Books found that “digitization significantly boosts the demand for physical versions” and “allows independent publishers to introduce new editions for existing books, further increasing sales.” So consider that another point in favor of giving tech platforms room to innovate.
The original article contains 1,628 words, the summary contains 259 words. Saved 84%. I’m a bot and I’m open source!
Fingers crossed!
good
makyo@lemmy.world 10 months ago
I always say this when this comes up because I really believe it’s the right solution - any generative AI built with unlicensed and/or public works should then be free for the public to use.
If they want to charge for access, that’s fine, but they should have to go about securing legal rights first. If that’s impossible, they should pursue profits some other way, maybe with add-ons such as internet-connected AI and so forth.
dasgoat@lemmy.world 10 months ago
Running AI isn’t free, and AI calculations pollute like a motherfucker
This isn’t me saying you’re wrong from an ethical or judicial standpoint, because on those I agree. It’s just that, on a practical level, considerations have to be made.
For me, those considerations alone (and a ton of other considerations such as digital slavery, child porn etc) make me just want to pull the plug already.
AI was fun. It’s a dumb idea for dumb buzzword spewing silicon valley ghouls. Pull the plug and be done with it.
seliaste@lemmy.blahaj.zone 10 months ago
The thing is that those models aren’t even open source. If they were, you could argue that OpenAI’s business model is renting processing power. But they’re not, so their business model is effectively selling models trained on copyrighted data.
fidodo@lemmy.world 10 months ago
There’s plenty of money to be made providing infrastructure. Lots of companies make a ton of money providing infrastructure for open source projects.
On another note, why is OpenAI even called “open”?
ItsMeSpez@lemmy.world 10 months ago
It’s because of the implication…
Pacmanlives@lemmy.world 10 months ago
Not really how it works these days. Look at Uber and Lime/Bird scooters. They’d basically just show up to a city and say, the hell with the law, we’re starting our business here. We just call it “disruptive technology.”
makyo@lemmy.world 10 months ago
Unfortunately true, and the long arm of the law, at least in the business world, isn’t really that long. Would love to see some monopoly busting to scare a few of these big companies into shape.
miridius@lemmy.world 10 months ago
Nice idea but how do you propose they pay for the billions of dollars it costs to train and then run said model?
nexusband@lemmy.world 10 months ago
Then don’t do it. Simple as that.
Smoogs@lemmy.world 10 months ago
Defending scamming as a business model is not a business model.
Drewelite@lemmynsfw.com 10 months ago
A very compelling solution! It allows a model of free use while providing an avenue for businesses to spend time developing it.
canihasaccount@lemmy.world 10 months ago
Would you, after devoting full years of your adult life to the unpaid work of learning the requisite advanced math and computer science needed to develop such a model, like to spend years more of your life to develop a generative AI model without compensation? Within the US, it is legal to use public text for commercial purposes without any need to obtain a permit. Developers of such models deserve to be paid, just like any other workers, and that doesn’t happen unless either we make AI a utility (or something similar) and funnel tax dollars into it or the company charges for the product so it can pay its employees.
I wholeheartedly agree that AI shouldn’t be trained on copyrighted, private, or any other works outside of the public domain. I think that OpenAI’s use of nonpublic material was illegal and unethical, and that they should be legally obligated to scrap their entire model and train another one from legal material. But developers deserve to be paid for their labor and time, and that requires the company that employs them to make money somehow.
thecrotch@sh.itjust.works 10 months ago
No. I wouldn’t want to write a kernel from scratch for free either. But Linus Torvalds did. He even found a way to monetize it without breaking any laws.
adrian783@lemmy.world 10 months ago
then openai should close its doors
ExLisper@linux.community 10 months ago
Also anything produced with solar power should be free.
Prok@lemmy.world 10 months ago
Yes, good point, resource collection is nearly identical to content generation
poopkins@lemmy.world 10 months ago
What is unlicensed work? Copyrighted content will not have a licence agreement but this doesn’t mean you can freely infringe on copyright law.
makyo@lemmy.world 10 months ago
By unlicensed I mean works that haven’t been licensed, i.e. anything being used without permission or some other right.
h3rm17@sh.itjust.works 10 months ago
Stuff like public-domain books, I guess, like Alice in Wonderland, and CC0 content.