The New York Times sues OpenAI and Microsoft for copyright infringement::The New York Times has sued OpenAI and Microsoft for copyright infringement, alleging that the companies’ artificial intelligence technology illegally copied millions of Times articles to train ChatGPT and other services to provide people with information – technology that now competes with the Times.
My question is: how is an AI reading a bunch of articles any different from a human doing it? By this logic, no one could legally write an article, since they would be drawing on bits of other people’s work that they read while learning how to write a good article.
They are both making money with parts of other people’s work.
phoneymouse@lemmy.world 10 months ago
There is something wrong when search and AI companies extract all of the value produced by journalism for themselves. Sites like Reddit and Lemmy also have this issue. I’m not sure what the solution is. I don’t like the idea of a web full of paywalls, but I also don’t like the idea of all the profit going to the ones who didn’t create the product.
kromem@lemmy.world 10 months ago
What’s the value of old journalism?
It’s a product where the value curve is heavily weighted towards recency.
In theory, the greatest value theft is when the AP writes a piece and two dozen other ‘journalists’ copy the thing, changing the text just enough not to get sued. That is completely legal, but it is what effectively killed investigative journalism.
An LLM taking years-old articles and predicting them until it can effectively learn relationships between language itself and the events described in those articles isn’t some inherent value theft.
It’s not the training that’s the problem, it’s the application of the models that needs policing.
Like if someone took an LLM, fed it recently published news stories, and had it rewrite them just differently enough that no one needed to visit the original publisher.
Even if we keep it legal for humans to do that (which we might really want to revisit, or at least create an industry-specific restriction for), maybe we should have different rules for the models.
But claiming that an LLM that lets coma patients communicate, helps solve problems in self-driving algorithms, or diagnoses medical issues is stealing the value of old NYT articles in doing so is not really an argument I see much value in.
jacksilver@lemmy.world 10 months ago
Except no one is claiming that LLMs are the problem; they’re claiming GPT, or more specifically GPT’s training data, is the problem. Transformer models still have a lot of potential, but the question the NYT is asking is “can you just take anyone else’s work to train them?”
ChucklesMacLeroy@lemmy.world 10 months ago
Really gave me a whole new perspective. Thanks for that.
AllonzeeLV@lemmy.world 10 months ago
Should… should we tell him?
kilgore_trout@feddit.it 10 months ago
Tell them instead of mocking them.
Yes, “that’s how the world works”. But that doesn’t mean we should stop trying to change it.
Kecessa@sh.itjust.works 10 months ago
The solution is to impose on these companies the responsibility of tracking profit per media outlet, then tax them and redistribute that money based on the tracking info. They’re able to track all the pages you visit, so it’s complete bullshit when they say they don’t know how much they make from each place their ads are displayed.
Boiglenoight@lemmy.world 10 months ago
AI training is piracy by another name.
uriel238@lemmy.blahaj.zone 10 months ago
Elaborate. Consumption of copyrighted materials is normal use whether by a human or a machine.
DogWater@lemmy.world 10 months ago
AI isn’t creating the product. It consumed it.