Comment on Meta admits using pirated books to train AI, but won't pay for it

SnotFlickerman@lemmy.blahaj.zone 10 months ago

nytimes.com/…/openai-new-york-times-lawsuit.html

In its lawsuit Wednesday, the Times accused Microsoft and OpenAI of creating a business model based on “mass copyright infringement,” stating that the companies’ AI systems were “used to create multiple reproductions of The Times’s intellectual property for the purpose of creating the GPT models that exploit and, in many cases, retain large portions of the copyrightable expression contained in those works.”

Publishers are concerned that, with the advent of generative AI chatbots, fewer people will click through to news sites, resulting in shrinking traffic and revenues.

The Times included numerous examples in the suit of instances where GPT-4 produced altered versions of material published by the newspaper.

In one example, the filing shows OpenAI’s software producing almost identical text to a Times article about predatory lending practices in New York City’s taxi industry.

But in OpenAI’s version, GPT-4 excludes a critical piece of context about the sum of money the city made selling taxi medallions and collecting taxes on private sales.

In its suit, the Times said Microsoft and OpenAI’s GPT models “directly compete with Times content.”

If the New York Times’ evidence holds up, then you can recreate copyrighted works with LLMs. That means the companies behind them are doing the same thing as the Pirate Bay: distributing copyrighted works without authorization and making money off the venture.

So far, no ISPs are blocking Meta for this.
