Comment on OpenAI claims The New York Times tricked ChatGPT into copying its articles
TWeaK@lemm.ee 10 months ago
Whether or not they “instructed the model to regurgitate” articles, the fact is it did so, which is still copyright infringement either way.
gmtom@lemmy.world 10 months ago
No, not really. If you use Photoshop to recreate a copyrighted artwork, who is infringing the copyright, you or Adobe?
TWeaK@lemm.ee 10 months ago
You are. The person who made or sold a gun isn’t liable for the murder of the person that got shot.
The difference is that ChatGPT is not Photoshop. Photoshop is a tool that a person controls absolutely. ChatGPT is “artificial intelligence”: it does its own “thinking” and interprets the instructions a user gives it.
Copyright infringement is decided based on the similarity of the works. That is the established method, and it would be applied here.
OpenAI infringe copyright twice. First, on their training dataset, which they claim is “research” - it is in fact development of a commercial product. Second, their commercial product infringes copyright by producing near-identical work. Even though its dataset doesn’t include the full work of Harry Potter, it still manages to write Harry Potter. If a human did the same thing, even if they honestly and genuinely thought they were presenting original ideas, they would still be guilty. This is no different.
gmtom@lemmy.world 10 months ago
Only if they publish or sell it. Which is why OpenAI isn’t/shouldn’t be liable in this case.
If you write out the entire Harry Potter series from memory, you are not breaking any laws just by doing so. Same as if you use Photoshop to reproduce a copyrighted work.
So because they publish the tool, not the actual content, OpenAI isn’t breaking any laws either. It’s much the same way that torrent engines are legal despite what they are used for.
There is also some more direct precedent for this. There is a website called “Library of Babel” that has used some clever maths to publish every combination of characters up to 3200 characters long. That contains, by definition, anything below that limit that is copyrighted, and in theory you could piece together the entire Harry Potter series from that website 3k characters at a time. And that is safe under copyright law.
The same goes for making a program that generates digital pictures where all the pixels are set randomly. That program, given enough time and luck, is capable of generating any copyrighted image, and can generate photos of sensitive documents or nudes of celebrities, but it is also safe under copyright law, regardless of how closely its output matches the copyrighted material. If the person using the program publishes those pictures, that’s a different story, much like someone publishing a NYT article generated by GPT would be liable.
TWeaK@lemm.ee 10 months ago
Actually you are infringing copyright. It’s just that a) catching you is very unlikely, and b) there are no damages to make it worthwhile.
You don’t have to be selling things to infringe copyright. Selling makes it worse, and makes it easier to show damages (loss of income), but it isn’t a requirement. Copyright is absolute: if I write something and you copy it, you are infringing on my absolute right to dictate how my work is copied.
In any case, OpenAI publishes its answers to whoever is using ChatGPT. If someone asks it something and it spits out someone else’s work, that’s copyright infringement.
It isn’t safe, it’s just not been legally tested. Just because no one has sued for copyright infringement doesn’t mean no infringement has occurred.