Comment on Japanese Government Calls on Sora 2 Maker OpenAI to Refrain From Copyright Infringement, Says Characters From Manga and Anime Are 'Irreplaceable Treasures' That Japan Boasts to the World

tal@lemmy.today 1 week ago

So, the “don’t use copyrighted data in a training corpus” crowd probably isn’t going to win the IP argument. And I would be quite surprised if IP law changes to accommodate them.

However, the “don’t generate infringing material” argument is a whole different story. IP holders are on pretty solid ground there. One thing I am very certain IP law is not going to permit is just passing copyrighted data into a model and then generating material that would otherwise be infringing. I understand that anime has something of a tradition of sometimes letting fan-created material slide, but if generative AI massively lowers the bar to creating content, I suspect that is likely to change.

Right now, you have generative AI companies saying — maybe legally plausibly — that they aren’t the liable ones if a user generates infringing material with their model.

And while you can maybe go after someone who is outright generating and selling material that is infringing, something doesn’t have to be commercially sold to be infringing. Like, if Lucasfilm wants to block for-fun fan art of Luke and Leia and Han, they can do that.

One issue is attribution. Like, generative AI companies are not lying when they say that there isn’t a great way to just “reverse” an output and work out which training corpus data contributed most to it.

However, I am also very confident that it is possible to do better than they do today. From a purely black-box standpoint, one possibility would be, for example, to use TinEye-style fuzzy hashing of images: hash each generated image, probably with a fuzzier hash than TinEye uses, and do a reverse lookup against known works to warn a user that they might be generating an image that would be derivative. That won’t solve all cases, especially if you do 3D vision and generative AI producing models (though then you could also maybe do computer vision and a TinEye-equivalent for 3D models).
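
To make the fuzzy-hash idea concrete, here's a minimal sketch of an "average hash", one common perceptual hash. I don't know what TinEye actually uses internally, so treat this as an illustration of the general technique, not their method. The function names and the `max_distance` threshold are made up for the example, and it works on a plain grid of grayscale values so that it runs without any image library.

```python
def average_hash(pixels, hash_size=8):
    """Return a hash_size*hash_size-bit average hash of a grayscale image.

    pixels is a list of rows of 0-255 intensities; both dimensions
    must be at least hash_size.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Shrink the image to a hash_size x hash_size grid by block averaging.
    for cy in range(hash_size):
        y0, y1 = cy * height // hash_size, (cy + 1) * height // hash_size
        for cx in range(hash_size):
            x0, x1 = cx * width // hash_size, (cx + 1) * width // hash_size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    # One bit per cell: is it brighter than the overall mean?
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def looks_derivative(candidate, known_hashes, max_distance=10):
    """Hypothetical check: is the candidate near any known work's hash?

    max_distance is an arbitrary illustrative threshold; raising it
    makes the match fuzzier.
    """
    h = average_hash(candidate)
    return any(hamming_distance(h, k) <= max_distance for k in known_hashes)
```

A service could run something like `looks_derivative(generated, known_hashes)` on each output and show a warning instead of blocking outright. A hash this simple is easy to fool with crops or rotations, so a real system would presumably need something sturdier.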

Another complicating factor is that copyright only restricts distribution of derivative works. I can make my own, personal art of Leia all I want. What I can’t do is go distribute it. I think — though I don’t absolutely know what case law is like for this, especially internationally — that generating images on hardware at OpenAI or whatever and then having them move to me doesn’t count as distribution. Otherwise, software-as-a-service in general, stuff like Office 365, would have major restrictions on working with IP that locally-running software would not. Point is that I expect that it should be perfectly legal for me to go to an image generator and generate material as long as I do not subsequently redistribute it, even if it would be infringing had I done so. And the AI company involved has no way of knowing what I’m doing with the material that I’m generating. If they block me from making material with Leia, that’s an excessively-broad restriction.

But IP holders are going to want to have a practical route to go after either the generative AI company whose output gets distributed, or the users generating infringing material and then distributing it. Yeah, they could go after the users before, but if it’s a lot cheaper and easier to create the material now, that presents them with practical problems.

And in that vein, an issue that I haven’t seen come up is what happens if generative AI companies start permitting deterministic generation of content, that is, where if I plug in the same inputs, I get the same outputs. Maybe they already do; I don’t know, since I run my gen AI stuff locally. But supposing you have a scenario like this:

Now, maybe training the model on images of Star Wars content so that it knows what Star Wars looks like isn’t creating an infringing work. Maybe distributing the model that knows about Star Wars isn’t infringement. Maybe the prompts being distributed, designed to run against that model, are not infringing. Maybe reconstituting the apparently-Star-Wars images in a deterministic fashion is not infringing. But if the net effect is equivalent to distributing an infringing work, my suspicion is that courts are going to be willing to create some kind of legal doctrine that restricts it, if they haven’t already.

Now, this situation is kind of contrived, but I expect that people will do it, sooner or later, absent legal restrictions.
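
For what I mean by deterministic generation, here's a toy sketch. `toy_generate` is a made-up stand-in for a real image generator, and the model name, prompt, and seed are placeholders. The only point is that when the output is a pure function of (model, prompt, seed), handing someone those three inputs is functionally the same as handing them the output. Real local diffusion setups generally take a seed too, though I wouldn't count on bit-identical results across different hardware.

```python
import hashlib
import random


def toy_generate(model_id, prompt, seed, size=16):
    """Stand-in for an image generator: same inputs give the same bytes."""
    # Derive one integer from all of the inputs, then use it to seed a PRNG.
    key = f"{model_id}\n{prompt}\n{seed}".encode("utf-8")
    rng = random.Random(int.from_bytes(hashlib.sha256(key).digest(), "big"))
    # The "image" here is just size pseudo-random bytes.
    return bytes(rng.randrange(256) for _ in range(size))


# Whoever publishes the recipe runs this once...
original = toy_generate("some-model", "some prompt", seed=1234)
# ...and anyone with the same model, prompt, and seed reconstitutes it exactly.
copy = toy_generate("some-model", "some prompt", seed=1234)
print(original == copy)
```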
