Comment on This new data poisoning tool lets artists fight back against generative AI
vidarh@lemmy.stad.social 1 year ago
It doesn’t need to be full-on UBI. In a lot of countries, grant mechanisms and public purchasing mechanisms for art already make up a significant proportion of income for artists. Especially in smaller countries, this is very common (more so for literary works, movies and music, where language provides a significant barrier to accessing a bigger audience, but for other art too). Imagine perhaps a tax/compulsory licensing mechanism that doesn’t stop AI training but instead massively expands those funding sources for people whose data are included in training sets.
This is not stoppable, not least because it’s “too cheap” to buy content outright.
I pointed out elsewhere that e.g. OpenAI could buy all of Getty Images for ~2% of their currently estimated market cap based on a rumoured recent cash infusion. Financing vast amounts of works for hire just creates a moat against smaller players, while the big players will still be able to keep improving their models.
As such it will do nothing to protect established artists, so we need expansion of ways to fund artists whether or not inclusion of copyrighted works in training sets becomes restricted.
kayrae_42@lemmy.world 1 year ago
Those grants and public purchases make up a significant portion of income for established mainstream artists. If you work only on commission online, or never went to art school, those won’t cover you.
These large tech companies become so highly valued at the start because of venture capital, and then in 5-10 years collapse under their own weight. How many of these have come up and are now close to drowning after pushing out all competitors? Sorry if I’m not excited about an infusion of cash into a large for-profit company that is just gobbling up anything anyone posts online without consent to make a quick buck.
I’m not against AI. I’m against the ethics of AI at the moment, because they’re awful. AI also leans into the biases it finds, and there is not a lot of oversight of this.
vidarh@lemmy.stad.social 1 year ago
There’s no reason it has to stay like that. And most people in that position are not making a living from art as it is; expanding public funding to cover a large proportion of working artists at a better level than today would cost a pittance.
MS, Apple, Meta, Google etc. are massively profitable. OpenAI is not, but it is sitting on a huge hoard of Microsoft cash. It doesn’t matter that many are close to drowning. The point is that the amount of cash floating around enables the big tech companies to outright buy more than enough content if they have to, so regulation to prevent them from gobbling up anything anyone posts online without consent will not stop them. So that isn’t a solution. It will stop new entrants with little cash, but not the big ones. And even OpenAI can afford to buy up some of the largest content owners in the world.
The point was not to make you excited about that, but to illustrate that fighting a battle to restrict what they can train on is fighting a battle that the big AI companies won’t care if they lose - they might even be better off if they lose, because if they lose, while they’ll need to pay more money to buy content, they won’t have competition from open models or new startups for a while.
So we need to find other solutions, because whether or not we regulate the use of copyrighted works as training data, these models will continue to improve. The cat is out of the bag, and the computational cost of improving these models keeps dropping. We’re also just a few years away from people being able to train models competitive with present-day models on computers within reach of hobbyists, so even if we were to ban these models outright, artists will soon compete with output from them anyway, no matter the legality.
Focusing on the copyright issue is a distraction from ensuring there is funding for art. The former presumes the survival of only one specific model, which doesn’t really work very well even today and which is set to fail irrespective of regulation, while the latter opens up the conversation to a much broader set of options and has at least a chance of providing working possibilities.
kayrae_42@lemmy.world 1 year ago
For one, I don’t see these grants or public funding ever covering a private company. And for two, I don’t see AI art ever actually getting to the point where it fully replaces artists. As of right now it is good, but it doesn’t understand space or lighting at all, and because of how AI works I’m not sure it ever will. It is trained to make a homogeneous rendering of what you are looking for, even if you use a base image. Most images people have are lit heavily from the front, so it is never able to render shadows correctly. Unless they hire artists or art critics to finely tune the data set, which I doubt they will, the more you look, the more uncanny valley the images get. The models also have a hard bias in all of the images they generate, which is difficult to overcome.
AI is an amazing tool, but it is a poor replacement in total. The people who act like it is a total replacement are like the people who in 2015 told us self-driving cars were just one year away, and have been saying it every year since. Maybe when quantum computing becomes the standard for every person AI will be able to. But people seem to have a fundamental misunderstanding of art, the artistic process, and how art gets made.
OpenAI might be sitting on Microsoft money, but how many other companies has Microsoft gobbled up over the years? If OpenAI starts to struggle, it will just fall under the Microsoft umbrella and be integrated into its massive conglomerate. Where are the AR goggles that we were all supposed to be wearing? Microsoft and Google both had those. So many projects grow and die with multiple millions thrown at them. All end up with crazy valuations based on future consumer usage, while we all can’t even afford rent.
There is also this idea that people wouldn’t willingly contribute if just asked. The problem is no one has even asked. Hugging Face is an open platform people willingly contribute to. And so many people upload images under Creative Commons licenses, which could be used. I’ve done it with many of my photos, which I have no problem being used in a data set, for commercial use even. But my commercial images? No, please. As for the idea that you can’t train smaller models on the vast array of Creative Commons and public domain images: you absolutely can. You can also ask people to contribute to your data set and give credit to them. A lot of people are angry at the lack of credit.
There is no reason for any of this to be private enterprise if they are going to blatantly steal copyrighted images when sources like Creative Commons exist, not give any credit to the people they steal from, and sometimes even steal from places they shouldn’t even have access to.
vidarh@lemmy.stad.social 1 year ago
Companies are by far the largest recipients of public funding for art in many countries and sectors. Especially for e.g. movie production in smaller languages, but also in other sectors.
I do agree it won’t fully replace artists, but not because it won’t get to the point where it can be better than everyone, but because a huge part of art is provenance. A “better Mona Lisa” isn’t worth anything, while the original is priceless, not because a “better” one isn’t possible, but because it’s not painted by Da Vinci.
But that will only help an even narrower sliver than the artists who are making good money today.
It will take time, but AI will eat far more fields than art, and we haven’t even started to see the fallout yet.
Diffusion models are not trained “for” anything other than denoising guided by vectors, until the output matches what you are looking for to within your own tolerance levels. Accordingly, you’ll see a whole swathe of models tuned on more specific types of imagery, and tooling to more precisely control what they generate. The “basic” web interfaces are just scratching the surface of what you can do with e.g. ControlNet and the like. It will take time before they get good enough, sure. They are also only 2 years old, and people have only been working on tooling around them for much less than that.
OpenAI is just one of many in this space already. They are in the lead for LLMs, that is, text-based models, but even that lead is rapidly eroding. They don’t have any obvious lead in diffusion models for images. Having used several, I found it was only with the recent release of DALL·E 3 that theirs got “good enough” to be competitive.
At the same time there are now open models getting close enough to be useful, so even if every AI startup in the world collapsed this won’t go away.
That’s fine, but that doesn’t fix the financial challenge.