Comment on DeviantArt’s Downfall Is Devastating, Depressing, and Dumb
haywire7@lemmy.world 5 months ago
AI-generated content is great and all, but it drowns out everything else on there. Anyone can type a prompt and generate a great-looking image with a couple of attempts these days, it seems.
The people spending days, weeks, months and more on a piece can’t keep up.
lurch@sh.itjust.works 5 months ago
There’s some stuff image-generating AI just can’t do yet; it just can’t understand some things. A big problem seems to be referring to the picture itself, like its position or its border. Another problem is combining things that usually don’t belong together, like a skin of sky. Those are things a human artist/designer does with ease.
tal@lemmy.today 5 months ago
There’s some stuff image-generating AI just can’t do yet
There’s a lot.
Some of it doesn’t matter for certain things. And some of it you can work around. But try creating something like a graphic novel with Stable Diffusion, and you’re going to quickly run into difficulties. You probably want to display a consistent character from different angles – that’s pretty important. That’s not something that a fundamentally 2D-based generative AI can do well.
admin@lemmy.my-box.dev 5 months ago
I think creating a LoRA for your character would help in that case. Not really easy to do as of yet, but technically possible, so it’s mostly a UX problem.
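A minimal sketch of what using such a character LoRA might look like with Hugging Face’s diffusers library – the base checkpoint is a real model ID, but the LoRA file and the “mychar” trigger word are placeholder assumptions, not a real release:

```python
# Sketch: apply a character LoRA on top of Stable Diffusion.
# "my_character_lora.safetensors" and the "mychar" trigger word are
# hypothetical; substitute a LoRA actually trained on your character.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # base model
    torch_dtype=torch.float16,
).to("cuda")

# Load LoRA weights trained on images of one specific character.
pipe.load_lora_weights(".", weight_name="my_character_lora.safetensors")

image = pipe(
    "mychar standing on a rooftop at night, comic style",
    num_inference_steps=30,
).images[0]
image.save("character.png")
```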
tal@lemmy.today 5 months ago
I think creating a LoRA for your character would help in that case.
A LoRA is good for replicating a style where there’s existing material; it helps add training data for a particular subject. There are problems that existing generative AIs smack into that it’s good at fixing. But it’s not a cure-all for all the limitations of such systems. The problem there is kinda fundamental to how the systems work today – it’s not a lack of training data, but simply how the system deals with the world.
The problem is that the diffusion-based systems today think of the world as a series of largely decoupled 2D images, linked only by keywords. A human artist thinks of the world as 3D, can visualize something – maybe using a model to help with perspective – and then render it.
So, okay. If you want to create a facial portrait of a kinda novel character, that’s something that you can do pretty well with AI-based generators.
But now try and render that character you just created from ten different angles, in unique scenes. That’s something that a human is pretty good at:
[example comic-book page]
Like, try reproducing that page in Stable Diffusion, with the same views. Even if you can eventually get something even remotely approximating it, a traditional human comic artist is going to be a lot faster at it than someone sitting in front of a Stable Diffusion box.
Is it possible to make some form of art generator that can do that? Yeah, maybe. But it’s going to have to have a much more sophisticated “mental” model of the world – a 3D one – and solid 3D computer vision to be able to reduce scenes to 3D. And while people are working on that, it has its own extensive set of problems.

Look at your training set. The human artist slightly stylized things or made errors that human viewers can ignore pretty easily, but that a computer vision model – one that doesn’t work exactly like human vision and the mind – might go into conniptions over. For example, look at the fifth panel there. The artist screwed up – the ship slightly overlaps the dock, right above the “THWIP”. A human viewer probably wouldn’t notice or care. But if you have some kind of computer vision system that looks for line intersections to determine relative 3D positioning – something that we do ourselves – it can very easily look at that image and have no idea what the hell is going on there.
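To make the line-intersection idea concrete, here’s a rough illustrative sketch (not anyone’s production code) of a naive junction detector using OpenCV – the edge and Hough thresholds are arbitrary assumptions. A panel with a “wrong” overlap like the ship/dock one just produces junctions that don’t fit any consistent 3D reading:

```python
# Rough illustration: detect edge segments in a comic panel and find
# where they cross. Spurious crossings (like the ship/dock overlap)
# show up as junctions that contradict a consistent 3D interpretation.
import cv2
import numpy as np

img = cv2.imread("panel.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 50, 150)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                        minLineLength=40, maxLineGap=5)

def intersection(l1, l2):
    """Return the crossing point of two segments, or None."""
    x1, y1, x2, y2 = l1
    x3, y3, x4, y4 = l2
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < 1e-9:
        return None  # parallel segments never cross
    t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / denom
    u = ((x1 - x3) * (y1 - y2) - (y1 - y3) * (x1 - x2)) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None

junctions = []
if lines is not None:
    segs = [l[0] for l in lines]
    for i in range(len(segs)):
        for j in range(i + 1, len(segs)):
            p = intersection(segs[i], segs[j])
            if p is not None:
                junctions.append(p)

print(f"found {len(junctions)} candidate junctions")
```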
The proportions aren’t exactly consistent from frame to frame, don’t perfectly reflect reality, and might be more effective at conveying movement or whatever than an actual rendering of a 3D model would be. That works for human viewers. And existing 2D systems can kind of dodge the problem (as long as they’re willing to live with the limitations that intrinsically come with a 2D model) because they’re looking at a bunch of already-stylized images.

But now imagine that they’re trying to take images, reduce them into a coherent 3D world, and then learn to re-apply stylization. That may involve creating not just a 3D model, but enough understanding of the objects in that world to know what stylization is reasonable, and when. Is it technically possible? Probably. But is it a minor effort to get there from here? No, probably not. You’re going to have to make a system that works wildly differently from the way the existing systems do – even though what you’re trying to do might seem small from the standpoint of a human observer: just being able to get arbitrary camera angles on the scene being rendered.
hagelslager@feddit.nl 5 months ago
I think Corridor Digital made an AI-animated film by hiring an illustrator (after an earlier attempt with a general dataset) to “draw” still frames from video of the lead actors, with Stable Diffusion generating the in-betweens.
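The core trick there is image-to-image generation over individual video frames. A minimal sketch of that idea using the diffusers library – the paths, prompt, and strength value are illustrative assumptions, not Corridor’s actual pipeline:

```python
# Sketch: stylize extracted video frames one at a time with img2img,
# keeping the actor's pose while adopting the prompted style.
import glob
import os

import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

os.makedirs("stylized", exist_ok=True)
for i, path in enumerate(sorted(glob.glob("frames/*.png"))):
    frame = Image.open(path).convert("RGB").resize((512, 512))
    out = pipe(
        prompt="anime style, clean lineart",
        image=frame,
        strength=0.45,  # how far the output may deviate from the frame
        num_inference_steps=30,
    ).images[0]
    out.save(f"stylized/{i:05d}.png")
```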
anlumo@lemmy.world 5 months ago
It’s even hard, if not impossible, to generate an image of a person doing a handstand. All models assume a right-side-up person.
Even_Adder@lemmy.dbzer0.com 5 months ago
This hasn’t been true for months at least. You really have to check week to week when dealing with things in this field.
100@fedia.io 5 months ago
Think of an episode of any animated series with countless handmade backgrounds. Good luck generating those with any sort of consistency or accuracy – you’ll be calling for an artist who can actually take instructions and iterate.
yildolw@lemmy.world 5 months ago
We’ll soon be hearing that only Luddites care about continuity errors
admin@lemmy.my-box.dev 5 months ago
Almost 10 years old now, more relevant than ever: Humans Need Not Apply.
Thorny_Insight@lemm.ee 5 months ago
I’m a bit surprised by how quickly I got tired of seeing AI content (mostly porn and non-nudes). Somehow it all just looks the same. You’d think that being AI-generated would give you infinite variety, but apparently not.
istanbullu@lemmy.ml 5 months ago
The same way people using shovels can’t keep up with an excavator.
Technology changes the world. This is nothing new.
Soundhole@lemm.ee 5 months ago
People spending that much time on their work can and should create things in meatspace.
hellothere@sh.itjust.works 5 months ago
It’s almost like low quality mechanisation is something that should be resisted. I wonder where I’ve heard that before…
the_crotch@sh.itjust.works 5 months ago
You heard it from traditional artists when the camera was invented
makyo@lemmy.world 5 months ago
And photographers when Photoshop was invented
DarkDarkHouse@lemmy.sdf.org 5 months ago
And it birthed Impressionism as a result. These are tools; artists will adapt.
yildolw@lemmy.world 5 months ago
Every gallery in the world did not rush out to exhibit every submitted photograph with no curation or quality filter when photography was invented
GBU_28@lemm.ee 5 months ago
A physical gallery has limited wall space. A website does not. AI art should just be tagged as such, so it can be filtered.
the_crotch@sh.itjust.works 5 months ago
If you’re implying that every gallery in the world is rushing to exhibit every submitted AI picture with no curation or quality filter, name 5.
Even_Adder@lemmy.dbzer0.com 5 months ago
“If photography is allowed to supplement art in some of its functions, it will soon have supplanted or corrupted it altogether, thanks to the stupidity of the multitude which is its natural ally.”
― Charles Baudelaire, On Photography, from The Salon of 1859
Ultragramps@lemmy.blahaj.zone 5 months ago
[image]
FiniteBanjo@lemmy.today 5 months ago
TBF, he was kind of right. If you look at the industry of wall art these days, 98% of what’s on people’s walls is printed imagery and copies. Imagine if we paid a real artist directly for every one of those framed and hung works, instead of giving the profit to some soulless corporation to make monotony incarnate.
TheBat@lemmy.world 5 months ago
yOuRe GaTeKePiNg!!!
Kusimulkku@lemm.ee 5 months ago
I don’t know what product that’d be desirable for. What did you have in mind?