> but we can reasonably assume that Stable Diffusion can render the image on the right partly because it has stored visual elements from the image on the left.
No, you cannot reasonably assume that. It absolutely did not store the visual elements. What it did was store some floating-point values associated with the keywords the source image had been pre-classified with. During training, it nudges those floating-point values up or down by a small amount each time it encounters further images that use those same keywords.
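To make that concrete, here’s a toy sketch of the kind of update loop being described (every name here is a hypothetical stand-in; real diffusion training minimizes a denoising loss over image/caption pairs, but the relevant point survives the simplification: each example only nudges shared numbers, and the image itself is never stored):

```python
import random

weights = [0.0] * 8        # stand-in for billions of shared model parameters
LEARNING_RATE = 1e-4       # each example moves the weights only slightly

def loss_gradient(weights, image, caption):
    """Placeholder for the real denoising-loss gradient."""
    return [random.uniform(-1.0, 1.0) for _ in weights]

def training_step(image, caption):
    # Nudge every weight a tiny amount, then discard the training example.
    grad = loss_gradient(weights, image, caption)
    for i, g in enumerate(grad):
        weights[i] -= LEARNING_RATE * g
```

If thousands of near-duplicate images all share the same rare caption, those tiny nudges keep pointing the same way, which is exactly the lack-of-diversity flaw described next.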
What the examples demonstrate is a lack of diversity in the training set for those very specific keywords. There’s a reason they chose Stable Diffusion 1.4 and not Stable Diffusion 2.0 (or later versions): the model was drastically improved after that point. These sorts of problems (with not-diverse-enough training data) are considered flaws by the very AI researchers creating the models. It’s exactly the type of thing they don’t want to happen!
The article seems to imply that this is a common problem that happens constantly and that the companies creating these AI models just don’t give a fuck. This is false. Flaws like this leave your model open to attack (and let competitors figure out your weights; not that it matters with Stable Diffusion, since that version is open source), not just to copyright lawsuits!
Here’s the part I don’t get: clearly, nobody is distributing copyrighted images by asking an AI to do its best to recreate them. When you do that, you end up with severely shitty hack images that nobody wants to look at. Basically, if no one is actually using these images except to say, “aha! My academic research uncovered this tiny flaw in your model that represents an obscure area of AI research!”, why TF should anyone care?
They shouldn’t! The only reason why articles like this get any attention at all is because it’s rage bait for AI haters. People who severely hate generative AI will grasp at anything to justify their position. Why? I don’t get it. If you don’t like it, just say you don’t like it! Why do you need to point to absolutely, ridiculously obscure shit like finding a flaw in Stable Diffusion 1.4 (from years ago, before 99% of the world had even heard of generative image AI)?
Generative AI is just the latest way of giving instructions to computers. That’s it! That’s all it is.
Nobody gave a shit about this kind of thing when Star Trek was pretending to do generative AI in the Holodeck. Now that we’ve got the pre-alpha version of that very thing, a lot of extremely vocal haters are freaking TF out.
Do you want the cool shit from Star Trek’s imaginary future or not? This is literally what computer scientists have been dreaming of for decades. It’s here! Have some fun with it!
Generative AI uses less power and water than streaming YouTube or Netflix (yes, it’s true). So if you’re about to say it’s bad for the environment, I expect you’re just as vocal about streaming video, yeah?
paraphrand@lemmy.world 2 days ago
It’s a very complicated compression algorithm.
cecilkorik@lemmy.ca 2 days ago
It’s so much like watching that Silicon Valley show, but a lot less funny.
baronvonj@lemmy.world 2 days ago
I just tried to go down the stairs 8 steps at a time.
It’s about precision.
Technus@lemmy.zip 2 days ago
It’s glorified autocorrect (/predictive text).
People fight me on this every time I say it, but it’s literally doing the same thing, just with a much longer lookbehind.
In fact, there’s probably a paper to be written about how LLMs are just lossily compressed Markov chains.
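For anyone who hasn’t played with one, here’s a toy word-level Markov chain, i.e. predictive text with a fixed lookbehind window (a minimal sketch; the corpus, the `order` parameter, and the function names are all just illustrative, and the comparison to LLMs is loose, but the next-token mechanic is the same family):

```python
import random
from collections import defaultdict

def train(text, order=2):
    """Count which word follows each `order`-word context (the lookbehind)."""
    words = text.split()
    table = defaultdict(list)
    for i in range(len(words) - order):
        context = tuple(words[i:i + order])
        table[context].append(words[i + order])
    return table

def generate(table, seed, order=2, length=20):
    """Repeatedly sample a statistically likely next word."""
    out = list(seed)
    for _ in range(length):
        candidates = table.get(tuple(out[-order:]))
        if not candidates:
            break
        out.append(random.choice(candidates))
    return " ".join(out)

corpus = "the cat sat on the mat and the cat ran off the mat"
table = train(corpus, order=2)
print(generate(table, ("the", "cat")))
```

Crank `order` up and feed it more text and you get increasingly fluent nonsense; the “lossily compressed” part is that an LLM squeezes what would be an astronomically large lookup table into a fixed set of weights, trading exact recall for generalization.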
LadyAutumn@lemmy.blahaj.zone 2 days ago
Kinda. But like, a compression algorithm that isn’t all that good at exact decompression. It’s really good at outputting text that makes you think “wow, that sounds pretty similar to what a person might write”. So even if it’s entirely wrong about something, that’s fine, as long as you’d look at it and be satisfied that its answer sounded right.
leftzero@lemmy.dbzer0.com 2 days ago
It stores the shape of the information, not the information itself.
Which might be useful from a statistics and analytics viewpoint, but isn’t very practical as an information storage mechanism.
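A quick toy illustration of the difference (nothing to do with any actual model internals, just the statistics analogy): keep the summary statistics and throw the data away.

```python
# Store the "shape" of the data (summary statistics), not the data itself.
data = [3, 7, 7, 8, 10, 12]
mean = sum(data) / len(data)
variance = sum((x - mean) ** 2 for x in data) / len(data)
print(mean, variance)  # ~7.83, ~7.81

# From (mean, variance) you can produce plausible look-alike values,
# but there is no way to recover the original six numbers from them.
```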
Prove_your_argument@piefed.social 2 days ago
Better search results than Google, though.
Unless it’s a handful of official pages or discussion forums… Google is practically unusable for me now. It absolutely exploded once ChatGPT came on the scene, and SEO has gotten so perfected that slop is almost all the results you get.
I wish we had some kind of downvote or report system to remove all the slop, but the more clicks, the more referral revenue… better to make people click more.
Honse@lemmy.dbzer0.com 2 days ago
No TF it’s not. The AI can only output the hallucinations that are most statistically likely. There’s no way to sort the bad answers from the good. Google at least supplies a wide range of content to sort through to find the best result.
leftzero@lemmy.dbzer0.com 2 days ago
Because they intentionally broke the search engines in order to make LLMs look better.
Search engines used to produce much more useful results than LLMs ever will, before Google and Microsoft started pushing this garbage.
then_three_more@lemmy.world 2 days ago
Only because most of the search results are themselves AI-generated.
CosmoNova@lemmy.world 2 days ago
That’s what I’ve been arguing for years. It’s not so different from printing out frames of a movie, scanning them in again, and claiming the result is a completely new art piece. Everything has been altered so much that it’s completely different. However, it’s still very much recognizable, with extremely little personal expression involved.
Oh, but you chose the paper and the printer, so it’s definitely your completely unique work, right? No, of course not.
AI works pretty much the same. You can tell what protected material the LLM was fed by the output of a given prompt. The theft already happened when the model was trained, and it’s not that hard to prove, really.
AI companies get away with the biggest heist in human history by being overwhelming, not by being something completely new and unregulated. These things are already regulated; the regulations are just being ignored. They have Big Tech, and therefore politics, to back them up, but definitely not the written law of any country that protects intellectual property.
Earthman_Jim@lemmy.zip 2 days ago
Complex predictive text that’s arranged into what is essentially a Rorschach test.