Only the third most confusing entry in the Kingdom Hearts series
OpenAI introduces Sora, its text-to-video AI model
Submitted 9 months ago by catculation@lemmy.zip to technology@lemmy.world
https://www.theverge.com/2024/2/15/24074151/openai-sora-text-to-video-ai
Comments
MermaidsGarden@lemmy.world 9 months ago
FoolHen@lemmy.world 8 months ago
Lol. And KH4 is gonna be about Sora being in the real world. This storyline is getting out of hand.
jownz@lemmy.world 8 months ago
The folks with access to this must be looking at some absolutely fantastic porn right now!
webghost0101@sopuli.xyz 8 months ago
Oh, it’s going to be fantastic all right.
Fantastical chimera monster porn, at least for the beginning.
dylanTheDeveloper@lemmy.world 8 months ago
‘obama giving birth’
myxi@feddit.nl 8 months ago
I don’t think they would make a model like this uncensored.
helpImTrappedOnline@lemmy.world 8 months ago
Honestly, let’s make it mainstream. Get it to a point where it’s more profitable to mass produce AI porn than to exploit young women from god knows where.
echo64@lemmy.world 9 months ago
Would be good if OpenAI could focus on things that are useful to humanity rather than trying to just do what we can do already, but with fewer jobs.
maniacal_gaff@lemmy.world 8 months ago
We already knew how to farm before John Deere; should we have focused away from agricultural industrialization in order to preserve jobs?
echo64@lemmy.world 8 months ago
looks at the immense harm that agricultural industrialization has had on the climate, the environment and society
Apparently yes.
Wanderer@lemm.ee 8 months ago
Working less is a great ideal for humanity.
Americans have this thing that their job defines them, but we work less than we did before. Let’s keep going.
lorty@lemmy.ml 8 months ago
Except the gains technology and automation bring are rarely evenly distributed in society. Just compare how productive a worker is today and how much we make compared to 50 years ago.
echo64@lemmy.world 8 months ago
1. Generally people want to work; people don’t want to be exploited by capitalists for a capitalist society where they barely make rent. Humans are generally workers. 2. This isn’t working less, this isn’t productivity improvement. This is less humanity in art, and all just so employers don’t need to spend money on workers.
CluckN@lemmy.world 8 months ago
Why pursue any of the arts if they do not benefit humanity?
Harbinger01173430@lemmy.world 8 months ago
Because they look good enough for the web stories or RP I make
fidodo@lemmy.world 8 months ago
If the natural state of technology is that there aren’t enough jobs to sustain an economy, then our economic system is broken, and trying to preserve obsolete jobs is just preserving the broken status quo that primarily benefits the rich. Over time I’m thinking more and more that instead of trying to prop up an outdated economic system we should just let it fail, and then we have no choice but to rethink it.
echo64@lemmy.world 8 months ago
Oh yes, yes, I’m sure that we will totally rethink our economic systems. That’s absolutely what will happen, and it will totally result in the utopia you’re dreaming of. I’m sure that will happen. I’m sure it’s not just the ultra wealthy noting how they can make even more profit whilst everyone else suffers. Can’t be that. I’m sure the government will do something. We all have faith in that. We know it’s obvious that will happen.
ndr@lemmy.world 9 months ago
This is so much better than all text-to-video models currently available. I’m looking forward to reading the paper, but I’m afraid they won’t say much about how they did this. Even if the examples are cherry-picked, this is mind blowing!
BetaDoggo_@lemmy.world 9 months ago
I’m looking forward to reading the paper
You mean the 100 page technical report
PerogiBoi@lemmy.ca 8 months ago
Just get ChatGPT to summarize it. Big brain time.
UndercoverUlrikHD@programming.dev 8 months ago
Looking forward to the day I can just copy paste the Silmarillion into a program and have it spit out a 20 hour long movie.
platypus_plumba@lemmy.world 8 months ago
I was thinking exactly this but with the Bible. Not because I like the Bible but because I’d love to see how AI interprets one of the most important books in human history.
But yeah, the Silmarillion is basically a Bible from another universe.
msage@programming.dev 8 months ago
Which is why Christians are scared of them. It will open people’s eyes to how anyone can write a fairytale. And so much better ones, too.
paulzy@lemmy.world 8 months ago
I wonder if in the 1800s people saw the first photograph and thought… “well, that’s the end of painters.” Others probably said “look! it’s so shitty it can’t even reproduce colors!!!”.
What it was the end of was talentless painters who were just copying what they saw. Painting stopped being for service and started being for art. That is where software development is going.
I have worked with hundreds of software developers in the last 20 years. Half of them were copy pasters who got into software because they tricked people into thinking it was magic. In the future we will still code; we just won’t bother with the things the Prompt Engineer can do in 5 seconds.
InvaderDJ@lemmy.world 8 months ago
What it was the end of was talentless painters who were just copying what they saw. Painting stopped being for service and started being for art. That is where software development is going.
I think a better way of saying this is that it was the end of people who were just doing it as a job, not out of a lot of talent or passion for painting.
But doing something just because it is a job is what a lot of people have to do to survive. Not everyone can have a profession that they love and have a passion for.
That’s where the problem comes in when it comes to generative AI.
Kedly@lemm.ee 8 months ago
And then the problem here is capitalism and NOT AI art. The capitalists are ALWAYS looking for ways to not pay us. If it wasn’t AI art, it was always going to be something else.
systemglitch@lemmy.world 8 months ago
I think that’s a bad analogy because of the whole being able to think part.
General_Effort@lemmy.world 8 months ago
It was exactly the same as with AI art. The same histrionics about the end of art and the dangers to society. It’s really embarrassing how unoriginal all this is.
Charles Baudelaire, father of modern art criticism, in 1859:
As the photographic industry was the refuge of every would-be painter, every painter too ill-endowed or too lazy to complete his studies, this universal infatuation bore not only the mark of a blindness, an imbecility, but had also the air of a vengeance. I do not believe, or at least I do not wish to believe, in the absolute success of such a brutish conspiracy, in which, as in all others, one finds both fools and knaves; but I am convinced that the ill-applied developments of photography, like all other purely material developments of progress, have contributed much to the impoverishment of the French artistic genius, which is already so scarce.
What it was the end of was talentless painters who were just copying what they saw. Painting stopped being for service and started being for art.
This attitude is not new, either. He addressed it thus:
I know very well that some people will retort, “The disease which you have just been diagnosing is a disease of imbeciles. What man worthy of the name of artist, and what true connoisseur, has ever confused art with industry?” I know it; and yet I will ask them in my turn if they believe in the contagion of good and evil, in the action of the mass on individuals, and in the involuntary, forced obedience of the individual to the mass.
fidodo@lemmy.world 8 months ago
The hardest part of coding is managing the project, not writing the content of one function. By the time LLMs can do that it’s not just programming jobs that will be obsolete, it will be all office jobs.
Flumpkin@slrpnk.net 8 months ago
This is still so bizarre to me. I’ve worked on 3D rendering engines trying to create realistic lighting and even the most advanced 3D games are pretty artificial. And now all of a sudden this stuff is just BAM super realistic. Not just that, but as a game designer you could create an entire game by writing text and some logic.
ArmokGoB@lemmy.dbzer0.com 8 months ago
In my experience as a game designer, the code that LLMs spit out is pretty shit. It won’t even compile half the time, and when it does, it won’t do what you want without significant changes.
DSTGU@sopuli.xyz 8 months ago
The correct usage of LLMs in coding is for a single use case at a time, building up to what you need from scratch. It requires skill in talking to the AI so that it gives you what you want, in knowing how to build up to it, in reading the code it spits out so that you know when it goes south, and in actually knowing how to build the bigger-picture software from little pieces. But if you are an intermediate dev who is stuck on something, it is a great help.
That, or for rubber ducky debugging. It’s also great at that.
kspatlas@lemm.ee 8 months ago
ChatGPT once insisted my JSON was actually YAML
nucleative@lemmy.world 8 months ago
Welcome to the club my friend… Expert after expert has had this experience as AI has developed over the past couple of years and we discover that the job can be automated way more than we thought.
First it was the customer service chat agents. Then it was the writers. Then it was the programmers. Then it was the graphic design artists. Now it’s the animators.
EnderMB@lemmy.world 8 months ago
Another programmer here. The bottleneck in most jobs isn’t in getting boilerplate out, which is where AI excels. It’s in that first and/or last 10-20%, alongside deciding what patterns are suitable for your problem, what proprietary tooling you’ll need to use, what APIs you’re hitting and what has changed in recent weeks/months.
What AI is achieving is impressive, but as someone that works in AI, I think that we’re seeing a two-fold problem: we’re seeing a limit of what these models can accomplish with their training data, and we’re seeing employers hedge their bets on weaker output with AI over specialist workers.
The former is a great problem, because this tooling could be adjusted to make workers’ lives far easier/faster, in the same way that many tools have done so already. The latter is a huge problem, as in many skilled worker industries we’ve seen waves of layoffs, and years of enshittification resulting in poorer products.
The latter is also where I think we’ll see a huge change in culture. IMO, we’ll see existing companies bet it all and die from supporting AI over people, and a new wave of companies focus on putting output of a certain standard to take on larger companies.
HeavyDogFeet@lemmy.world 8 months ago
Writer here, absolutely not having this experience. Generative AI tools are bad at writing, but people generally have a pretty low bar for what they think is good enough.
These things are great if you care about tech demos and not quality of output. If you actually need the end result to be good though, you’re gonna be waiting a while.
Traister101@lemmy.today 8 months ago
Still waiting on the programmer part. In a nutshell, AI being, say, 90% perfect means you have 90% working code, i.e. 10% broken code. Images and video (but not sound) are way easier because human eyes kinda just suck. A couple of the videos they’ve released pass even at a pretty long glance. You only notice the funny business once you look closer.
Flumpkin@slrpnk.net 8 months ago
Yeah. And it’s not just how good the images look, it’s also the creativity. Everyone tries to downplay this, but I’ve read the texts and watched those videos, and just from the prompts there is a “creative spark” there. It’s not a very bright spark lol, but it’s there.
I should get into this stuff but I feel old lol. I imagine you could generate interesting levels with obstacles and riddles and “story beats” too.
General_Effort@lemmy.world 8 months ago
I can’t imagine that digital artists/animators have reason to worry. At the upper end, animated movies will simply get flashier, eating up all the productivity gains. In live action, more effects will be pure CGI. At the bottom end, we may see productions hiring VFX artists, just as naturally as they hire makeup artists now.
When something becomes cheaper, people buy more of it, until their demand is satisfied. With food, we are well past that point. I don’t think we are anywhere near that point with visual effects.
genesis@kbin.social 8 months ago
It seems to me that AI won't completely replace jobs yet (though it will in 10-20 years), but it will reduce demand for workers because of oversaturation plus ultraproductivity with AI. Moreover, AI will continue to improve. The work of a team of 30 people will be done by just 3.
FatCrab@lemmy.one 8 months ago
Keep in mind that this isn’t creating 3d Billy volumes at all. While immensely impressive, the thing being created by this architecture is a series of 2d frames.
fidodo@lemmy.world 8 months ago
Because it’s trained on videos of the real world, not on 3d renderings.
Flumpkin@slrpnk.net 8 months ago
Lol you don’t know how cruel that is. For decades programmers have devoted their passion to creating hyperrealistic games and 3D graphics in general, and now poof it’s here like with a magic wand and people say “yeah well you should have made your 3D engine look like the real world, not to look like shit” :D
gravitas_deficiency@sh.itjust.works 8 months ago
Ah yes, this definitely won’t have any negative ramifications.
/s
JoeKrogan@lemmy.world 8 months ago
Besides the few glitchy ones, I wouldn’t be able to tell they were generated. I didn’t expect it this quick.
At least we can remake the last three Star Wars movies with a decent story line.
tiredofsametab@kbin.run 8 months ago
If you read Japanese, it's really obvious the Tokyo one is AI; the signage largely makes no sense, has incorrect characters, has weird mixing of characters, etc. Even the
KingJalopy@lemm.ee 8 months ago
Someone wrote a decent story line for those??
davidgro@lemmy.world 8 months ago
Back to ChatGPT for that.
JoeKrogan@lemmy.world 8 months ago
There were books out for years that Disney just didn’t bother with. They can’t be worse than what we got.
Flumpkin@slrpnk.net 8 months ago
There are tons of books. Afaik the main storyline was an extragalactic invasion by a super evil swarm. It also explains why the emperor built so many ships.
fidodo@lemmy.world 8 months ago
Unpopular opinion, but I actually liked the high level story of it, but I think it could have been told way way better.
sndrtj@feddit.nl 8 months ago
The mammoth one is uncanny valley for me.
EdibleFriend@lemmy.world 9 months ago
redcalcium@lemmy.institute 9 months ago
If this goes well, future video compression might take a massive leap. Imagine downloading 2-hour movies with just a 20 KB file size because it’s just a bunch of prompts under the hood.
draxil@lemmy.world 9 months ago
This would be the most GPU intensive compression algorithm of all time :)
lea@feddit.de 8 months ago
And the largest ever decoder, since it’ll need the whole model to work. I’m not particularly knowledgeable on AI, but I’ll assume this will occupy hundreds of gigabytes; correct me if I’m wrong there. In comparison, libdav1d, an AV1 decoder, weighs less than 2 MB.
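A rough back-of-the-envelope sketch of that trade-off, in Python. The 20 KB prompt file and the 2 MB decoder are the figures mentioned in this thread; the 200 GB model size and the 2 GB size for a conventionally encoded movie are purely made-up assumptions for illustration, not measurements of any real model or codec.

```python
# Back-of-the-envelope: when would "ship the prompt, not the pixels" pay off?
# All sizes are illustrative; the two marked ASSUMED are invented for this sketch.
KB, MB, GB = 10**3, 10**6, 10**9

PROMPT_FILE = 20 * KB      # per-movie prompt file (figure from the thread)
MODEL_SIZE = 200 * GB      # ASSUMED one-time download of the generative model
CLASSIC_MOVIE = 2 * GB     # ASSUMED size of a conventionally encoded 2-hour movie
CLASSIC_DECODER = 2 * MB   # one-time decoder download (figure from the thread)


def total_download(n_movies, per_movie, one_time):
    """Total bytes fetched: one-time decoder/model plus one file per movie."""
    return one_time + n_movies * per_movie


def break_even_movies():
    """Smallest number of movies at which the prompt approach stops costing more."""
    saving_per_movie = CLASSIC_MOVIE - PROMPT_FILE
    extra_one_time = MODEL_SIZE - CLASSIC_DECODER
    # Integer ceiling division, avoids floating point rounding.
    return -(-extra_one_time // saving_per_movie)


if __name__ == "__main__":
    print("break-even after", break_even_movies(), "movies")
```

With these made-up numbers the break-even lands at 100 movies, so the tiny per-movie file only wins for people who watch a lot, and that is before counting the GPU time needed to "decode".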
r00ty@kbin.life 9 months ago
If you randomize the seed it'll be a different render of the movie every time.
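A toy sketch of why the seed would have to ship with the prompt. `render_movie` here is a hypothetical stand-in, not a real video model or any actual API: it just draws pseudo-random numbers as fake "frames", which is enough to show that a fixed seed reproduces the same output and a different seed does not.

```python
import random


def render_movie(prompt, seed, n_frames=8):
    """Hypothetical stand-in for a generative decoder: returns fake 'frames'.

    Each frame is just an int in [0, 255]; a real model would be far more
    complex, but the dependence on (prompt, seed) is the point here.
    """
    # Seeding with a string is deterministic across runs in Python 3.
    rng = random.Random(f"{prompt}|{seed}")
    return [rng.randrange(256) for _ in range(n_frames)]


if __name__ == "__main__":
    first = render_movie("The Silmarillion", seed=1)
    again = render_movie("The Silmarillion", seed=1)
    other = render_movie("The Silmarillion", seed=2)
    print("same seed, same movie:", first == again)
    print("new seed, same movie:", first == other)
```

So a prompt-only "file" would be a different cut on every playback unless the seed (and the exact model version) is fixed alongside it.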
KingJalopy@lemm.ee 8 months ago
"But you haven’t seen the ultimate limited edition fan version action cut of the director’s cut"
CluckN@lemmy.world 9 months ago
Sounds like you already saw Madame Web
tiny_electron@sh.itjust.works 9 months ago
The quality is really superior to what was shown with Lumiere. Even if this is cherry picking, it seems miles above the competition.
ndr@lemmy.world 9 months ago
barsoap@lemm.ee 8 months ago
The second one is easy as you don’t need coherence between reflected and non-reflected stuff: Only the reflection is visible. The second one has lots of inconsistencies: It works kinda well if the reflected thing and reflection are close together in the image, it does tend to copy over uniformly-coloured tall lights, but OTOH it also invents completely new things.
Do people notice? Well, it depends. People do notice screen-space reflections being off in traditional rendering pipelines, not always, but it happens and those AI reflections are the same kind of “mostly there in most situations but let’s cheap out to make it computationally feasible” type of deal: Ultimately processing information, tracking influence of one piece of data throughout the whole scene, comes with a minimum amount of required computational complexity and neither AI nor SSR do it.
tiny_electron@sh.itjust.works 8 months ago
Yeah, we won’t be needing proper raytracing with this kind of tech. It’s mind blowing.
sleepmode@lemmy.world 8 months ago
After seeing the horrific stuff my demented friends have made DALL-E barf out, I’m excited and afraid at the same time.
Carighan@lemmy.world 8 months ago
The example videos are both impressive (insofar as they exist) and dreadful. Two-legged horses everywhere, lots of random half-human-half-horse hybrids, walls change materials constantly, etc.
It really feels like all this does is generate 60 DALL-E images per second and little else.
archomrade@midwest.social 8 months ago
For the limitations visual AI tends to have, this is still better than what I’ve seen. Objects and subjects seem pretty stable from frame to frame, even if those objects are quite nightmarish.
I think “Will Smith eating spaghetti” was only like a year ago
Natanael@slrpnk.net 8 months ago
This would work very well with a text adventure game, though. A lot of them are already set in fantasy worlds with cosmic horrors everywhere, so this would fit well to animate what’s happening in the game
Theharpyeagle@lemmy.world 8 months ago
I mean, it took a couple months for AI to mostly figure out that hand situation. Video is, I’d assume, a different beast, but I can’t imagine it won’t improve almost as fast.
fidodo@lemmy.world 8 months ago
It will get better, but in the meantime you just manually tell the AI to try again or adjust your prompt. I don’t get the negativity about it not being perfect right off the bat. When the magic wand tool originally came out, it had tons of jagged edges. That didn’t make it useless, it just meant it did a good chunk of the work for you and you just needed to manually get it the rest of the way there. With Stable Diffusion, if I get a bad hand I just inpaint and regenerate until it’s fixed. If you don’t get the composition you want, just generate parts of the scene, combine them in an image editor, then have it use that as a base image to generate on top of.
They’re showing you the raw output to show off the capabilities of the base model. In practice you would review the output and manually fix anything that’s broken. Sure, you’ll get people too lazy to even do that, but non-lazy people will be able to do really impressive things with this even in its current state.
ThePowerOfGeek@lemmy.world 8 months ago
YouTube is about to get flooded by the weirdest meme videos. We thought it was bad already, we ain’t seen nothing yet.
just_another_person@lemmy.world 9 months ago
Who’s benefiting from this? Why is this even a fucking thing?
Vex_Detrause@lemmy.ca 8 months ago
Imagine VR with an AI-generated world. It would be Ready Player One IRL.
4grams@awful.systems 9 months ago
Looks good but still has the AI hallmarks, rotating legs, f’ed up gait… impressive though, and it’s going to be wild to see what results from this latest pox on the tubes.
anguo@lemmy.ca 8 months ago
Her legs rotate around themselves and flip sides at 16s in. It’s still very impressive, but …yeah.
dylanTheDeveloper@lemmy.world 8 months ago
Shit posting 2.0 is here fellas
1984@lemmy.today 9 months ago
I’m really impressed by the demo, but yes, let’s see how well it works when it’s made public.
People who don’t think AI will take a lot of jobs may have to rethink…
jeena@jemmy.jeena.net 9 months ago
The cat video is funny, the cat has 5 legs :D
autotldr@lemmings.world [bot] 9 months ago
This is the best summary I could come up with:
Sora is capable of creating “complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background,” according to OpenAI’s introductory blog post.
The company also notes that the model can understand how objects “exist in the physical world,” as well as “accurately interpret props and generate compelling characters that express vibrant emotions.”
Many have some telltale signs of AI — like a suspiciously moving floor in a video of a museum — and OpenAI says the model “may struggle with accurately simulating the physics of a complex scene,” but the results are overall pretty impressive.
A couple of years ago, it was text-to-image generators like Midjourney that were at the forefront of models’ ability to turn words into images.
But recently, video has begun to improve at a remarkable pace: companies like Runway and Pika have shown impressive text-to-video models of their own, and Google’s Lumiere figures to be one of OpenAI’s primary competitors in this space, too.
It notes that the existing model might not accurately simulate the physics of a complex scene and may not properly interpret certain instances of cause and effect.
The original article contains 395 words, the summary contains 190 words. Saved 52%. I’m a bot and I’m open source!
nyakojiru@lemmy.dbzer0.com 8 months ago
Shit is going too far, as expected, and governments don’t give a fuck about societies. Only in the EU are there a few humane movements.
eagles_fan@discuss.tchncs.de 8 months ago
This can only be bad for artists and if you are happy about it you are a fascist
MonkderZweite@feddit.ch 8 months ago
I’m pretty sure that’s a model tho.
ton618@lemm.ee 9 months ago
The demo looks pretty good, yes - but I won’t believe it 'till I try it!
catherine@lemmy.world 8 months ago
Imgonnatrythis@sh.itjust.works 8 months ago
I know people have been scared by new technology since technology, but I’ve never before fallen into that camp until now. I have to admit, this really does frighten me.
Plopp@lemmy.world 8 months ago
Boo! [image]
catastrophicblues@lemmy.ca 8 months ago
What’s wild to me is how Yann LeCun doesn’t seem to see this as an issue at all. Many other leading researchers (Yoshua Bengio, Geoffrey Hinton, Frank Hutter, etc.) signed that letter on the threats of AI and LeCun just posts on Twitter and talks about how we’ll just “not build” potentially harmful AI. Really makes me lose trust in anything else he says.
Thorny_Insight@lemm.ee 8 months ago
Right there with you. This is really worrying to me. This technology is advancing way faster than we’re adjusting to it. I haven’t even gotten over how amazing GPT-3.5 is, but most people already seem to be taking it for granted. We didn’t have anything even close to this just a few years ago.
fidodo@lemmy.world 8 months ago
To make that statement a little more accurate, I’m afraid of the humans that will abuse this technology and of society’s ability to adapt to it. There are some amazingly cool things that can come about from this: all the small indie creators that lack the connections and project management skills to make their ambitions come to life will be able to achieve their vision, and that’s really cool and I’m excited for that. But my excitement is smashed by knowing all the bad that will come with this.