The physics and perspective are still horrible; every video makes me want to vomit my brain
Comment on Google is going ‘all in’ on AI. It’s part of a troubling trend in big tech
auraithx@lemmy.dbzer0.com 3 days ago
Google just released a video generator that is a ball hair away from perfection. The hallucination rate from their latest models is <1% and dropping; you just see cherry-picked screenshots.
ParadoxSeahorse@lemmy.world 3 days ago
auraithx@lemmy.dbzer0.com 3 days ago
Only sometimes; with enough generations you can already make indistinguishable videos for the most part. You’re seeing these mistakes because it’s amateurs spending $100, not professionals spending $10k.
ParadoxSeahorse@lemmy.world 3 days ago
I was referring to the demos shown at I/O, my brain seems to see them all as “flat”, it’s like a reverse magic eye thing. It reminds me of the uncomfortableness of a fever dream. I may be the exception tbf!
auraithx@lemmy.dbzer0.com 3 days ago
I showed this one to my friend and she said ‘But the faces aren’t AI-generated, right?’
www.reddit.com/…/were_cooked_a_zerocost_ai_demo/
There’s a bit at the end where the spaghetti disappears and the chef walks away a bit quickly while still speaking, but otherwise it’s nearly flawless.
For scenes with lots of action and complex physics it’s still very noticeable
old.reddit.com/r/…/pushing_veo_3_to_the_limit/
But it’s already good enough to replace several scenes in blockbusters. Dream scenes, cut-aways, etc.
Look at this sausage dog
xcancel.com/nmatares/status/1924931844879134804
It even gets the audio right when it moves between hardwood and carpet.
echodot@feddit.uk 3 days ago
I don’t think image generators are really in the same category though. They’ll have their applications but they’re not going to be a fundamental change to society the way AGI will be.
auraithx@lemmy.dbzer0.com 3 days ago
It’s part of AGI and will be a massive shift. They are to video what punk was to music.
ThirdConsul@lemmy.ml 3 days ago
AGI and image diffusion have literally nothing in common, though?
auraithx@lemmy.dbzer0.com 3 days ago
Yes it does. It’s one component of a broader system. The ability to generate helps it interpret. An AGI might use a diffusion model to imagine scenarios, generate visual plans, or process sensory input.