Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all.
Submitted 5 weeks ago by Allah@lemm.ee to technology@lemmy.world
Comments
SoftestSapphic@lemmy.world 5 weeks ago
Wow, it’s almost like the computer scientists were saying this from the start but were shouted over by marketing teams.
zbk@lemmy.ca 5 weeks ago
This! Capitalism is going to be the end of us all. OpenAI has gotten away with IP theft, disinformation regarding AI, and maybe even the murder of their whistleblower.
technocrit@lemmy.dbzer0.com 5 weeks ago
It’s hard to be heard when you’re buried under all that sweet VC/grant money.
aidan@lemmy.world 5 weeks ago
And engineers who stood to make a lot of money
BlushedPotatoPlayers@sopuli.xyz 5 weeks ago
For me it kinda went the other way. I’m almost convinced that human intelligence is the same pattern repeating, just more general (for now).
raspberriesareyummy@lemmy.world 5 weeks ago
Except that wouldn’t explain consciousness. There’s absolutely no need for consciousness, or an illusion(*) of consciousness. Yet we have it.
(*) arguably, consciousness can by definition not be an illusion. We either perceive “ourselves” or we don’t
minoscopede@lemmy.world 5 weeks ago
I see a lot of misunderstandings in the comments 🫤
This is a pretty important finding for researchers, and it’s not obvious by any means. This finding is not showing a problem with LLMs’ abilities in general. The issue they discovered is more likely that the training is not right, specifically for so-called “reasoning models” that iterate on their answer before replying.
Most reasoning models are not incentivized to think correctly, and are only rewarded based on their final answer. This research might indicate that’s a flaw that needs to be corrected. If so, that opens the door for experimentation on more rigorous training processes that could lead to more capable models that actually do “reason”.
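To make that concrete, here is a minimal, purely hypothetical sketch (in Python, with made-up function names and a toy step checker, not any specific lab’s training setup) of the difference between rewarding only the final answer and scoring each intermediate step:

```python
# Hypothetical sketch: outcome-only reward vs. a step-level ("process") reward.
# Names and reward values are illustrative, not taken from any real training pipeline.

def outcome_reward(final_answer: str, correct_answer: str) -> float:
    """Rewards only the final answer; the chain of thought is never scored."""
    return 1.0 if final_answer.strip() == correct_answer.strip() else 0.0

def process_reward(steps: list[str], step_checker) -> float:
    """Scores each intermediate step, so flawed reasoning is penalized
    even when the model stumbles onto the right final answer."""
    if not steps:
        return 0.0
    return sum(1.0 for s in steps if step_checker(s)) / len(steps)

# Toy usage: a sloppy chain of thought that still lands on the right answer.
steps = ["2 + 2 = 5", "therefore the answer is 4"]
print(outcome_reward("4", "4"))                       # 1.0 -- looks perfect
print(process_reward(steps, lambda s: "5" not in s))  # 0.5 -- flags the bad step
```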
Knock_Knock_Lemmy_In@lemmy.world 5 weeks ago
When given explicit instructions to follow, models failed because they had not seen similar instructions before.
This paper shows that there is no reasoning in LLMs at all, just extended pattern matching.
MangoCats@feddit.it 5 weeks ago
I’m not trained or paid to reason, I am trained and paid to follow established corporate procedures. On rare occasions my input is sought to improve those procedures, but the vast majority of my time is spent executing tasks governed by a body of (not quite complete, sometimes conflicting) procedural instructions.
If AI can execute those procedures as well as, or better than, human employees, I doubt employers will care if it is reasoning or not.
theherk@lemmy.world 5 weeks ago
Yeah these comments have the three hallmarks of Lemmy:
- “AI is just autocomplete” mantras.
- Apple is always synonymous with bad and dumb.
- Rare pockets of really thoughtful comments.
Thanks for being at least the latter.
REDACTED@infosec.pub 5 weeks ago
What confuses me is that we seemingly keep pushing away what counts as reasoning. Not too long ago, some smart algorithms or a bunch of instructions for software (if/then) officially counted, by definition, as software/computer reasoning. Logically, CPUs do it all the time. Suddenly, when AI is doing that with pattern recognition, memory and even more advanced algorithms, it’s no longer reasoning? I feel like at this point a more relevant question is “What exactly is reasoning?”. Before you answer, understand that most humans seemingly live by pattern recognition, not reasoning.
stickly@lemmy.world 5 weeks ago
If you want to boil down human reasoning to pattern recognition, the sheer amount of stimuli and associations built off of that input absolutely dwarfs anything an LLM will ever be able to handle. It’s like comparing PhD reasoning to a dog’s reasoning.
While a dog can learn some interesting tricks and the smartest dogs can solve simple novel problems, there are hard limits. They simply lack strong metacognition and the ability to make simple logical inferences (e.g. why they fail at the shell game).
Now we make that chasm even larger by cutting the stimuli to a fixed token limit. An LLM can do some clever tricks within that limit, but it’s designed to do exactly those tricks and nothing more. To get anything resembling human ability you would have to design something to match human complexity, and we don’t have the tech to make a synthetic human.
technocrit@lemmy.dbzer0.com 5 weeks ago
Sure, these grifters are shady AF about their wacky definition of “reason”… But that’s just a continuation of the entire “AI” grift.
MangoCats@feddit.it 5 weeks ago
I think as we approach the uncanny valley of machine intelligence, it’s no longer a cute cartoon but a menacing, creepy, not-quite imitation of ourselves.
Zacryon@feddit.org 5 weeks ago
Some AI researchers found it obvious as well, in the sense that they’d suspected it and had some indications. But it’s good to see more data affirming this assessment.
kreskin@lemmy.world 5 weeks ago
Lots of us who did some time in search and relevancy early on knew this was always largely breathless, overhyped marketing.
jj4211@lemmy.world 5 weeks ago
Particularly to counter some more baseless marketing assertions about the nature of the technology.
technocrit@lemmy.dbzer0.com 5 weeks ago
There’s probably a lot of misunderstanding because these grifters intentionally use misleading language: AI, reason, etc.
Tobberone@lemm.ee 5 weeks ago
What statistical method do you base that claim on? The results presented match expectations given that Markov chains are still the basis of inference. What magic juice is added to “reasoning models” that allow them to break free of the inherent boundaries of the statistical methods they are based on?
minoscopede@lemmy.world 5 weeks ago
I think you might not be using the vocabulary correctly. The statement “Markov chains are still the basis of inference” doesn’t make sense, because Markov chains are a separate thing. You might be thinking of Markov decision processes, which are used in training RL agents, but that’s also unrelated, because these models are not RL agents; they’re trained with supervised learning. And even if they were RL agents, the MDP describes the training environment, not the model itself, so it’s not really used for inference.
I’d encourage you to research more about this space and learn more. We need more people who are skeptical of AI doing research in this field, and many of us in the research community would be more than happy to welcome you into it.
Allah@lemm.ee 5 weeks ago
Cognitive scientist Douglas Hofstadter (1979) showed reasoning emerges from pattern recognition and analogy-making - abilities that modern AI demonstrably possesses. The question isn’t if AI can reason, but how its reasoning differs from ours.
mavu@discuss.tchncs.de 5 weeks ago
No way!
Statistical Language models don’t reason?
But OpenAI, robots taking over!
Jhex@lemmy.world 5 weeks ago
this is so Apple, claiming to invent or discover something “first” 3 years later than the rest of the market
postmateDumbass@lemmy.world 5 weeks ago
Trust Apple. Everyone else who was in the space first is lying.
billwashere@lemmy.world 5 weeks ago
When are people going to realize that, in its current state, an LLM is not intelligent? It doesn’t reason. It does not have intuition. It’s a word predictor.
x0x7@lemmy.world 5 weeks ago
Intuition is about the only thing it has. It’s a statistical system. The problem is it doesn’t have logic. We assume that because it’s computer-based it must be more logic-oriented, but it’s the opposite. That’s the problem. We can’t get it to do logic very well because it basically feels out the next token by something like instinct. In particular, it doesn’t mask or disregard irrelevant information very well if two segments are near each other in embedding space, which doesn’t guarantee relevance. So the model just weighs all of this info, relevant or irrelevant, into a weighted feeling for the next token.
This is the core problem. People can handle fuzzy topics and discrete topics. But we really struggle to create any system that can do both like we can. Either we create programming logic that is purely discrete or we create statistics that are fuzzy.
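A toy illustration of that last point, with made-up vectors standing in for real embeddings: closeness in embedding space is only a similarity score, not a relevance judgment.

```python
import math

def cosine(a, b):
    # Cosine similarity: the usual "nearness" measure in embedding space.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query         = [0.9, 0.1, 0.3]  # say, "interest rate on my loan"
relevant_doc  = [0.8, 0.2, 0.4]  # genuinely about loan interest
lookalike_doc = [0.9, 0.1, 0.2]  # about "interest" in hobbies: nearby, but irrelevant

print(cosine(query, relevant_doc))   # high
print(cosine(query, lookalike_doc))  # also high; proximity alone can't filter it out
```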
Slaxis@discuss.tchncs.de 5 weeks ago
You had a compelling description of how ML models work and just had to swerve into politics, huh?
NotASharkInAManSuit@lemmy.world 5 weeks ago
People think they want AI, but they don’t even know what AI is on a conceptual level.
Buddahriffic@lemmy.world 5 weeks ago
They want something like the Star Trek computer or one of Tony Stark’s AIs that were basically deus ex machinas for solving some hard problem behind the scenes. Then it can say “model solved” or they can show a test simulation where the ship doesn’t explode (or sometimes a test where it only has an 85% chance of exploding when it used to be 100%, at which point human intuition comes in and saves the day by suddenly being better than the AI again and threads that 15% needle or maybe abducts the captain to go have lizard babies with).
AIs that are smarter than us but for some reason don’t replace or even really join us (Vision being an exception to the 2nd, and Ultron trying to be an exception to the 1st).
technocrit@lemmy.dbzer0.com 5 weeks ago
Yeah I often think about this Rick N Morty cartoon. Grifters are like, “We made an AI ankle!!!” And I’m like, “That’s not actually something that people with busted ankles want. They just want to walk, no need for sentience.”
SaturdayMorning@lemmy.ca 5 weeks ago
I agree with you. In its current state, an LLM is not sentient, and thus not “intelligent”.
MouldyCat@feddit.uk 5 weeks ago
I think it’s an easy mistake to confuse sentience and intelligence. It happens in Hollywood all the time - “Skynet began learning at a geometric rate, on July 23 2004 it became self-aware” yadda yadda
But that’s not how sentience works. We don’t have to be as intelligent as Skynet supposedly was in order to be sentient. We don’t start our lives as unthinking robots, and then one day - once we’ve finally got a handle on calculus or a deep enough understanding of the causes of the fall of the Roman empire - we suddenly blink into consciousness. On the contrary, even the stupidest humans are accepted as being sentient. Even a young child, not yet able to walk or do anything more than vomit on their parents’ new sofa, is considered as a conscious individual.
So there is no reason to think that AI - whenever it should be achieved, if ever - will be conscious any more than the dumb computers that precede it.
StereoCode@lemmy.world 5 weeks ago
You’d think the M in LLM would give it away.
jj4211@lemmy.world 5 weeks ago
And that’s pretty damn useful, but it’s obnoxious to have expectations set so wildly incorrectly.
sev@nullterra.org 5 weeks ago
Just fancy Markov chains with the ability to link bigger and bigger token sets. It can only ever kick off processing as a response and can never initiate any line of reasoning. This, along with the fact that its working set of data can never be updated moment-to-moment, means that it would be a physical impossibility for any LLM to achieve any real "reasoning" processes.
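For anyone unfamiliar with the term, a literal Markov chain text generator looks something like the toy sketch below (whether LLMs are fairly described this way is debated further down the thread):

```python
# A bare-bones word-level Markov chain: the "state transition table" is counted
# directly from text, and generation only ever looks at the last `order` words.
import random
from collections import defaultdict

def build_table(text: str, order: int = 1):
    words = text.split()
    table = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        table[state].append(words[i + order])
    return table

def generate(table, state, length=10):
    out = list(state)
    for _ in range(length):
        choices = table.get(tuple(out[-len(state):]))
        if not choices:
            break
        out.append(random.choice(choices))
    return " ".join(out)

corpus = "the model predicts the next word and the next word predicts the rest"
table = build_table(corpus, order=1)
print(generate(table, ("the",)))
```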
kescusay@lemmy.world 5 weeks ago
I can envision a system where an LLM becomes one part of a reasoning AI, acting as a kind of fuzzy “dataset” that a proper neural network incorporates and reasons with, and the LLM could be kept real-time updated (sort of) with MCP servers that incorporate anything new it learns.
But I don’t think we’re anywhere near there yet.
riskable@programming.dev 5 weeks ago
The only reason we’re not there yet is memory limitations.
Eventually some company will come out with AI hardware that lets you link up a petabyte of ultra fast memory to chips that contain a million parallel matrix math processors. Then we’ll have an entirely new problem: AI that trains itself incorrectly too quickly.
Just you watch: The next big breakthrough in AI tech will come around 2032-2035 (when the hardware is available) and everyone will be bitching that “chain reasoning” (or whatever the term turns out to be) isn’t as smart as everyone thinks it is.
homura1650@lemm.ee 5 weeks ago
LLMs (at least in their current form) are proper neural networks.
auraithx@lemmy.dbzer0.com 5 weeks ago
Unlike Markov models, modern LLMs use transformers that attend to full contexts, enabling them to simulate structured, multi-step reasoning (albeit imperfectly). While they don’t initiate reasoning like humans, they can generate and refine internal chains of thought when prompted, and emerging frameworks (like ReAct or Toolformer) allow them to update working memory via external tools. Reasoning is limited, but not physically impossible; it’s evolving beyond simple pattern-matching toward more dynamic and compositional processing.
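For concreteness, here is a minimal single-head self-attention sketch in NumPy (toy dimensions, no positional encoding, masking, or learned projections), showing mechanically what “attends to the full context” means:

```python
# Minimal scaled dot-product self-attention: every position gets a softmax
# weight over every other position, rather than a fixed-order transition table.
import numpy as np

def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (seq, seq) pairwise scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over the whole context
    return weights @ V                                # each output mixes all positions

seq_len, d_model = 4, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(seq_len, d_model))               # toy "token embeddings"
out = attention(x, x, x)                              # self-attention over all 4 tokens
print(out.shape)                                      # (4, 8)
```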
spankmonkey@lemmy.world 5 weeks ago
Reasoning is limited
Most people wouldn’t call zero of something ‘limited’.
riskable@programming.dev 5 weeks ago
I’m not convinced that humans don’t reason in a similar fashion. When I’m asked to produce pointless bullshit at work my brain puts in a similar level of reasoning to an LLM.
Think about “normal” programming: An experienced developer (that’s self-trained on dozens of enterprise code bases) doesn’t have to think much at all about 90% of what they’re coding. It’s all bog standard bullshit so they end up copying and pasting from previous work, Stack Overflow, etc because it’s nothing special.
The remaining 10% is “the hard stuff”. They have to read documentation, search the Internet, and then—after all that effort to avoid having to think—they sigh and actually start thinking in order to program the thing they need.
LLMs go through similar motions behind the scenes! Probably because they were created by software developers. But they still fail at that last 10%: the stuff that requires actual thinking.
Eventually someone is going to figure out how to auto-generate LoRAs based on test cases combined with trial and error, which then get used by the AI model to improve itself, and that is when people are going to be like, “Oh shit! Maybe AGI really is imminent!” But again, they’ll be wrong.
AGI won’t happen until AI models get good at retraining themselves with something better than basic reinforcement learning. In order for that to happen you need the working memory of the model to be nearly as big as the hardware that was used to train it. That, and loads and loads of spare matrix math processors ready to go for handling that retraining.
vrighter@discuss.tchncs.de 5 weeks ago
Previous input goes in. A completely static, prebuilt model processes it and comes up with a probability distribution.
There is no “unlike Markov chains”. They are Markov chains. Ones with a long context (a Markov chain also makes use of all the context provided to it, so I don’t know what you’re on about there). LLMs are just a (very) lossy compression scheme for the state transition table. Computed once, applied blindly to any context fed in.
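A sketch of the loop being described, with a stand-in `model` function (not a real LLM, just a hypothetical placeholder) to show that inference is a frozen function from context to a next-token distribution, applied over and over:

```python
# The only "state" is the growing context; nothing in the model changes between calls.
import random

def model(context: tuple) -> dict:
    """Stand-in for frozen weights: context in, probability distribution out."""
    vocab = ["the", "cat", "sat", "down", "."]
    seed = hash(context) % (2**32)          # deterministic pseudo-distribution
    rng = random.Random(seed)               # derived only from the context
    weights = [rng.random() for _ in vocab]
    total = sum(weights)
    return {tok: w / total for tok, w in zip(vocab, weights)}

def generate(prompt, steps=5):
    context = tuple(prompt)
    for _ in range(steps):
        dist = model(context)               # same context -> same distribution
        next_tok = max(dist, key=dist.get)  # greedy pick for the demo
        context = context + (next_tok,)     # append and repeat
    return " ".join(context)

print(generate(["the", "cat"]))
```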
brsrklf@jlai.lu 5 weeks ago
You know, despite not really believing LLM “intelligence” works anywhere like real intelligence, I kind of thought maybe being good at recognizing patterns was a way to emulate it to a point…
But that study seems to prove they’re still not even good at that. At first I was wondering how hard the puzzles must have been, and then there’s a bit about LLMs finishing 100-move Towers of Hanoi (on which they were trained) and failing 4-move river crossings. Logically, those problems are very similar… Also, failing to apply a step-by-step solution they were given.
auraithx@lemmy.dbzer0.com 5 weeks ago
This paper doesn’t prove that LLMs aren’t good at pattern recognition, it demonstrates the limits of what pattern recognition alone can achieve, especially for compositional, symbolic reasoning.
technocrit@lemmy.dbzer0.com 5 weeks ago
Computers are awesome at “recognizing patterns” as long as the pattern is a statistical average of some possibly worthless data set.
Mniot@programming.dev 5 weeks ago
I don’t think the article summarizes the research paper well. The researchers gave the AI models simple-but-large (which they confusingly called “complex”) puzzles. Like Towers of Hanoi but with 25 discs.
The solution to these puzzles is nothing but patterns. You can write code that will solve the Tower puzzle for any size n and the whole program is less than a screen.
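For reference, here is that standard recursive solution (a quick Python sketch); it handles any number of discs in a handful of lines:

```python
# Classic recursive Towers of Hanoi: move n-1 discs aside, move the largest,
# then move the n-1 discs on top of it.
def hanoi(n, source="A", target="C", spare="B", moves=None):
    if moves is None:
        moves = []
    if n == 1:
        moves.append((source, target))
    else:
        hanoi(n - 1, source, spare, target, moves)
        moves.append((source, target))
        hanoi(n - 1, spare, target, source, moves)
    return moves

print(len(hanoi(25)))  # 33554431 moves, i.e. 2**25 - 1
```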
The problem the researchers see is that on these long, pattern-based solutions, the models follow a bad path and then just give up long before they hit their limit on tokens. The researchers don’t have an answer for why this is, but they suspect that the reasoning doesn’t scale.
GaMEChld@lemmy.world 5 weeks ago
Most humans don’t reason. They just parrot shit too. The design is very human.
elbarto777@lemmy.world 5 weeks ago
LLMs deal with tokens. Essentially, predicting a series of bytes.
Humans do much, much, much, much, much, much, much more than that.
Zexks@lemmy.world 5 weeks ago
No. They don’t. We just call them proteins.
skisnow@lemmy.ca 5 weeks ago
I hate this analogy. As a throwaway whimsical quip it’d be fine, but it’s specious enough that I keep seeing it used earnestly by people who think that LLMs are in any way sentient or conscious, so it’s lowered my tolerance for it as a topic even if you did intend it flippantly.
GaMEChld@lemmy.world 5 weeks ago
I don’t mean it to extol LLM’s but rather to denigrate humans. How many of us are self imprisoned in echo chambers so we can have our feelings validated to avoid the uncomfortable feeling of thinking critically and perhaps changing viewpoints?
Humans have the ability to actually think, unlike LLM’s. But it’s frightening how far we’ll go to make sure we don’t.
joel_feila@lemmy.world 5 weeks ago
That’s why CEOs love them. When your job is 90% spewing BS, a machine that does that is impressive.
SpaceCowboy@lemmy.ca 5 weeks ago
Yeah, I’ve always said the flaw in Turing’s Imitation Game concept is that if an AI was indistinguishable from a human it wouldn’t prove it’s intelligent. Because humans are dumb as shit. Dumb enough to force one of the smartest people in the world to take a ton of drugs which eventually killed him, simply because he was gay.
crunchy@lemmy.dbzer0.com 5 weeks ago
I’ve heard something along the lines of, “it’s not when computers can pass the Turing Test, it’s when they start failing it on purpose that’s the real problem.”
jnod4@lemmy.ca 5 weeks ago
I think that person had to choose between the drugs or hardcore prison in 1950s England, where being a bit odd was enough to guarantee an incredibly difficult time, as they say in England. I would’ve chosen the drugs as well, hoping they would fix me. Too bad that without testosterone you’re going to be suicidal and depressed. I’d rather choose to keep my hair than be horny all the time.
Zenith@lemm.ee 5 weeks ago
Yeah, we’re so stupid we’ve figured out advanced maths and physics, and built incredible skyscrapers and the LHC. We may as individuals be more or less intelligent, but humans as a whole are incredibly intelligent.
reksas@sopuli.xyz 5 weeks ago
does ANY model reason at all?
4am@lemm.ee 5 weeks ago
No, and to make that work using the current structures we use for creating AI models we’d probably need all the collective computing power on earth at once.
SARGE@startrek.website 5 weeks ago
… So you’re saying there’s a chance?
auraithx@lemmy.dbzer0.com 5 weeks ago
Define reason.
Like humans? Of course not. Models lack intent, awareness, and grounded meaning. They don’t “understand” problems, they generate token sequences.
technocrit@lemmy.dbzer0.com 5 weeks ago
Why would they “prove” something that’s completely obvious?
thinking processes
The abstract of their paper is completely pseudo-scientific from the first sentence.
FreakinSteve@lemmy.world 5 weeks ago
NOOOOOOOOO
SHIIIIIIIIIITT
SHEEERRRLOOOOOOCK
surph_ninja@lemmy.world 5 weeks ago
You assume humans do the opposite? We literally institutionalize humans who do not follow set patterns.
skisnow@lemmy.ca 5 weeks ago
What’s hilarious/sad is the response to this article over on reddit’s “singularity” sub, in which all the top comments are people who’ve obviously never got all the way through a research paper in their lives all trashing Apple and claiming their researchers don’t understand AI or “reasoning”. It’s a weird cult.
technocrit@lemmy.dbzer0.com 5 weeks ago
Peak pseudo-science. The burden of proof is on the grifters to prove reason. There’s absolutely no reason to disprove something that has no evidence anyway. Apple has no idea what “reason” means. It’s pseudo-science against pseudo-science in a fierce battle.
vala@lemmy.world 5 weeks ago
No shit
RampantParanoia2365@lemmy.world 5 weeks ago
Fucking obviously. Until Data’s positronic brain becomes reality, AI is not actual intelligence.
LonstedBrowryBased@lemm.ee 5 weeks ago
Yah, of course they do, they’re computers.
SplashJackson@lemmy.ca 5 weeks ago
Just like me
Auli@lemmy.ca 5 weeks ago
No shit. This isn’t new.
mfed1122@discuss.tchncs.de 5 weeks ago
This sort of thing has been published a lot for a while now, but why is it assumed that this isn’t what human reasoning consists of? Isn’t all our reasoning ultimately a form of pattern memorization? I sure feel like it is. So to me all these studies that prove they’re “just” memorizing patterns don’t prove anything, unless coupled with research on the human brain to prove we do something different.
sp3ctr4l@lemmy.dbzer0.com 5 weeks ago
This has been known for years, this is the default assumption of how these models work.
You would have to prove that some kind of actual reasoning has arisen as some kind of emergent complexity phenomenon… not the other way around.
Corpos have just marketed/gaslit us/themselves so hard that they apparently forgot this.
flandish@lemmy.world 5 weeks ago
stochastic parrots. all of them. just upgraded “soundex” models.
this should be no surprise, of course!
ZILtoid1991@lemmy.world 5 weeks ago
Thank you Captain Obvious! Only those who think LLMs are like “little people in the computer” didn’t know this already.
SattaRIP@lemmy.blahaj.zone 5 weeks ago
Why tf are you spamming rape stories?
communist@lemmy.frozeninferno.xyz 5 weeks ago
I think it’s important to note (I’m not an LLM, I know that phrase triggers you to assume I am) that they haven’t proven this as an inherent architectural issue, which I think would be the next step to the assertion.
Do we know that they don’t reason and are incapable of reasoning, or do we just know that for these problems they jump to memorized solutions? Is it possible to create an arrangement of weights that can genuinely reason, even if the current models don’t? That’s the big question that needs answering. It’s still possible that we just haven’t properly incentivized reason over memorization during training.
atlien51@lemm.ee 5 weeks ago
Employers who are foaming at the mouth at the thought of replacing their workers with cheap AI:
🫢
intensely_human@lemm.ee 5 weeks ago
Fair, but the same is true of me. I don’t actually “reason”; I just have a set of algorithms memorized by which I propose a pattern that seems like it might match the situation, then a different pattern by which I break the situation down into smaller components and then apply patterns to those components. I keep the process up for a while. If I find a “nasty logic error” pattern match at some point in the process, I “know” I’ve found a “flaw in the argument” or “bug in the design”.
But there’s no from-first-principles method by which I developed all these patterns; it’s just things that have survived the test of time when other patterns have failed me.
I don’t think people are underestimating the power of LLMs to think; I just think people are overestimating the power of humans to do anything other than language prediction and sensory pattern prediction.
Nanook@lemm.ee 5 weeks ago
lol is this news? I mean, we call it AI, but it’s just LLMs and variants; it doesn’t think.
MNByChoice@midwest.social 5 weeks ago
The “Apple” part. CEOs only care what companies say.
kadup@lemmy.world 5 weeks ago
Apple is significantly behind and arrived late to the whole AI hype, so of course it’s in their absolute best interest to keep showing how LLMs aren’t special or amazingly revolutionary.
They’re not wrong, but the motivation is also pretty clear.
JohnEdwa@sopuli.xyz 5 weeks ago
"It’s part of the history of the field of artificial intelligence that every time somebody figured out how to make a computer do something—play good checkers, solve simple but relatively informal problems—there was a chorus of critics to say, ‘that’s not thinking’." -Pamela McCorduck´. It’s called the AI Effect.
kadup@lemmy.world 5 weeks ago
That entire paragraph is much better at supporting the precise opposite argument. Computers can beat Kasparov at chess, but they’re clearly not thinking when making a move - even if we use the most open biological definitions for thinking.
technocrit@lemmy.dbzer0.com 5 weeks ago
There’s nothing more pseudo-scientific than “intelligence” maximization. I’m going to write a program to play tic-tac-toe. If y’all don’t think it’s “AI”, then you’re just haters. Nothing will ever be good enough for y’all. You want scientific evidence of intelligence?!?! I can’t even define intelligence so there! \s
vala@lemmy.world 5 weeks ago
Yesterday I asked an LLM “how much energy is stored in a grand piano?” It responded by saying there is no energy stored in a grand piano because it doesn’t have a battery.
Any reasoning human would have understood that question to be referring to the tension in the strings.
Another example is asking “does lime cause kidney stones?”. It didn’t assume I meant lime the mineral and went with lime the citrus fruit instead.
Once again a reasoning human would assume the question is about the mineral.
Ask these questions again in a slightly different way and you might get a correct answer, but it won’t be because the LLM was thinking.
Clent@lemmy.dbzer0.com 5 weeks ago
Proving it matters. Science is constantly having to prove things that people believe are obvious, because people have an uncanny ability to believe things that are false. Some people will believe things long after science has proven them false.
Eatspancakes84@lemmy.world 5 weeks ago
I mean… “proving” is also just marketing speak. There is no clear definition of reasoning, so there’s also no way to prove or disprove that something/someone reasons.
Melvin_Ferd@lemmy.world 5 weeks ago
This is why I say these articles are so similar to how right-wing media covers issues about immigrants.
There’s some weird media push to convince the left to hate AI. Think of all the headlines for these issues. There are so many similarities. They’re taking jobs. They are a threat to our way of life. The headlines talk about how they will sexually assault your wife, your children, you. Threats to the environment. There are articles like this where they take something known and twist it to make it sound nefarious, to keep the story alive and avoid decay of interest.
Then when they pass laws, we’re all primed to accept them removing whatever it is that advantages them and disadvantages us.
hansolo@lemmy.today 5 weeks ago
Because it’s a fear-mongering angle that still sells. AI has been a vehicle for scifi for so long that trying to convince Boomers that it won’t kill us all is the hard part.
I’m a moderate user of it for code and a skeptic of LLM abilities, but 5 years from now, when we are leveraging ML models for groundbreaking science and haven’t been nuked by SkyNet, all of this will look quaint and silly.
technocrit@lemmy.dbzer0.com 5 weeks ago
You mean laws like this? jfc.
www.inc.com/sam-blum/…/91198975