Do we know that they don’t, and that they are incapable of reasoning?
“even when we provide the algorithm in the prompt—so that the model only needs to execute the prescribed steps—performance does not improve”
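For context, the setup that quote refers to looks roughly like this (a minimal sketch in Python; Tower of Hanoi is one of the puzzles the paper uses, but the prompt wording below is illustrative, not quoted from the paper):

```python
# Sketch of the "algorithm in the prompt" setup the paper describes:
# the model is handed the solution procedure and only has to execute it.

def hanoi_moves(n, source="A", target="C", spare="B"):
    """Classical recursive solver: returns the full move list for n disks."""
    if n == 0:
        return []
    return (hanoi_moves(n - 1, source, spare, target)
            + [(source, target)]
            + hanoi_moves(n - 1, spare, target, source))

# The same algorithm, spelled out as step-by-step instructions for the model.
PROMPT = """Solve Tower of Hanoi with {n} disks using this algorithm:
1. To move k disks from Source to Target via Spare:
   a. Move k-1 disks from Source to Spare via Target.
   b. Move the largest disk from Source to Target.
   c. Move k-1 disks from Spare to Target via Source.
List every move as (source peg, target peg)."""

if __name__ == "__main__":
    n = 5
    print(PROMPT.format(n=n))
    print(f"Ground truth: {len(hanoi_moves(n))} moves")  # 2^n - 1 = 31
```

The paper's claim is that even with the procedure spelled out like this, accuracy still collapses as n grows.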
communist@lemmy.frozeninferno.xyz 2 weeks ago
That indicates that it does not follow instructions, not that it is fundamentally, architecturally incapable of doing so.
Knock_Knock_Lemmy_In@lemmy.world 2 weeks ago
Not “this particular model”. Frontier LRMs such as OpenAI’s o1/o3, DeepSeek-R1, Claude 3.7 Sonnet Thinking, and Gemini Thinking.
The paper shows that Large Reasoning Models as defined today cannot interpret instructions. Their architecture does not allow it.
communist@lemmy.frozeninferno.xyz 2 weeks ago
Those particular models.
Knock_Knock_Lemmy_In@lemmy.world 2 weeks ago
The architecture of these LRMs may make monkeys fly out of my butt. It hasn’t been proven that the architecture doesn’t allow it.
You are asking to prove a negative. The onus is to show that the architecture can reason, not to prove that it can’t.
0ops@lemm.ee 2 weeks ago
Is “model” not defined as architecture + weights? Those models certainly don’t share the same architecture. I might just be confused about your point, though.