Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all.

auraithx@lemmy.dbzer0.com 11 hours ago

Because transformer architecture is not equivalent to a probabilistic lookup. A Markov chain assigns next-token probabilities from a fixed-order transition table keyed only on the last k tokens, with no regard for deeper structure or relationships between tokens. An LLM processes the full context through many layers of non-linear functions and attention heads, each layer dynamically weighting how each token influences every other token.
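
To make the contrast concrete, here is a rough sketch in NumPy (toy sizes, random weights, nothing from any real model): the Markov chain is literally a table lookup on a fixed-length key, while a single attention head re-weights every token by its affinity to every other token in the context.

```python
import numpy as np

# A fixed-order Markov chain is just a lookup keyed on the last k tokens;
# the "state" carries no structure beyond the key itself. (Toy table.)
markov_table = {("the", "cat"): {"sat": 0.7, "ran": 0.3}}
next_dist = markov_table[("the", "cat")]

# Single-head scaled dot-product attention: every token's representation
# is re-weighted by its affinity to every other token in the context.
def attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # all-pairs token affinities
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)        # softmax over the context
    return w @ V                              # context-dependent mixing

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                   # 5 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(attention(X, Wq, Wk, Wv).shape)         # (5, 8)
```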

Although the weights do not change during inference, the model's behavior is not fixed the way a Markov chain's transition table is. The same model can respond differently to very similar prompts, not just because the inputs differ, but because the model interprets structure, syntax, and intent in ways that depend on the whole context. That is not merely a longer context; it is fundamentally more expressive computation.

The process is stateless across calls, yes, but it is not blind. All relevant information lives inside the prompt, and the model uses the attention mechanism to extract meaning from relationships across the sequence. Each new input changes the internal representation, so the output reflects contextual reasoning, not a static response to a matching pattern. Markov chains cannot replicate this kind of behavior no matter how many states they include.
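
A minimal sketch of that statelessness, again with made-up toy embeddings and a single frozen weight matrix: every call rebuilds the internal representation from the full prompt, so swapping a single token shifts the attention pattern at every position.

```python
import numpy as np

# Frozen weights: identical on every call, like weights at inference time.
W = np.random.default_rng(0).normal(size=(8, 8))

def embed(tokens):
    # Toy deterministic per-token embedding so the sketch is self-contained.
    return np.stack([
        np.random.default_rng(int.from_bytes(t.encode(), "big") % 2**32).normal(size=8)
        for t in tokens
    ])

def call(prompt_tokens):
    # Stateless: nothing persists between calls; the model only ever sees
    # what is inside this prompt.
    X = embed(prompt_tokens)
    scores = (X @ W) @ X.T / np.sqrt(8)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w  # attention pattern over this prompt only

a = call(["the", "cat", "sat"])
b = call(["the", "dog", "sat"])
# Swapping one token shifts the weights at every position, because each
# token attends to all the others; the representation is rebuilt per call.
print(np.abs(a - b).max() > 0)  # True
```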
