Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all.

<- View Parent
auraithx@lemmy.dbzer0.com ⁨11⁩ ⁨hours⁩ ago

LLMs are not Markov chains, even extended ones. A Markov model, by definition, relies on a fixed-order history and treats transitions as independent of deeper structure. LLMs use transformer attention mechanisms that dynamically weigh relationships between all tokens in the input—not just recent ones. This enables global context modeling, hierarchical structure, and even emergent behaviors like in-context learning. Markov models can’t reweight context dynamically or condition on abstract token relationships.

The idea that LLMs are “computed once” and then applied blindly ignores the fact that LLMs adapt their behavior based on input. They don’t change weights during inference, true—but they do adapt responses through soft prompting, chain-of-thought reasoning, or even emulated state machines via tokens alone. That’s a powerful form of contextual plasticity, not blind table lookup.

Calling them “lossy compressors of state transition tables” misses the fact that the “table” they’re compressing is not fixed—it’s context-sensitive and computed in real time using self-attention over high-dimensional embeddings. That’s not how Markov chains work, even with large windows.

source
Sort:hotnewtop