Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all.

brsrklf@jlai.lu ⁨2⁩ ⁨weeks⁩ ago

You know, despite not really believing LLM “intelligence” works anywhere like real intelligence, I kind of thought being good at recognizing patterns might be a way to emulate it, up to a point…

But that study seems to show they’re still not even good at that. At first I was wondering how hard the puzzles must have been, and then there’s a bit about LLMs finishing 100-move Tower of Hanoi puzzles (which they were trained on) while failing 4-move river crossings. Logically, those problems are very similar… They also failed to apply a step-by-step solution they were given.
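For context on that comparison: the Tower of Hanoi side is completely mechanical, so long solves don’t imply much reasoning. Here’s a minimal Python sketch of the standard recursive solution (my own illustration, not something from the study):

```python
# Classic recursive Tower of Hanoi: transfer n disks from src to dst
# using aux as the spare peg. The move sequence is fully determined.
def hanoi(n, src="A", aux="B", dst="C", moves=None):
    """Return the list of (from_peg, to_peg) moves for n disks."""
    if moves is None:
        moves = []
    if n == 0:
        return moves
    hanoi(n - 1, src, dst, aux, moves)   # park the top n-1 disks on the spare peg
    moves.append((src, dst))             # move the largest disk to the target
    hanoi(n - 1, aux, src, dst, moves)   # stack the n-1 disks back on top
    return moves

# The optimal solution always takes 2**n - 1 moves, so a 100+ move
# solve corresponds to only about 7 disks (2**7 - 1 = 127 moves).
print(len(hanoi(7)))  # 127
```

So a model reproducing a 100-move Hanoi sequence is following one fixed recursive pattern, while a 4-move river crossing requires actually tracking constraints, which is the gap the comment is pointing at.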
