Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all.

vala@lemmy.world 8 hours ago

Yesterday I asked an LLM “how much energy is stored in a grand piano?” It responded by saying there is no energy stored in a grand piano because it doesn’t have a battery.

Any reasoning human would have understood that question to be referring to the tension in the strings.

Another example is asking “does lime cause kidney stones?”. It didn’t assume I meant lime the mineral and went with lime the citrus fruit instead.

Once again, a reasoning human would assume the question is about the mineral.

Ask these questions again in a slightly different way and you might get a correct answer, but it won’t be because the LLM was thinking.
