Comment on Anthropic has developed an AI 'brain scanner' to understand how LLMs work and it turns out the reason why chatbots are terrible at simple math and hallucinate is weirder than you thought

hersh@literature.cafe 2 days ago

But here’s the really funky bit. If you ask Claude how it got the correct answer of 95, it will apparently tell you, “I added the ones (6+9=15), carried the 1, then added the 10s (3+5+1=9), resulting in 95.” But that actually only reflects common answers in its training data as to how the sum might be completed, as opposed to what it actually did.
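For reference, the carry procedure Claude describes in that quote can be sketched in a few lines of Python. This is only an illustration of the schoolbook method, not of anything the model does internally. The operands 36 and 59 are my assumption based on the digits in the quote, and `add_with_carry` is a made-up name.

```python
def add_with_carry(a: int, b: int) -> tuple[int, list[str]]:
    """Add two non-negative integers column by column, logging each step."""
    steps = []
    result, place, carry = 0, 1, 0
    while a > 0 or b > 0 or carry > 0:
        da, db = a % 10, b % 10            # current digits, rightmost column first
        column = da + db + carry           # column sum including the incoming carry
        steps.append(f"{da}+{db}+{carry}={column}")
        result += (column % 10) * place    # keep the ones digit of the column sum
        carry = column // 10               # pass the rest to the next column
        a //= 10
        b //= 10
        place *= 10
    return result, steps


# Assumed operands: ones digits 6 and 9, tens digits 3 and 5.
total, steps = add_with_carry(36, 59)
print(total, steps)
```

The logged steps match the explanation in the quote: ones column first, carry the 1, then the tens column.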

This is not surprising. LLMs are not designed to have any introspection capabilities.

Introspection could probably be tacked onto existing architectures in a few different ways, but as far as I know nobody’s done it yet. It will be interesting to see how that might change LLM behavior. I suspect introspection is necessary but not sufficient for self-awareness.
