Comment on I'm looking for an article showing that LLMs don't know how they work internally

ohwhatfollyisman@lemmy.world 11 months ago

but there’s been significant research and progress in tracing the internals of LLMs that shows logic, planning, and reasoning.

would there be a source for such research?

theunknownmuncher@lemmy.world 11 months ago
anthropic.com/…/tracing-thoughts-language-model for one, the exact article OP was asking for

- ohwhatfollyisman@lemmy.world 11 months ago
  but doesn’t this article claim that LLMs do the opposite of logic, planning, and reasoning?
  
  quoting:
  
  Claude, on occasion, will give a plausible-sounding argument designed to agree with the user rather than to follow logical steps. We show this by asking it for help on a hard math problem while giving it an incorrect hint. We are able to “catch it in the act” as it makes up its fake reasoning,
  
  are there any sources which show that LLMs use logic, plan, and reason (as was asserted in the 2nd level comment)?
  
  - theunknownmuncher@lemmy.world 11 months ago
    No, you’re misunderstanding the findings. The article does show that LLMs can’t accurately explain their own reasoning when asked, which makes sense and is expected: they have no access to their inner workings, so they generate a response that “sounds” right, while tracing their internal computation shows they operate differently from what they claim. You can’t ask an LLM to explain its own reasoning. But the article also shows the progress they’ve made in tracing under the hood, and the surprising results they found about how it is able to do things like plan ahead, which defeats the misconception that it is just “autocomplete”.