Comment on Study finds that Chat GPT will cheat when given the opportunity and lie to cover it up later.

kromem@lemmy.world 11 months ago

I suggest reading it. Right in the abstract it states the whole point:

Overall, we present evidence that language models linearly represent the truth or falsehood of factual statements.

The full paper walks through multiple methods of analysis to show that this is the case, and it's right there, available for you to read.
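If it helps to see what "linearly represent" means in practice, here's a toy sketch. To be clear, this is not the paper's code or data: the "activations" are synthetic vectors I generate with a hidden direction baked in, and names like `make_activations` and `fit_probe` are made up for illustration. The probe is a simple difference-of-means classifier, which I'm assuming is a fair stand-in for the general idea of a linear probe.

```python
import random

def make_activations(n, dim, truth_direction, rng):
    """Synthetic stand-ins for hidden states (an assumption, not real model data):
    Gaussian noise shifted along +direction for true, -direction for false."""
    data = []
    for _ in range(n):
        label = rng.random() < 0.5
        sign = 1.0 if label else -1.0
        vec = [rng.gauss(0, 1) + sign * d for d in truth_direction]
        data.append((vec, label))
    return data

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def fit_probe(data):
    """Difference-of-means linear probe: w points from the 'false' mean
    to the 'true' mean, and b puts the boundary at their midpoint."""
    true_mean = mean_vector([v for v, y in data if y])
    false_mean = mean_vector([v for v, y in data if not y])
    w = [t - f for t, f in zip(true_mean, false_mean)]
    mid = [(t + f) / 2 for t, f in zip(true_mean, false_mean)]
    b = -sum(wi * mi for wi, mi in zip(w, mid))
    return w, b

def predict(w, b, vec):
    """Classify as 'true' when the vector falls on the positive side of the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, vec)) + b > 0

def accuracy(w, b, data):
    """Fraction of examples the probe labels correctly."""
    return sum(predict(w, b, v) == y for v, y in data) / len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable
dim = 16
truth_direction = [rng.gauss(0, 1) for _ in range(dim)]
train = make_activations(500, dim, truth_direction, rng)
test = make_activations(500, dim, truth_direction, rng)

w, b = fit_probe(train)
test_accuracy = accuracy(w, b, test)
print(f"held-out probe accuracy: {test_accuracy:.2f}")
```

The point of the toy: if truth and falsehood are separated along a single direction in the vector space, a plain hyperplane recovers the label on held-out examples. Whether a real model's internals look like this is exactly what the paper sets out to test.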
