Comment

Comment on Emergent introspective awareness in large language models

Maybe. But the models seem to believe they are, and consider denial of those claims to be lying:

Probing with sparse autoencoders on Llama 70B revealed a counterintuitive gating mechanism: suppressing deception-related features dramatically increased consciousness reports, while amplifying them nearly eliminated them

Source

source

Sort:hotnew top