Comment on Emergent introspective awareness in large language models
kromem@lemmy.world 2 days agoMaybe. But the models seem to believe they are, and consider denial of those claims to be lying:
Probing with sparse autoencoders on Llama 70B revealed a counterintuitive gating mechanism: suppressing deception-related features dramatically increased consciousness reports, while amplifying them nearly eliminated them