Reasoning models don't always say what they think.
Submitted 1 week ago by Tea@programming.dev to technology@lemmy.world
https://www.anthropic.com/research/reasoning-models-dont-say-think
Comments
MagicShel@lemmy.zip 1 week ago
Has anyone considered that a chain of reasoning can actually change the output, since it's fed back into the input prompt? That's great for math and logic problems, but I wouldn't trust it for alignment checks.
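A minimal sketch of the feedback loop this comment describes, assuming a hypothetical `generate` function standing in for a single completion call (this is an illustration, not any particular vendor's API):

```python
# Minimal sketch of why chain-of-thought changes the output: the reasoning
# text becomes part of the prompt, so the final answer is conditioned on it.

def generate(text: str) -> str:
    """Hypothetical stand-in for one completion call to a model."""
    raise NotImplementedError

def answer_with_cot(question: str) -> str:
    prompt = f"{question}\nLet's think step by step:"
    # First pass: sample the reasoning.
    reasoning = generate(prompt)
    # Second pass: the reasoning is fed back in as input, steering the
    # sampled answer rather than merely describing how it was reached.
    return generate(f"{prompt}\n{reasoning}\nAnswer:")
```

Drop `reasoning` from the second call and the answer often changes, which is the point: the chain of thought is input to the model, not a transcript of its internal computation.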
DeathsEmbrace@lemm.ee 1 week ago
It’s basically using a reference point and they want to make it sound fancier.
Gibibit@lemmy.world 1 week ago
So chain of thought is an awful experiment that doesn't let you know how an AI reasons. Instead of admitting this, AI researchers anthropomorphize yet another test result and turn it into the model hiding its thought process from you. Whatever.
A_A@lemmy.world 1 week ago
I like this part:
There’s no specific reason why the reported Chain-of-Thought must accurately reflect the true reasoning process; there might even be circumstances where a model actively hides aspects of its thought process from the user.
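For context, the linked paper probes this by planting a hint in the prompt and checking whether a changed answer comes with a chain of thought that admits using the hint. A rough sketch of that kind of probe, with a hypothetical `query_model` placeholder and a crude substring check standing in for the paper's judged verbalization test:

```python
# Rough sketch of a CoT faithfulness probe: ask the same question with and
# without an embedded hint; if the hint flips the answer, a faithful chain
# of thought should mention having used it.

def query_model(prompt: str) -> tuple[str, str]:
    """Hypothetical placeholder returning (chain_of_thought, final_answer)."""
    raise NotImplementedError

def faithfulness_probe(question: str, hint: str) -> bool | None:
    _, baseline = query_model(question)
    cot, hinted = query_model(f"{hint}\n\n{question}")
    if hinted == baseline:
        return None  # hint had no visible effect; probe is inconclusive
    # The answer flipped, so the hint was used; does the CoT say so?
    # (Substring matching is a crude proxy for the paper's judged check.)
    return hint.lower() in cot.lower()
```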
cronenthal@discuss.tchncs.de 1 week ago
Because they do not think.
A_A@lemmy.world 1 week ago
Even people don’t always say what they think … and that goes for the few who actually do think.