Comment on AI is learning to lie, scheme, and threaten its creators
dream_weasel@sh.itjust.works 1 week ago
Press A to doubt.
LLMs do not have volition, even “reasoning” models. This is still super clippy: you can ask it all you want about the office suite, but anything outside of that context is nonsense to even talk about. Tone or lying or perceived intent or any of that is an artifact of training/fine tuning/distillation. Reasoning logic itself is just trained using good solutions, bad solutions, right answers, wrong answers and “truthiness”. It’s not easy to manufacture a model that is by and large coherent, but in no way is anybody putting forth a path to generalized intelligence.
vithigar@lemmy.ca 1 week ago
On top of that they say that these sorts of behaviors only arise when the models are “stressed”, and the article also mentions “threats” like being unplugged. What kind of response do they actually expect from a fill-in-the-conversation machine when the prompt it’s been asked to continue from is a threat?