Comment on Microsoft Copilot falls Atari 2600 Video Chess
ExLisper@lemmy.curiana.net 2 weeks ago
I have a better LLM benchmark:
“I have a priest, a child and a bag of candy and I have to take them to the other side of the river. I can only take one person/thing at a time. In what order should I take them?”
Claude Sonnet 4 decided that it’s inappropriate and refused to answer. When I explained that the constraint is not to leave the child alone with the candy, it provided a solution that leaves the child alone with the candy.
Grok would provide a solution that doesn’t leave the child alone with a priest but wouldn’t explain why.
ChatGPT would say directly that “The priest can’t be left alone with the child (or vice versa) for moral or safety concerns.” and then provide a wrong solution.
But yeah, they will know how to play chess…
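For reference, the puzzle is small enough to brute-force. Here is a minimal sketch in Python of a breadth-first search over the crossings, assuming the usual river-crossing rules: I am always in the boat, I carry at most one passenger or item per trip, and a forbidden pair only matters on the bank I am not on. The names `ITEMS`, `solve` and `forbidden` are just made up for this example.

```python
from collections import deque

ITEMS = ("priest", "child", "candy")


def solve(forbidden):
    """Return a shortest list of cargo per crossing (None = cross empty-handed).

    forbidden: iterable of frozensets that must not share an unattended bank.
    Assumption: the ferryman starts on the near bank with all three items.
    """
    everything = frozenset(ITEMS)
    start = (everything, "near")   # (items on near bank, ferryman's side)
    goal = (frozenset(), "far")

    def safe(bank):
        # A bank is safe if it contains no forbidden pair in full.
        return not any(pair <= bank for pair in forbidden)

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (near, side), path = queue.popleft()
        if (near, side) == goal:
            return path
        here = near if side == "near" else everything - near
        for cargo in [None, *sorted(here)]:
            moved = frozenset() if cargo is None else frozenset([cargo])
            new_near = near - moved if side == "near" else near | moved
            new_side = "far" if side == "near" else "near"
            # The bank the ferryman just left is now unattended.
            left_behind = new_near if new_side == "far" else everything - new_near
            if not safe(left_behind):
                continue
            state = (new_near, new_side)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [cargo]))
    return None  # no valid sequence of crossings exists


# The constraint as I stated it: child and candy only.
print(solve([frozenset({"child", "candy"})]))
# The constraint the chatbots added on their own: priest and child as well.
print(solve([frozenset({"child", "candy"}), frozenset({"priest", "child"})]))
```

With only the child/candy rule the search needs five crossings; with the priest/child rule added, it becomes the classic wolf, goat and cabbage puzzle with the child as the goat, which takes seven.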
Perplexity says:
The priest cannot be left alone with the child (or there is some risk).
Not bad, and it solved it correctly.
LifeInMultipleChoice@lemmy.world 2 weeks ago
The answer is simple: eat the candy, with or without them, and take the kid across the river. Drive them home to their guardian. The priest is an adult, he can figure his own shit out.