Comment on Car Wash Test on 53 leading AI models: "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"

<- View Parent
zalgotext@sh.itjust.works ⁨1⁩ ⁨day⁩ ago

It needed the rules, and it needed carefully worded questions that matched the parameters set by the rules. I bet if the questions’ wording didn’t match your rules so exactly, it would generate worse answers. Heck, I bet if you gave it the rules, then asked several completely unrelated questions, then asked it your carefully worded rules-based questions, it would perform worse, because it’s context window would be muddied. Because that’s what it’s generating responses based on - the contents of it’s context window, coupled with stats-based word generation.

I still maintain that it shouldn’t need the rules if it’s truly reasoning though. LLMs train on a massive set of data, surely the information required to reason out the answers to your container questions is in there. Surely if it can reason, it should be able to generate answers to simple logical puzzles without someone putting most of the pieces together for them first.

source
Sort:hotnewtop