Comment on "How in the hell is this possible?"
FaceDeer@kbin.social 11 months ago
LLMs have some difficulty with reasoning, especially low-parameter models like this one. This is pretty typical of the current state of the art. Bigger LLMs do a much better job.
FinallyDebunked@slrpnk.net 11 months ago
Yes, but I’m sure any other model, 7B or even smaller, wouldn’t give “three” right after having written “two-headed”, just because of the way probability works.
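For what it’s worth, you can check that intuition directly by inspecting the probabilities a model assigns to candidate next tokens. Here’s a minimal sketch using the Hugging Face transformers library; the model (gpt2) and the exact prompt are stand-in assumptions for illustration, not the model under discussion:

```python
# Sketch: inspect the probability a causal LM assigns to candidate next tokens.
# "gpt2" and the prompt below are illustrative stand-ins only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "A two-headed unicorn has a total of"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# Distribution over the token that would come right after the prompt.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

for word in [" two", " three", " four"]:
    token_id = tokenizer.encode(word)[0]
    print(f"P({word!r}) = {next_token_probs[token_id].item():.4f}")
```

If the conditioning on “two-headed” works the way I’d expect, “two” should dominate “three” here.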
FaceDeer@kbin.social 11 months ago
I just fired up Llama2-70B, the biggest model I happen to have handy on my local machine, and repeated your exact prompt to it four times. The answers it gave were:
One correct guess out of four attempts, then. Not a great showing. So I tried a more prompt-engineering-style approach and asked it:
And gave it another four attempts. Its responses were:
Response 1:
Response 2:
Response 3:
Response 4:
So that was kind of interesting. It didn't get any more accurate - still just one "success" out of four - but by rambling on about its reasoning I think we can see how this particular model keeps ending up at four: it correctly realizes it needs to double the single horn for the two heads (1 × 2 = 2), but then it mistakenly doubles again (2 × 2 = 4). Perhaps mixtral-8x7b is going down a similar erroneous route.
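To spell that slip out in toy form (the variable names here are mine, purely illustrative):

```python
# Toy restatement of the two reasoning chains described above.
horns_per_head = 1  # a unicorn has one horn per head
heads = 2           # the creature in the prompt is two-headed

correct = horns_per_head * heads  # 1 * 2 = 2
doubled_twice = correct * 2       # doubling a second time gives 4

print(correct, doubled_twice)  # -> 2 4
```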