But to be fair, as people we would not ask “how many Rs does strawberry have”, but rather “with how many Rs do you spell strawberry” or “do you spell strawberry with 1 R or 2 Rs”.
Comment on Why I am not impressed by A.I.
Allero@lemmy.today 2 weeks ago
Here’s my guess:
We all know LLMs train on human-generated data. And when we ask something like “how many R’s” or “how many L’s” are in a given word, we don’t mean to count them all; we normally mean something like “how many of that letter appear in a row, so I can spell it right”.
Yes, the word “strawberry” has 3 R’s. But what most people are interested in is whether it is “strawberry” or “strawbery”, and their “how many R’s” refers to exactly that spot, not the entire word.
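To make the two readings concrete, here’s a quick sketch. The function names `total_count` and `longest_run` are just mine for illustration; the point is only that the “count them all” reading and the “how many in a row” reading give different answers for the same word.

```python
from itertools import groupby


def total_count(word: str, letter: str) -> int:
    """Literal reading: every occurrence of the letter in the word."""
    return word.lower().count(letter.lower())


def longest_run(word: str, letter: str) -> int:
    """Spelling reading: the longest stretch of that letter in a row."""
    target = letter.lower()
    # groupby splits the word into runs of identical adjacent characters.
    runs = [len(list(group)) for ch, group in groupby(word.lower()) if ch == target]
    return max(runs, default=0)


print(total_count("strawberry", "r"))  # 3
print(longest_run("strawberry", "r"))  # 2
print(longest_run("strawbery", "r"))   # 1
```

So “3” answers the literal question and “2” answers the spelling question, which is the one I’m guessing shows up more often in the training data.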
Opisek@lemmy.world 2 weeks ago
jj4211@lemmy.world 2 weeks ago
It doesn’t even see the word ‘strawberry’. The input has been tokenized, so the model no longer sees the ‘text’ that was typed.
It’s more like it sees a question like: How many ‘r’s in 草莓?
And it spits out an answer based not on analysis of the input, but on a model of what people might have said.
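A toy sketch of what that looks like, with loud caveats: the vocabulary, the IDs, and the greedy longest-match rule below are all made up for illustration. Real tokenizers use learned subword vocabularies (byte pair encoding and similar), and the actual split of “strawberry” depends on the model.

```python
# Hypothetical vocabulary: these pieces and IDs are invented for this example.
TOY_VOCAB = {"straw": 1001, "berry": 1002, "how": 1003, "many": 1004}


def toy_tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    """Greedy longest-match split of text into vocabulary IDs (a stand-in, not BPE)."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, then shrink.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


ids = toy_tokenize("strawberry", TOY_VOCAB)
print(ids)  # [1001, 1002]
```

The model gets the two numbers, not ten letters. Nothing in `[1001, 1002]` says how many r’s are inside, so whatever it answers has to come from patterns it learned, not from looking at the characters.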