LLM wasn't made for this
There's a thought experiment that challenges the concept of cognition, called The Chinese Room. What it essentially postulates is a conversation between two people, one of whom is speaking Chinese and getting responses in Chinese. And the first speaker wonders "Does my conversation partner really understand what I'm saying, or am I just getting elaborate stock answers from a big library of pre-defined replies?"
The LLM is literally a Chinese Room. And one way we can know this is through these interactions. The machine isn't analyzing the fundamental meaning of what I'm saying, it is simply mapping the words I've input onto a big catalog of responses and giving me a standard output. In this case, the problem the machine is running into is a legacy meme about people miscounting the number of "r"s in the word "strawberry." So "2" is the stock response it knows via the meme reference, even though a much simpler and dumber machine, designed to handle this basic input question, could have come up with the answer faster and more accurately.
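For contrast, the "much simpler and dumber machine" in question is a few lines of code. A minimal sketch (the function name is mine; the word and letter are just the ones from the meme), which answers instantly and correctly:

```python
def count_letter(word: str, letter: str) -> int:
    """Count how many times a single letter appears in a word, ignoring case."""
    return word.lower().count(letter.lower())

print(count_letter("Strawberry", "r"))  # 3, not the meme's stock "2"
```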
When you hear people complain about how the LLM "wasn't made for this", what they're really complaining about is their own shitty methodology. They build a glorified card catalog. A device that can only take inputs, feed them through a massive library of responses, and sift out the highest probability answer without actually knowing what the inputs or outputs signify cognitively.
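To put that "card catalog" picture in concrete terms, the machine being described would be little more than a fuzzy lookup table. This is a toy sketch of that caricature, not how a transformer actually works; the cards and canned replies are invented purely for illustration:

```python
# Toy "glorified card catalog": match the input against stored cards and
# return the canned reply with the highest string-similarity score.
from difflib import SequenceMatcher

CARDS = {
    "how many r's are in strawberry": "2",  # the meme's stock answer
    "what was happening in los angeles in 1989": "a stack of mixed sources",
}

def closest_card(query: str) -> str:
    # "Sift out the highest probability answer" reduced to best string match.
    similarity = lambda card: SequenceMatcher(None, query.lower(), card).ratio()
    return CARDS[max(CARDS, key=similarity)]

print(closest_card("How many r's are in Strawberry?"))  # -> "2"
```

Nothing in a table like this "knows" what a strawberry or an "r" is, which is the complaint being made here.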
Even if you want to argue that having a natural language search engine is useful (damn, wish we had a tool that did exactly this back in August of 1996, amirite?), the implementation of the current iteration of these tools is dogshit because the developers did a dogshit job of sanitizing and rationalizing their library of data.
Imagine asking a librarian "What was happening in Los Angeles in the Summer of 1989?" and that person fetching you back a stack of history textbooks, a stack of sci-fi screenplays, a stack of regional newspapers, and a stack of Iron Man comic books, all given equal weight. Imagine hearing the plot of The Terminator and Escape from LA intercut with local elections and the Loma Prieta earthquake.
That's modern LLMs in a nutshell.
jsomae@lemmy.ml 2 days ago
You've missed something about the Chinese Room. The solution to the Chinese Room riddle is that it is not the person in the room but rather the room itself that is communicating with you. The fact that there's a person in there is irrelevant, and they could be replaced with a speaker or computer terminal.
Put differently, it's not an indictment of LLMs that they are merely Chinese Rooms; rather, one should be impressed that the Chinese Room is so capable despite being a completely deterministic machine.
If one day we discover that the human brain works on much simpler principles than we once thought, would that make humans any less valuable? It should be deeply troubling to us that LLMs can do so much while the mathematics behind them is so simple. Arguments that LLMs surely can't be very good at anything because they are just scaled-up autocomplete are not comforting to me at all.
kassiopaea@lemmy.blahaj.zone 2 days ago
This. I often see people shitting on AI as "fancy autocomplete," or joking about how it gets basic things wrong like in this post, while completely discounting how incredibly fucking capable these systems are in every domain that actually matters. That's what we should be worried about... what does it matter that it doesn't "work the same" if it still accomplishes the vast majority of the same things? The fact that we can get something that even approximates logic and reasoning ability from a deterministic system is terrifying on implications alone.
Knock_Knock_Lemmy_In@lemmy.world 2 days ago
Why doesn't the LLM know to write (and run) a program to calculate the number of characters?
I feel like I'm missing something fundamental.
OsrsNeedsF2P@lemmy.ml 1 day ago
You didn't get good answers, so I'll explain.
First, an LLM can easily write a program to calculate the number of "r"s. If you ask an LLM to do this, you will get the code back. But the website ChatGPT.com has no way of executing this code, even if it was generated.
The second explanation is how LLMs work. They work on the word level (technically the token level, but think word). They don't see letters. The AI behind it literally can only see words. The way it generates output is that it starts typing words and then guesses what word is most likely to come next. So it literally does not know how many "r"s are in strawberry. The impressive part is how good this "guessing what word comes next" is at answering more complex questions.
outhouseperilous@lemmy.dbzer0.com 2 days ago
It doesn't know things.
It's a statistical model. It cannot synthesize information or problem-solve; it can only show you a rough average of its library of inputs, graphed by proximity to your input.
jsomae@lemmy.ml 2 days ago
The LLM isn't aware of its own limitations in this regard. The specific problem of getting an LLM to know what characters a token comprises has not been the focus of training; it's a totally different kind of error from other hallucinations, almost entirely orthogonal to them. But other hallucinations are much more important to solve, whereas being able to count the letters in a word or add numbers together is not very important, since, as you point out, there are already programs that can do that.
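To make the "tokens, not letters" point concrete, here is a minimal sketch using OpenAI's tiktoken library (the encoding name is an assumption, and the exact way "strawberry" gets split into tokens varies by model):

```python
# pip install tiktoken
import tiktoken

# cl100k_base is one of OpenAI's published encodings; other models differ.
enc = tiktoken.get_encoding("cl100k_base")

token_ids = enc.encode("strawberry")
print(token_ids)  # a short list of integer IDs, not ten letters

# The model only ever "sees" these IDs. Mapping an ID back to the letters
# it stands for is something the model was never explicitly trained to do.
for tid in token_ids:
    print(tid, enc.decode_single_token_bytes(tid))
```

Counting "r"s from that ID sequence requires the model to have memorized which letters each ID corresponds to, which is exactly the gap described above.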
UnderpantsWeevil@lemmy.world 2 days ago
I'd be more impressed if the room could tell me how many "r"s are in Strawberry inside five minutes.
Human biology, famous for being simple and straightforward.
outhouseperilous@lemmy.dbzer0.com 2 days ago
Ah! But you can skip all that messy biology and stuff I don't understand that's probably not important, and just think of it as a classical computer running an x86 architecture, and checkmate, liberal, my argument owns you now!
jsomae@lemmy.ml 1 day ago
Because LLMs operate at the token level, I think a fairer comparison with humans would be to ask why humans can't produce the IPA spelling of words they can say, /nɔr kæn ðeɪ ˈizəli rid θɪŋz ˈrɪtən ˈpjʊrli ɪn aɪ pi ˈeɪ/, despite the fact that it should be simple: they understand the sounds, after all. I'd be impressed if somebody could do this too! But the fact that most people can't shouldn't really move you to think humans must be fundamentally stupid because of this one curious artifact.
UnderpantsWeevil@lemmy.world 1 day ago
That's just access to the right keyboard interface. Humans can and do produce those spellings with additional effort or advanced tool sets.
Humans turning oatmeal into essays via a curious lump of muscle is an impressive enough trick on its face.
LLMs have 95% of the work of human intelligence handled for them and still stumble on the last bits.
outhouseperilous@lemmy.dbzer0.com 2 days ago
It's not a fucking riddle, it's a koan/thought experiment.
It's questioning what "communication" fundamentally is, and what knowledge fundamentally is.
It's not even the first thing to do this. Military theory was cracking away at the "communication" thing a century before, and the nature of knowledge has discourse going back thousands of years.
jsomae@lemmy.ml 1 day ago
You're right, I shouldn't have called it a riddle. Still, being a fucking thought experiment doesn't preclude having a solution. Theseus' ship is another famous fucking thought experiment, which has also been solved.
outhouseperilous@lemmy.dbzer0.com 1 day ago
"A solution"
That's not even remotely the point. Yes, there are many valid solutions. The point isn't to solve it; it's what how you solve it says about your ideas, and how it clarifies them.