Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all.

vrighter@discuss.tchncs.de 19 hours ago

“Lacks internal computation” is not part of the definition of a Markov chain. The only requirement is that the output depends solely on the current state (the whole context window, not just the last token) and on no earlier history, which is exactly how LLMs behave. They do not consider tokens that have slid out of the current context, because those tokens are no longer part of the state.
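For illustration, here’s a minimal Python sketch of that property (the function, vocabulary, and window size are all invented for the example): any deterministic function of the current window has the Markov property, because tokens that slid out of the window can’t influence its output.

```python
import hashlib

CONTEXT_LENGTH = 4            # made-up window size
VOCAB = ("the", "a", "was")   # made-up toy alphabet

def next_token_distribution(tokens):
    """Any deterministic function of the current window has the Markov
    property: tokens that slid out of the window cannot influence it."""
    state = tokens[-CONTEXT_LENGTH:]  # the state is the window, nothing more
    seed = hashlib.sha256(" ".join(state).encode()).digest()
    weights = [b + 1 for b in seed[:len(VOCAB)]]
    total = sum(weights)
    return {tok: w / total for tok, w in zip(VOCAB, weights)}

# Two different histories that end in the same window are the same state,
# so they must yield the same distribution.
h1 = ("once", "upon", "a", "time", "there", "was")
h2 = ("something", "else", "a", "time", "there", "was")
assert next_token_distribution(h1) == next_token_distribution(h2)
```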

And it wouldn’t be a cache unless you decided to start invalidating entries, which you could simply never do. It would be a table with token_alphabet_size^context_length entries, each entry being a vector of size token_alphabet_size.
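To put toy numbers on that (illustrative values only, not any real model’s):

```python
import math

# Illustrative numbers only -- not any real model's.
token_alphabet_size = 50_000   # vocabulary size
context_length = 4_096         # tokens in the window

# One table entry per possible context, each entry a probability
# vector over the whole alphabet.
digits = context_length * math.log10(token_alphabet_size)
print(f"table entries: about 10^{digits:.0f}, "
      f"each a vector of {token_alphabet_size} numbers")
```

Astronomically large, but finite, and that’s all the argument needs.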

The pi example was just to show that how you implement a function (any function) does not matter, as long as the inputs and outputs are the same. To put it another way: if you ask me for the digit at some index, you can’t tell whether I got the result by doing a computation or by reading it out of a precomputed table.

Likewise, if you give me a sequence of tokens and I give you a probability distribution, you can’t tell whether I used an NN or just consulted a precomputed table. The point is that given the same input, the table will always give the same result, and crucially, so will an LLM. A table is just one type of implementation of an arbitrary function.
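Here’s a sketch of that indistinguishability with a two-token alphabet, small enough that the table can actually be enumerated (the rule standing in for the NN is made up for the example):

```python
from itertools import product

VOCAB = ("a", "b")    # two-token alphabet so the table stays tiny
CONTEXT_LENGTH = 3

def computed(window):
    """An arbitrary deterministic rule standing in for the NN: the
    probability of "a" grows with the number of "a"s in the window."""
    p_a = (window.count("a") + 1) / (CONTEXT_LENGTH + 2)
    return {"a": p_a, "b": 1 - p_a}

# The same function, precomputed: one entry per possible window.
TABLE = {w: computed(w) for w in product(VOCAB, repeat=CONTEXT_LENGTH)}

def looked_up(window):
    return TABLE[window]

# From the caller's side the two implementations are indistinguishable:
# same input, same output, every time.
for w in product(VOCAB, repeat=CONTEXT_LENGTH):
    assert computed(w) == looked_up(w)
```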

There is also no requirement for the state transition function (a table is a special type of function) to be understandable by humans. Just because it’s big enough to be beyond human comprehension doesn’t change its nature.
