vrighter
@vrighter@discuss.tchncs.de
- Comment on Why doesn’t Apple/Samsung/Google use new tech like every other phone maker? 1 day ago:
the downvote wasn’t from me
- Comment on Why doesn’t Apple/Samsung/Google use new tech like every other phone maker? 1 day ago:
so?
- Comment on Vintage gaming advertising pictures: a gallery 2 days ago:
this isn’t low effort. These are freaking great!
- Comment on Why doesn’t Apple/Samsung/Google use new tech like every other phone maker? 2 days ago:
three, point, oh
for copy and paste.
Not one, but three point oh!
- Comment on Why doesn’t Apple/Samsung/Google use new tech like every other phone maker? 2 days ago:
what trend? they made the ipod, they made the iphone, they’ve been late, really really late, for very basic features on either. And a bunch of just plain bad stuff.
Butterfly keyboards, magic mouse, touch bar on macs, not cherry picked at all. There are tons of examples
- Comment on Why doesn’t Apple/Samsung/Google use new tech like every other phone maker? 3 days ago:
that would be more believable if they didn’t release the apple vision pro.
Or the years they took biding their time before they finally implemented battery charge time estimation on ios.
Or the time they spent biding their time refining, erm, copy and paste?
Come on!
- Comment on YSK: If you set up a Lemmy instance, and follow the Docker setup instructions to the letter, it will send lemmy.ml your admin password during the setup process 3 days ago:
oh for fuck’s sake. A mistake is rendering the wrong emoji. Sending your password to elsewhere is inexcusable
- Comment on AI slows down some experienced software developers, study finds 3 days ago:
and the only reason it’s not slowing you down on other things is that you don’t know enough about those other things to recognize all the stuff you need to fix
- Comment on How does AI use so much power? 6 days ago:
yep. you could of course swap weights in and out, but that would slow things down to a crawl. So they get lots of vram
- Comment on How does AI use so much power? 6 days ago:
that’s why they need huge datacenters and thousands of GPUs. And, pretty soon, dedicated power plants. It is insane just how wasteful this all is.
- Comment on How does AI use so much power? 6 days ago:
imagine that to type one letter, you need to manually read all unicode code points several thousand times. When you’re done, you select one letter to type.
Then you start rereading all unicode code points again for thousands of times again, for the next letter.
That’s how llms work. When they say 175 billion parameters, it means at least that many calculations per token it generates
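To put rough numbers on that, here is a minimal sketch. It assumes a common rule of thumb (my assumption, not a measured figure): a dense model does about one multiply and one add per parameter for every generated token. `ops_per_token` is a made-up helper name.

```python
def ops_per_token(num_params: int, ops_per_param: int = 2) -> int:
    """Rough forward-pass operation count for one generated token.

    Assumes a dense model where every parameter takes part in one
    multiply and one add per token (ops_per_param=2). This is only a
    rule of thumb: it ignores attention over the context, caching, etc.
    """
    return num_params * ops_per_param


params = 175_000_000_000  # the "175 billion parameters" figure
print(f"{ops_per_token(params):,} operations per token")
print(f"{ops_per_token(params) * 100:,} operations for a 100-token reply")
```

Whatever the exact constant is, the cost is paid again in full for every single token.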
- Comment on Exclusive: OpenAI to release web browser in challenge to Google Chrome 6 days ago:
funny how everyone who wants to write a new browser (except the ladybird guys) always skimp on writing the actual browser part
- Comment on Broadcom Eyes $2 Trillion Club as AI Chip Demand Explodes 1 week ago:
ai chip demand explodes amongst manufacturers of crap who hope that demand for ai chips amongst consumers somehow explodes too
- Comment on Large Language Model Performance Doubles Every 7 Months 1 week ago:
in yes/no type questions, 50% success rate is the absolute worst one can do. Any worse and you’re just giving an inverted correct answer more than half the time
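A quick sketch of the inversion argument, using a hypothetical predictor that is wrong about 70% of the time (the 70% figure and the sample size are made up for illustration). Flipping every answer turns an accuracy of p into 1 - p:

```python
import random


def accuracy(predictions, truth):
    """Fraction of predictions that match the true yes/no answers."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


random.seed(0)  # fixed seed so the run is repeatable
truth = [random.random() < 0.5 for _ in range(1000)]

# hypothetical bad predictor: gives the wrong answer about 70% of the time
bad = [(not t) if random.random() < 0.7 else t for t in truth]

# flipping every answer turns accuracy p into 1 - p
flipped = [not p for p in bad]

print(accuracy(bad, truth), accuracy(flipped, truth))
```

So anything consistently below 50% still carries information. Exactly 50% is the only score that tells you nothing.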
- Comment on Large Language Model Performance Doubles Every 7 Months 1 week ago:
they are improving at an exponential rate. It’s just that the exponent is less than one.
- Comment on I require nothing more 1 week ago:
that’s why you get a little robot friend to clean it for you
- Comment on Microsoft Copilot falls Atari 2600 Video Chess 1 week ago:
so? It was never advertised as intelligent and capable of solving any task other than that one.
Meanwhile slop generators are capable of doing a lot of things and reasoning.
One claims to be good at chess. The other claims to be good at everything.
- Comment on Europeans have a meter fetish 3 weeks ago:
you made me snort coffee out of my nose. I hoepe you’re proud of yourself
- Comment on A game you "didn't know it was bad 'til people told you so"? 4 weeks ago:
i bought an original cartridge and played it on the vcs i iherited from dad
- Comment on What's an absolutely medium quality game? Not great, incredible or terrible or any single ended extreme. Dead medium quality 4 weeks ago:
i still enjoyed the crap out of it. Sometimes zoning out and just running around collecting stuff is just what I need.
- Comment on A game you "didn't know it was bad 'til people told you so"? 4 weeks ago:
he was forced to release it quickly to coincide with the film’s release. For comparison, it used to take a team of devs a couple of months to make a game. He had 6 weeks
- Comment on A game you "didn't know it was bad 'til people told you so"? 4 weeks ago:
when climbing out of the pit, it was very easy to immediately fall back down (due to the pixel-perfect collision detection).
And here is an excerpt from the manual: “Even experienced extraterrestrials sometimes have difficulty levitating out of wells. Start to levitate E.T. by first pressing the controller button and then pushing your Joystick forward. E.T.'s neck will stretch as he rises to the top of the well (see E.T. levitating in Figure 1). Just when he reaches the top of the well and the scene changes to the planet surface (see Figure 2), STOP! Do not try to keep moving up. Instead, move your Joystick right, left, or to the bottom. Do not try to move up, or E.T. might fall back into the well.”
- Comment on A game you "didn't know it was bad 'til people told you so"? 4 weeks ago:
it was actually way ahead of its time, for a game. One small bug (the workaround for which was in the manual) ruined its reputation. But I genuinely think it was a good game.
Also written in 6 weeks by one guy. Freaking impressive
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 5 weeks ago:
you wouldn’t be “freezing” anything. Each possible combination of input tokens maps to one output probability distribution. Those values are fixed and they are what they are whether you compute them or not, or when, or how many times.
Now you can either precompute the whole table (theory), or somehow compute each cell value every time you need it (practice). In either case, the resulting function (table lookup vs matrix multiplications) takes in only the context, and produces a probability distribution. And the mapping they generate is the same for all possible inputs. So they are the same function. A function can be implemented in multiple ways, but the implementation is not the function itself. The only difference between the two in this case is the implementation, or more specifically, whether you precompute a table or not. But the function itself is the same.
You are somehow saying that your choice of implementation for that function will somehow change the function. Which means that according to you, if you do precompute (or possibly cache, full precomputation is just an infinite cache size) individual mappings it somehow magically makes some magic happen that gains some deep insight. It does not. We have already established that it is the same function.
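Here is a toy sketch of that point. Everything in it is a made-up stand-in: a 3-token alphabet, a context of 2, and a `compute_distribution` function that plays the role of the matrix multiplications. It is not how any real llm is implemented, it only shows that a lookup table and an on-demand computation can be the same function.

```python
from itertools import product
import math

VOCAB = 3    # toy token alphabet: 0, 1, 2
CONTEXT = 2  # toy context length


def compute_distribution(context):
    """Stand-in for the 'matrix multiplications': a fixed, deterministic
    function from a context to a probability distribution over VOCAB."""
    weighted = sum((i + 1) * c for i, c in enumerate(context))
    scores = [math.sin(1 + tok + weighted) for tok in range(VOCAB)]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return tuple(e / total for e in exps)


# 'theory': precompute one table row for every possible context
TABLE = {ctx: compute_distribution(ctx)
         for ctx in product(range(VOCAB), repeat=CONTEXT)}


def lookup_distribution(context):
    """Same mapping, implemented as a table lookup."""
    return TABLE[tuple(context)]


# the two implementations agree on every possible input
assert all(lookup_distribution(ctx) == compute_distribution(ctx)
           for ctx in product(range(VOCAB), repeat=CONTEXT))
print(len(TABLE), "rows, identical outputs")
```

From the outside, a caller cannot tell which of the two functions answered.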
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 5 weeks ago:
the fact that it is a fixed function, that only depends on the context AND there are a finite number of discrete inputs possible does make it equivalent to a huge, finite table. You really don’t want this to be true. And again, you are describing training. Once training finishes anything you said does not apply anymore and you are left with fixed, unchanging matrices, which in turn means that it is a mathematical function of the context (by the mathematical definition of “function”: stateless and deterministic) which also has the property that the set of all possible inputs is finite. So the set of possible outputs is also finite, and its size is smaller than or equal to the size of the set of possible inputs. This means the actual function that the tokens are passed through CAN be precomputed in full (in theory), making it equivalent to a conventional state transition table.
This is true whether you’d like it to be or not. The training process builds a markov chain.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 5 weeks ago:
no, not any computer program is a markov chain. only those that depend only on the current state and ignore prior history. Which fits llms perfectly.
Those sophisticated methods you talk about are just a couple of matrix multiplications. Those matrices are what’s learned. Anything sophisticated happens during training. Inference is so not sophisticated. just multiplying some matrices together and taking the rightmost column of the result. That’s it.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 5 weeks ago:
yes you can enumerate all inputs, because they are not continuous. You just raise the finite number of different tokens to the finite context size and that’s exactly the size of the table you would need. finite^finite=finite. You are describing training, i.e. how the function is generated. Yes correlations are found there and encoded in a couple of matrices. Those matrices are what are used in the llm and none of what you said applies. Inference is purely a markov chain by definition.
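A small sketch of the table-size arithmetic, assuming every context is padded to the full context length. The helper names and the 50,000 / 2,048 sizes are made up, purely to show the scale. The big case is done in log10 because the exact number is far too long to be worth printing.

```python
import math


def table_rows(vocab_size: int, context_length: int) -> int:
    """Number of distinct full-length contexts: vocab_size ** context_length."""
    return vocab_size ** context_length


def table_rows_log10(vocab_size: int, context_length: int) -> float:
    """log10 of the row count, for sizes too large to print in full."""
    return context_length * math.log10(vocab_size)


print(table_rows(3, 2))  # toy case: 3 tokens, context of 2

# illustrative (made-up) sizes, only to show the scale
print(f"roughly 10^{table_rows_log10(50_000, 2_048):.0f} rows")
```

Huge, but finite. Each row would then hold one probability per token in the alphabet.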
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 5 weeks ago:
“lacks internal computation” is not part of the definition of markov chains. Only that the output depends only on the current state (the whole context, not just the last token) and no previous history, just like llms do. They do not consider tokens that slid out of the current context, because they are not part of the state anymore.
And it wouldn’t be a cache unless you decide to start invalidating entries, which you could just, not do… it would be a table with token-alphabet-size^context length size, with each entry being a vector of size token_alphabet_size.
The pi example was just to show that how you implement a function (any function) does not matter, as long as the inputs and outputs are the same. Or to put it another way if you give me an index, then you wouldn’t know whether I got the result by doing some computations or using a precomputed table.
Likewise, if you give me a sequence of tokens and I give you a probability distribution, you can’t tell whether I used an NN or just consulted a precomputed table. The point is that given the same input, the table will always give the same result, and crucially, so will an llm. A table is just one type of implementation for an arbitrary function.
There is also no requirement for the state transition function (a table is a special type of function) to be understandable by humans. Just because it’s big enough to be beyond human comprehension, doesn’t change its nature.
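The "state is the whole context, and nothing older" point can be sketched like this. `next_token` is a hypothetical stand-in for inference (any fixed, deterministic function of the window would do), and the window length of 4 is arbitrary.

```python
CONTEXT = 4  # toy context window length


def next_token(state):
    """Hypothetical stand-in for inference: a fixed, deterministic
    function of the current window only."""
    return sum((i + 1) * t for i, t in enumerate(state)) % 10


def step(history):
    """Produce the next token from the last CONTEXT tokens and append it."""
    state = tuple(history[-CONTEXT:])  # anything older has slid out
    return history + [next_token(state)]


a = [1, 2, 3, 4, 5, 6]
b = [9, 9, 3, 4, 5, 6]  # differs only in tokens outside the window

# same current state -> same next token, regardless of earlier history
assert step(a)[-1] == step(b)[-1]
```

Both histories end in the same window, so they are the same state and get the same transition.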
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 5 weeks ago:
yes, the matrix and several levels are the “decompression”. At the end you get one probability distribution, deterministically. And the state is the whole context, not just the previous token. Yes, if we were to build the table manually with only available data, lots of cells would just be 0. That’s why the compression is lossy. There would actually be nothing stopping anyone from filling those 0 cells out, it’s just infeasible. you could still put states you never actually saw, but are theoretically possible in the table. And there’s nothing stopping someone from putting thought into it and filling them out.
Also you seem obsessed by the word table. A table is just one type of function mapping a fixed input to a fixed output. If you replaced it with a function that gives the same outputs for all inputs, then it’s functionally equivalent. It being a table or some code in a function is just an implementation detail.
As a thought exercise imagine setting temperature to 0, passing all the combinations of tokens of input, and record the output for every single one of them. put them all in a “table” (assuming you have practically infinite space) and you have a markov chain that is 100% functionally equivalent to the neural network with all its layers and complexity. But it does it without the neural network, and gives 100% identical results every single time in O(1). Because we don’t have infinite time and space, we had to come up with a mapping function to replace the table. And because we have no idea how to make a good approximation of such a huge function, we use machine learning to come up with a suitable function for us, given tons of data.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 5 weeks ago:
the probabilities are also fixed after training. You seem to be conflating running the llm with different input to the model somehow adapting. The new context goes into the same fixed model. And yes, it can be reduced to fixed transition logic, you just need to have all possible token combinations in the table. This is obviously intractable due to space issues, so we came up with a lossy compression scheme for it. The table itself is learned once, then it’s fixed. The training goes to generating a huge markov chain. Just because the table is learned from data, doesn’t change what it actually is.