auraithx
@auraithx@lemmy.dbzer0.com
- Comment on Exactly Six Months Ago, the CEO of Anthropic Said That in Six Months AI Would Be Writing 90 Percent of Code 3 days ago:
Never claimed I was? I’m a computer scientist.
- Comment on Exactly Six Months Ago, the CEO of Anthropic Said That in Six Months AI Would Be Writing 90 Percent of Code 3 days ago:
I’m not an engineer.
- Comment on Exactly Six Months Ago, the CEO of Anthropic Said That in Six Months AI Would Be Writing 90 Percent of Code 4 days ago:
That isn’t the discussion at hand. Insane you don’t realise that.
- Comment on Exactly Six Months Ago, the CEO of Anthropic Said That in Six Months AI Would Be Writing 90 Percent of Code 5 days ago:
Scale? It’s a personal ancestry site for my surname, with graphs and shit, mate. It compares naming patterns, locations, DNA, etc. between generations and tries to place loose people. Works pretty well; I’ve managed to find a bunch of missing connections through it.
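Under the hood it’s nothing fancy, roughly a weighted match score along these lines (a made-up sketch, not the actual code; the field names and weights are purely illustrative):

```python
from difflib import SequenceMatcher

def match_score(a, b):
    """Toy likelihood that two records belong to the same line (weights are guesses)."""
    name_sim = SequenceMatcher(None, a["name"], b["name"]).ratio()      # fuzzy name match
    same_place = 1.0 if a["parish"] == b["parish"] else 0.0             # shared location
    year_gap = abs(a["birth_year"] - b["birth_year"])                   # generation spacing
    dna_hit = 1.0 if a.get("dna_group") == b.get("dna_group") else 0.0  # shared DNA cluster
    return 0.4 * name_sim + 0.25 * same_place + 0.2 * max(0.0, 1 - year_gap / 50) + 0.15 * dna_hit

rec_a = {"name": "John Smyth", "parish": "Kirkwall", "birth_year": 1832, "dna_group": "g7"}
rec_b = {"name": "Jon Smith",  "parish": "Kirkwall", "birth_year": 1834, "dna_group": "g7"}
print(round(match_score(rec_a, rec_b), 2))   # high score -> candidate link worth reviewing
```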
- Comment on Exactly Six Months Ago, the CEO of Anthropic Said That in Six Months AI Would Be Writing 90 Percent of Code 5 days ago:
In the case of a senior engineer, they wouldn’t need to worry about the hallucination rate. The LLM is a lot faster than they are, and they can do other tasks while the output is being generated and then review it. If it’s trivial you’ve saved time; if not, you can pull up the documentation and reason and step through the problem with the LLM. If you actually know what you’re talking about, you can see when it slips up and correct it.
And that hallucination rate is rapidly dropping. We’ve jumped from about 40% accuracy to 90% over the past ~6 months alone (aider polyglot coding benchmark), at about 1/10th the cost (iirc).
- Comment on Exactly Six Months Ago, the CEO of Anthropic Said That in Six Months AI Would Be Writing 90 Percent of Code 5 days ago:
I mean, before we’d just ask Google and read Stack Overflow, blogs, support posts, etc. Now it just finds them for you instantly so you can click and read them. The human reasoning part is just shifting elsewhere: you solve the problem during debugging, before commits.
- Comment on Exactly Six Months Ago, the CEO of Anthropic Said That in Six Months AI Would Be Writing 90 Percent of Code 5 days ago:
Sounds like they need to work on their prompts. I vibe code some hobby projects I wouldn’t have done otherwise and it’s never done that. I have it comment each change and review it all in a diff checker, which is where 90% of the time goes.
- Comment on Implementing Portable User Identities with DIDs 2 weeks ago:
Bad news champ, it’s already in mainstream use.
- Comment on Meet the AI vegans: They are choosing to abstain from using artificial intelligence for environmental, ethical and personal reasons. Maybe they have a point 1 month ago:
AI vegans are people who only use AI-generated porn.
- Comment on Why democrats under Biden administration didn't release Epstein files? 1 month ago:
It’s standard DOJ protocol to avoid compromising witnesses, active leads, or uncharged co-conspirators. Releasing unredacted files mid-investigation could tank prosecutions.
- Comment on Why democrats under Biden administration didn't release Epstein files? 1 month ago:
The investigation was still ongoing so they couldn’t have. It didn’t stop until Trump stopped it the other week.
- Comment on Adblockers stop publishers serving ads to (or even seeing) 1bn web users - Press Gazette 1 month ago:
What device? FireTV/Firestick/etc all support it (surprisingly).
- Comment on 1 month ago:
Matrix 2.0 is much faster, but it seems like they’ve been building it for a decade.
The app is out, but still no Spaces support, which is what would make it a competitor to Discord.
- Comment on Adblockers stop publishers serving ads to (or even seeing) 1bn web users - Press Gazette 1 month ago:
SmartTube is so much better. Even the UI is intuitive and makes sense. You can hide Shorts and actually find content you want to watch.
- Comment on Why democrats under Biden administration didn't release Epstein files? 1 month ago:
?? Yes, which Trump ended. The investigation was ongoing, and Trump ended it. What aren’t you getting?
- Comment on Why democrats under Biden administration didn't release Epstein files? 1 month ago:
The investigation was still ongoing, which Trump ended. And the DOJ is supposed to operate independently from the president.
- Comment on Why democrats under Biden administration didn't release Epstein files? 1 month ago:
They were sealed until Jan 2024 as part of Maxwell’s appeal process.
- Comment on its painful each time (┬┬﹏┬┬) 1 month ago:
Tone down step 3 and cancel step 4.
- Comment on I didn't know that they have something in common... 1 month ago:
It’s extra to reserve your seat, and they purposely assign seats at random if you don’t, so that you’re incentivised to pay for the reservation.
- Comment on [deleted] 2 months ago:
Don’t worry, I’m poor so I won’t be visiting either.
- Comment on YouTube Will Add an AI Slop Button Thanks to Google’s Veo 3 2 months ago:
Delete your account.
- Comment on [deleted] 2 months ago:
The AI won’t return wrong results when using reference data. Plus, there will be references to the actual data to check.
Regardless, just delete all your socials and only keep pseudo-anonymous ones.
- Comment on YouTube Will Add an AI Slop Button Thanks to Google’s Veo 3 2 months ago:
There’s an example right in the article.
Historical events portrayed realistically are one example.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 3 months ago:
You’re absolutely right that inference in an LLM is a fixed, deterministic function after training, and that the input space is finite due to the discrete token vocabulary and finite context length. So yes, in theory, you could precompute every possible input-output mapping and store them in a giant table. That much is mathematically valid. But where your argument breaks down is in claiming that this makes an LLM equivalent to a conventional Markov chain in function or behavior.
A Markov chain is not simply defined as “a function from finite context to next-token distribution.” It is defined by a specific type of process where the next state depends on the current state via fixed transition probabilities between discrete states. The model operates over symbolic states with no internal computation. LLMs, even during inference, compute outputs via multi-layered continuous transformations, with attention mixing, learned positional embeddings, and non-linear activations. These mechanisms mean that while the function is fixed, its structure does not resemble a state machine—it resembles a hierarchical pattern recognizer and function approximator.
Your claim is essentially that “any deterministic function over a finite input space is equivalent to a table.” This is true in a computational sense but misleading in a representational and behavioral sense. If I gave you a function that maps 4096-bit inputs to 50257-dimensional probability vectors and said, “This is equivalent to a transition table,” you could technically agree, but the structure and generative capacity of that function is not Markovian. That function may simulate reasoning, abstraction, and composition. A Markov chain never does.
You are collapsing implementation equivalence (yes, the function could be stored in a table) with model equivalence (no, it does not behave like a Markov chain). The fact that you could freeze the output behavior into a lookup structure doesn’t change that the lookup structure is derived from a fundamentally different class of computation.
The training process doesn’t “build a Markov chain.” It builds a function that estimates conditional token probabilities via optimization over a non-Markov architecture. The inference process then applies that function. That makes it a stateless function, yes—but not a Markov chain. Determinism plus finiteness does not imply Markovian behavior.
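To make that concrete, here’s a toy contrast (hypothetical code, illustrative only): a classical Markov chain built by counting transitions over an enumerable state space, versus a parametric next-token function that computes a distribution from continuous embeddings. The table only has entries for states it has literally seen; the function computes something for any context.

```python
import numpy as np
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# Classical Markov chain: an explicit, enumerable transition table built by counting.
table = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    table[prev][nxt] += 1

def markov_next(word):
    counts = table[word]                          # only defined for states actually seen
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

# Parametric next-token function: tiny embedding + softmax (random stand-in "weights",
# just to show the shape of the computation: continuous vectors in, distribution out).
rng = np.random.default_rng(0)
E = rng.normal(size=(len(vocab), 8))              # stand-in for learned embeddings
W = rng.normal(size=(8, len(vocab)))              # stand-in for learned output projection

def model_next(context_words):
    h = np.mean([E[idx[w]] for w in context_words], axis=0)   # continuous hidden state
    logits = h @ W
    p = np.exp(logits - logits.max())
    return p / p.sum()                            # a distribution for any context

print(markov_next("the"))                         # {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}
print(model_next(["the", "cat"]).round(3))        # a distribution over the whole vocab
```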
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 3 months ago:
Yes, LLM inference consists of deterministic matrix multiplications applied to the current context. But that simplicity in operations does not make it equivalent to a Markov chain. The definition of a Markov process requires that the next output depends only on the current state. You’re assuming that the LLM’s “state” is its current context window. But in an LLM, this “state” is not discrete. It is a structured, deeply encoded set of vectors shaped by non-linear transformations across layers. The state is not just the visible tokens—it is the full set of learned representations computed from them.
A Markov chain transitions between discrete, enumerable states with fixed transition probabilities. LLMs instead apply a learned function over a high-dimensional, continuous input space, producing outputs by computing context-sensitive interactions. These interactions allow generalization and compositionality, not just selection among known paths.
The fact that inference uses fixed weights does not mean it reduces to a transition table. The output is computed by composing multiple learned projections, attention mechanisms, and feedforward layers that operate in ways no Markov chain ever has. You can’t describe an attention head with a transition matrix. You can’t reduce positional encoding or attention-weighted context mixing into state transitions. These are structured transformations, not symbolic transitions.
You can describe any deterministic process as a function, but not all deterministic functions are Markovian. What makes a process Markov is not just forgetting prior history. It is having a fixed, memoryless probabilistic structure where transitions depend only on a defined discrete state. LLMs don’t transition between states in this sense. They recompute probability distributions from scratch each step, based on context-rich, continuous-valued encodings. That is not a Markov process. It’s a stateless function approximator conditioned on a window, built to generalize across unseen input patterns.
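For reference, this is roughly what a single attention head computes (toy sizes and random stand-in weights, not any real model’s parameters). The point is that the output is a continuous, content-dependent mixing of vectors recomputed per input, not a lookup in a transition matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8                                    # toy head dimension
X = rng.normal(size=(5, d))              # embeddings for a 5-token context

Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))    # stand-in learned projections

Q, K, V = X @ Wq, X @ Wk, X @ Wv
scores = Q @ K.T / np.sqrt(d)            # every position scores every other position
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)               # softmax over the context
out = weights @ V                        # content-dependent mixing, recomputed per input

print(weights.round(2))                  # the "transition" pattern depends on X itself
print(out.shape)                         # (5, 8): continuous representations, not discrete states
```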
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 3 months ago:
You can say that the whole system is deterministic and finite, so you could record every input-output pair. But you could do that for any program. That doesn’t make every deterministic function a Markov process. It just means it is representable in a finite way. The question is not whether the function can be stored. The question is whether its behavior matches the structure and assumptions of a Markov model. In the case of LLMs, it does not.
Inference does not become a Markov chain simply because it returns a distribution based on current input. It becomes a sequence of deep functional computations where attention mechanisms simulate hierarchical, relational, and positional understanding of language. That does not align with the definition or behavior of a Markov model, even if both map a state to a probability distribution. The structure of the computation, not just the input-output determinism, is what matters.
- Comment on NOOOOOOO 3 months ago:
20 seconds is the full duration. 1-2s to start.
- Comment on NOOOOOOO 3 months ago:
On average, it takes most mammals, including humans, about 12 seconds to have a bowel movement.
Why tf are you having to pass time?
- Comment on NOOOOOOO 3 months ago:
We know we need fibre now. If it’s taking you more than 20 seconds to shit you’re gonna die early.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 3 months ago:
You’re correct that the formal definition of a Markov process does not exclude internal computation, and that it only requires the next state to depend solely on the current state. But what defines a classical Markov chain in practice is not just the formal dependency structure but how the transition function is structured and used. A traditional Markov chain has a discrete and enumerable state space with explicit, often simple transition probabilities between those states. LLMs do not operate this way.
The claim that an LLM is “just” a large compressed Markov chain assumes that its function is equivalent to a giant mapping of input sequences to output distributions. But this interpretation fails to account for the fundamental difference in how those distributions are generated. An LLM is not indexing a symbolic structure. It is computing results using recursive transformations across learned embeddings, where those embeddings reflect complex relationships between tokens, concepts, and tasks. That is not reducible to discrete symbolic transitions without losing the model’s generalization capabilities. You could record outputs for every sequence, but the moment you present a sequence that wasn’t explicitly in that set, the Markov table breaks. The LLM does not.
Yes, you can say a table is just one implementation of a function, and from a purely mathematical perspective, any function can be implemented as a table given enough space. But the LLM’s function is general-purpose. It extrapolates. A precomputed table cannot do this unless those extrapolations are already baked in, in which case you are no longer talking about a classical Markov system. You are describing a model that encodes relationships far beyond discrete transitions.
The pi analogy applies to deterministic functions with fixed outputs, not to learned probabilistic functions that approximate conditional distributions over language. If you give an LLM a new input, it will return a meaningful distribution even if it has never seen anything like it. That behavior depends on internal structure, not retrieval. Just because a function is deterministic at temperature 0 does not mean it is a transition table. The fact that the same input yields the same output is true for any deterministic function. That does not collapse the distinction between generalization and enumeration.
So while yes, you can implement any deterministic function as a lookup table, the nature of LLMs lies in how they model relationships and extrapolate from partial information. That ability is not captured by any classical Markov model, no matter how large.
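The enumeration-versus-generalization point in one toy example (hypothetical code; the “model” here is just a stand-in deterministic function, not an LLM): you can freeze its outputs for the contexts you enumerated, but the frozen table has nothing for an unseen context, while the function still returns a distribution.

```python
import numpy as np

rng = np.random.default_rng(2)
vocab = ["a", "b", "c", "d"]
E = {w: rng.normal(size=4) for w in vocab}       # stand-in learned embeddings
W = rng.normal(size=(4, len(vocab)))             # stand-in learned projection

def model(context):
    """Deterministic function: any context over the vocab -> distribution over the vocab."""
    h = np.mean([E[w] for w in context], axis=0)
    z = np.exp(h @ W)
    return z / z.sum()

# "Freeze" the behaviour into a table for the contexts we happened to enumerate.
seen = [("a", "b"), ("b", "c")]
frozen = {ctx: model(ctx) for ctx in seen}

print(frozen[("a", "b")].round(3))               # fine: this context was enumerated
print(model(("c", "d")).round(3))                # the function still works on an unseen context
# frozen[("c", "d")]                             # KeyError: the frozen table has no entry for it
```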