auraithx
@auraithx@lemmy.dbzer0.com
- Comment on [deleted] 3 days ago:
Don’t worry, I’m poor so I won’t be visiting either.
- Comment on YouTube Will Add an AI Slop Button Thanks to Google’s Veo 3 4 days ago:
Delete your account.
- Comment on What will be the future of "Social Media Investigations"? Is every country gonna start checking for social media posts before permitting entry? 4 days ago:
The AI won’t return wrong results when using reference data. Plus, there will be references to the actual data to check.
Regardless, just delete all your socials and only keep pseudo-anonymous ones.
- Comment on YouTube Will Add an AI Slop Button Thanks to Google’s Veo 3 5 days ago:
There’s an example right in the article.
Historical events portrayed realistically is one example.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
You’re absolutely right that inference in an LLM is a fixed, deterministic function after training, and that the input space is finite due to the discrete token vocabulary and finite context length. So yes, in theory, you could precompute every possible input-output mapping and store them in a giant table. That much is mathematically valid. But where your argument breaks down is in claiming that this makes an LLM equivalent to a conventional Markov chain in function or behavior.
A Markov chain is not simply defined as “a function from finite context to next-token distribution.” It is defined by a specific type of process where the next state depends on the current state via fixed transition probabilities between discrete states. The model operates over symbolic states with no internal computation. LLMs, even during inference, compute outputs via multi-layered continuous transformations, with attention mixing, learned positional embeddings, and non-linear activations. These mechanisms mean that while the function is fixed, its structure does not resemble a state machine—it resembles a hierarchical pattern recognizer and function approximator.
Your claim is essentially that “any deterministic function over a finite input space is equivalent to a table.” This is true in a computational sense but misleading in a representational and behavioral sense. If I gave you a function that maps 4096-bit inputs to 50257-dimensional probability vectors and said, “This is equivalent to a transition table,” you could technically agree, but the structure and generative capacity of that function is not Markovian. That function may simulate reasoning, abstraction, and composition. A Markov chain never does.
You are collapsing implementation equivalence (yes, the function could be stored in a table) with model equivalence (no, it does not behave like a Markov chain). The fact that you could freeze the output behavior into a lookup structure doesn’t change that the lookup structure is derived from a fundamentally different class of computation.
The training process doesn’t “build a Markov chain.” It builds a function that estimates conditional token probabilities via optimization over a non-Markov architecture. The inference process then applies that function. That makes it a stateless function, yes—but not a Markov chain. Determinism plus finiteness does not imply Markovian behavior.
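To make that distinction concrete, here’s a toy Python sketch (everything in it is invented for illustration, not a real model): any deterministic function over a finite domain can be frozen into a lookup table, but the table is an artifact of the function, not evidence that the function was ever a lookup.

```python
from itertools import product

def next_token_logit(context):
    """Stand-in for a fixed, deterministic inference function (hypothetical)."""
    return sum(t * 0.5 for t in context) % 7.0

# Enumerate a (tiny) finite input space and cache every input-output pair.
VOCAB, CTX_LEN = range(4), 2
table = {ctx: next_token_logit(ctx) for ctx in product(VOCAB, repeat=CTX_LEN)}

ctx = (2, 3)
assert table[ctx] == next_token_logit(ctx)  # identical outputs...
# ...but the dict encodes nothing about how those outputs were computed.
```

The assert passes, yet calling the dict a “Markov chain” tells you nothing about the class of computation that filled it.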
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
Yes, LLM inference consists of deterministic matrix multiplications applied to the current context. But that simplicity in operations does not make it equivalent to a Markov chain. The definition of a Markov process requires that the next output depends only on the current state. You’re assuming that the LLM’s “state” is its current context window. But in an LLM, this “state” is not discrete. It is a structured, deeply encoded set of vectors shaped by non-linear transformations across layers. The state is not just the visible tokens—it is the full set of learned representations computed from them.
A Markov chain transitions between discrete, enumerable states with fixed transition probabilities. LLMs instead apply a learned function over a high-dimensional, continuous input space, producing outputs by computing context-sensitive interactions. These interactions allow generalization and compositionality, not just selection among known paths.
The fact that inference uses fixed weights does not mean it reduces to a transition table. The output is computed by composing multiple learned projections, attention mechanisms, and feedforward layers that operate in ways no Markov chain ever has. You can’t describe an attention head with a transition matrix. You can’t reduce positional encoding or attention-weighted context mixing into state transitions. These are structured transformations, not symbolic transitions.
You can describe any deterministic process as a function, but not all deterministic functions are Markovian. What makes a process Markov is not just forgetting prior history. It is having a fixed, memoryless probabilistic structure where transitions depend only on a defined discrete state. LLMs don’t transition between states in this sense. They recompute probability distributions from scratch each step, based on context-rich, continuous-valued encodings. That is not a Markov process. It’s a stateless function approximator conditioned on a window, built to generalize across unseen input patterns.
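Here’s a rough numpy sketch of why an attention head can’t be written as a transition matrix (toy dimensions, random weights; nothing here is a real model): the mixing weights are recomputed from the continuous content of the whole context on every step, rather than read out of a fixed table.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy embedding dimension
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

def attention(X):
    """Single attention head: mixing weights depend on the content of X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the context
    return weights @ V

X = rng.standard_normal((5, d))  # five token embeddings
print(attention(X).shape)        # (5, 8): one context-mixed vector per position
# A Markov chain would instead do: distribution = transition_table[state].
# Here there is no table to index; the weights are a function of all of X.
```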
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
You can say that the whole system is deterministic and finite, so you could record every input-output pair. But you could do that for any program. That doesn’t make every deterministic function a Markov process. It just means it is representable in a finite way. The question is not whether the function can be stored. The question is whether its behavior matches the structure and assumptions of a Markov model. In the case of LLMs, it does not.
Inference does not become a Markov chain simply because it returns a distribution based on current input. It becomes a sequence of deep functional computations where attention mechanisms simulate hierarchical, relational, and positional understanding of language. That does not align with the definition or behavior of a Markov model, even if both map a state to a probability distribution. The structure of the computation, not just the input-output determinism, is what matters.
- Comment on NOOOOOOO 2 weeks ago:
20 seconds is the full duration. 1-2s to start.
- Comment on NOOOOOOO 2 weeks ago:
On average, it takes most mammals, including humans, about 12 seconds to have a bowel movement.
Why tf are you having to pass time?
- Comment on NOOOOOOO 2 weeks ago:
We know we need fibre now. If it’s taking you more than 20 seconds to shit, you’re gonna die early.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
You’re correct that the formal definition of a Markov process does not exclude internal computation, and that it only requires the next state to depend solely on the current state. But what defines a classical Markov chain in practice is not just the formal dependency structure but how the transition function is structured and used. A traditional Markov chain has a discrete and enumerable state space with explicit, often simple transition probabilities between those states. LLMs do not operate this way.
The claim that an LLM is “just” a large compressed Markov chain assumes that its function is equivalent to a giant mapping of input sequences to output distributions. But this interpretation fails to account for the fundamental difference in how those distributions are generated. An LLM is not indexing a symbolic structure. It is computing results using recursive transformations across learned embeddings, where those embeddings reflect complex relationships between tokens, concepts, and tasks. That is not reducible to discrete symbolic transitions without losing the model’s generalization capabilities. You could record outputs for every sequence, but the moment you present a sequence that wasn’t explicitly in that set, the Markov table breaks. The LLM does not.
Yes, you can say a table is just one implementation of a function, and from a purely mathematical perspective, any function can be implemented as a table given enough space. But the LLM’s function is general-purpose. It extrapolates. A precomputed table cannot do this unless those extrapolations are already baked in, in which case you are no longer talking about a classical Markov system. You are describing a model that encodes relationships far beyond discrete transitions.
The pi analogy applies to deterministic functions with fixed outputs, not to learned probabilistic functions that approximate conditional distributions over language. If you give an LLM a new input, it will return a meaningful distribution even if it has never seen anything like it. That behavior depends on internal structure, not retrieval. Just because a function is deterministic at temperature 0 does not mean it is a transition table. The fact that the same input yields the same output is true for any deterministic function. That does not collapse the distinction between generalization and enumeration.
So while yes, you can implement any deterministic function as a lookup table, the nature of LLMs lies in how they model relationships and extrapolate from partial information. That ability is not captured by any classical Markov model, no matter how large.
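A toy contrast (both “models” here are hypothetical stand-ins I made up for illustration): an enumerated transition table fails outright on a context it never saw, while even a trivially parametric function still returns a distribution.

```python
import math

# Enumerated Markov table: only contexts seen during "training" exist.
transitions = {("the",): {"cat": 0.6, "dog": 0.4}}

def table_next(context):
    return transitions[context]  # KeyError on any unseen context

# Trivial parametric stand-in: computes a distribution from features of
# the context instead of indexing it. (Toy scoring rule, not an LLM.)
def parametric_next(context, vocab=("cat", "dog")):
    scores = [len(w) + 0.1 * len(context[-1]) for w in vocab]
    z = [math.exp(s) for s in scores]
    return dict(zip(vocab, (v / sum(z) for v in z)))

print(parametric_next(("a",)))  # unseen context: still a distribution
# table_next(("a",))            # unseen context: raises KeyError
```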
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
This is an elegant metaphor, but it fails to capture the essential difference between symbolic enumeration and neural computation. Representing an LLM as a decompression function that reconstructs a giant transition table assumes that the model is approximating a complete, enumerable mapping of inputs to outputs. That’s not what is happening. LLMs are not trained to reproduce every possible sequence. They are trained to generalize over an effectively infinite space of token combinations, including many never seen during training.
Your thought experiment—recording the output for every possible input at temperature 0—would indeed give you a deterministic function that could be stored. But this imagined table is not a Markov chain. It is a cached output of a deep contextual function, not a probabilistic state machine. A Markov model, by definition, uses transition probabilities based on fixed state history and lacks internal computation. An LLM generates the distribution through recursive transformation of continuous embeddings with positional and attention-based conditioning. That is not equivalent to symbolically defining state transitions, even if you could record the output for every input.
The analogy to a spigot algorithm for pi misses the point. That algorithm computes digits of a predefined number. An LLM doesn’t compute a predetermined output. It computes a probability distribution conditioned on a context it was never explicitly trained on, using representations learned across many dimensions. The model encodes distributed knowledge and compositional patterns. A Markov table does not. Even a giant table with manually filled hypothetical entries lacks the inductive bias, generalization, and emergent capabilities that arise from the structure of a trained network.
Equivalence in output does not imply equivalence in function. Replacing a rich model with an exhaustively recorded output set may yield the same result, but it loses what makes the model powerful: the reasoning behavior that emerges from structure, not just output recall. The function is not a shortcut to a table. It is the intelligence.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
This argument collapses the entire distinction between parametric modeling and symbolic lookup. Yes, the weights are fixed after training, but the key point is that an LLM does not store or retrieve a state transition table. It learns to approximate the probability of the next token given a sequence through function approximation, not by memorizing discrete transitions. What appears to be a “table” is actually a deep, distributed representation compressed into continuous weight matrices. It is not indexing state transitions—it is computing probabilities from patterns in the input space.
A true Markov chain defines transition probabilities over explicit states. An LLM embeds tokens into high-dimensional vectors, then transforms them repeatedly using self-attention and feedforward layers that can capture subtle syntactic, semantic, and structural features. These features interact in nonlinear ways that go far beyond what any finite transition table could express. You cannot meaningfully represent an LLM’s behavior as a finite Markov model, even in principle, because its representations are not enumerable states but regions of a continuous latent space.
Saying “you just need all token combinations in a table” ignores the fact that the model generalizes to combinations never seen during training. That is the core of its power. It doesn’t look up learned transitions—it constructs responses by interpolating through an embedding space guided by attention and weight structure. No Markov chain does this. A lossy compressor of a transition table still implies a symbolic map; a neural network is a differentiable function trained to fit a distribution, not to encode it explicitly.
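One way to picture “interpolating through an embedding space”, as a hedged toy sketch (random vectors standing in for learned embeddings): points that were never enumerated anywhere still get a graded response, because the space is continuous.

```python
import numpy as np

rng = np.random.default_rng(1)
E = {w: rng.standard_normal(8) for w in ("cat", "dog", "car")}

def similarities(v):
    """Cosine similarity of v against each stored embedding."""
    return {w: float(v @ e / (np.linalg.norm(v) * np.linalg.norm(e)))
            for w, e in E.items()}

# A vector 'between' cat and dog appears in no table of states,
# yet the continuous geometry still assigns it graded scores.
blend = 0.5 * E["cat"] + 0.5 * E["dog"]
print(similarities(blend))
```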
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
You’re conflating surface-level architectural limits with core functional behaviour. Yes, an LLM is deterministic at temperature 0 and produces the same output for the same input, but that does not make it equivalent to a Markov chain. A Markov chain defines transitions based on fixed-order memory and static probabilities. An LLM generates output by applying a series of matrix multiplications, activations, and attention-weighted context aggregations across multiple layers, where the representation of each token is conditioned on the entire input sequence, not just on recent tokens.
While the model has a maximum token limit, it does not receive a fixed-length input filled with nulls. It processes variable-length input sequences up to the context limit, and attention masks control which positions are used. These are not hardcoded state transitions; they are dynamically computed weightings over continuous embeddings, where meaning arises from the interaction of tokens, not from simple position or order alone.
Saying that output diversity is just randomness misunderstands why random sampling exists: to explore the rich distribution the model has learned from data, not to fake intelligence. The depth of its output space comes from how it models relationships, hierarchies, syntax, and semantics through training. Markov chains do not do any of this. They map sequences to likely next symbols without modeling internal structure. An LLM’s output reflects high-dimensional reasoning over the prompt. That behavior cannot be reduced to fixed transition logic.
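To be concrete about the sampling point, here’s a minimal softmax-with-temperature sketch (toy logits, not from any real model): temperature only controls how an already-computed distribution is read out, so temperature 0 being deterministic says nothing about the computation that produced the logits.

```python
import numpy as np

def sample(logits, temperature, rng=np.random.default_rng(2)):
    """Read one token index out of a computed distribution."""
    if temperature == 0:
        return int(np.argmax(logits))  # greedy: fully deterministic
    p = np.exp(np.asarray(logits) / temperature)
    p /= p.sum()
    return int(rng.choice(len(p), p=p))

logits = [2.0, 1.0, 0.1]    # produced upstream by the model's computation
print(sample(logits, 0))    # always index 0
print(sample(logits, 1.0))  # explores the learned distribution
```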
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
Because transformer architecture is not equivalent to a probabilistic lookup. A Markov chain assigns probabilities based on a fixed-order state transition, without regard to deeper structure or token relationships. An LLM processes the full context through many layers of non-linear functions and attention heads, each layer dynamically weighting how each token influences every other token.
Although weights do not change during inference, the behavior of the model is not fixed in the way a Markov chain’s state table is. The same model can respond differently to very similar prompts, not just because the inputs differ, but because the model interprets structure, syntax, and intent in ways that are contextually dependent. That is not just longer context—it is fundamentally more expressive computation.
The process is stateless across calls, yes, but it is not blind. All relevant information lives inside the prompt, and the model uses the attention mechanism to extract meaning from relationships across the sequence. Each new input changes the internal representation, so the output reflects contextual reasoning, not a static response to a matching pattern. Markov chains cannot replicate this kind of behavior no matter how many states they include.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
While both Markov models and LLMs forget information outside their window, that’s where the similarity ends. A Markov model relies on fixed transition probabilities and treats the past as a chain of discrete states. An LLM evaluates every token in relation to every other using learned, high-dimensional attention patterns that shift dynamically based on meaning, position, and structure.
Changing one word in the input can shift the model’s output dramatically by altering how attention layers interpret relationships across the entire sequence. It’s a fundamentally richer computation that captures syntax, semantics, and even task intent, which a Markov chain cannot model regardless of how much context it sees.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
LLMs are not Markov chains, even extended ones. A Markov model, by definition, relies on a fixed-order history and treats transitions as independent of deeper structure. LLMs use transformer attention mechanisms that dynamically weigh relationships between all tokens in the input—not just recent ones. This enables global context modeling, hierarchical structure, and even emergent behaviors like in-context learning. Markov models can’t reweight context dynamically or condition on abstract token relationships.
The idea that LLMs are “computed once” and then applied blindly ignores the fact that LLMs adapt their behavior based on input. They don’t change weights during inference, true—but they do adapt responses through soft prompting, chain-of-thought reasoning, or even emulated state machines via tokens alone. That’s a powerful form of contextual plasticity, not blind table lookup.
Calling them “lossy compressors of state transition tables” misses the fact that the “table” they’re compressing is not fixed—it’s context-sensitive and computed in real time using self-attention over high-dimensional embeddings. That’s not how Markov chains work, even with large windows.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
This isn’t a thing.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
Brother you better hope it does, because even if emissions dropped to 0 tonight the planet wouldn’t stop warming and it wouldn’t stop what’s coming for us.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
Performance eventually collapses due to architectural constraints; this mirrors cognitive overload in humans: reasoning isn’t just about adding compute, it requires mechanisms like abstraction, recursion, and memory. The models’ collapse doesn’t prove “only pattern matching”; it highlights that today’s models simulate reasoning in narrow bands but lack the structure to scale it reliably. That is a limitation of implementation, not a disproof of emergent reasoning.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
The paper doesn’t say LLMs can’t reason; it shows that their reasoning abilities are limited and collapse under increasing complexity or novel structure.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
Like what?
I don’t think there’s any search engine better than Perplexity. And for scientific research Consensus is miles ahead.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
Define reason.
Like humans? Of course not. Models lack intent, awareness, and grounded meaning. They don’t “understand” problems; they generate token sequences.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
Unlike Markov models, modern LLMs use transformers that attend to full contexts, enabling them to simulate structured, multi-step reasoning (albeit imperfectly). While they don’t initiate reasoning like humans, they can generate and refine internal chains of thought when prompted, and emerging frameworks (like ReAct or Toolformer) allow them to update working memory via external tools. Reasoning is limited, but not physically impossible; it’s evolving beyond simple pattern-matching toward more dynamic and compositional processing.
- Comment on Apple just proved AI "reasoning" models like Claude, DeepSeek-R1, and o3-mini don't actually reason at all. 2 weeks ago:
This paper doesn’t prove that LLMs aren’t good at pattern recognition; it demonstrates the limits of what pattern recognition alone can achieve, especially for compositional, symbolic reasoning.
- Comment on What did Musk and Trump fall out over? 3 weeks ago:
Different kind of smart. I don’t underestimate them; I just see right through them.
They have people who are Ben Carson smart. Domain-specific. Not in general reasoning, critical thinking, or capable of maintaining such a façade. Anyone smart would’ve distanced themselves a long time ago unless they are grifting them. And generally those types of smarts don’t end up in MAGA to start with. Just like how MAGAts don’t end up as artists. (Name one conservative artist who isn’t shit)
And Trump cannot lie. Yes, all he does is lie. But he also cannot lie. If you ask him if he committed a crime he’ll straight up admit to it. Every time. He rejects the premise of guilt. The lies he does tell are done out of a mix of unconscious strategic self-presentation and the fact he is just thick as shit and believes whatever he sees on the TV.
- Comment on What did Musk and Trump fall out over? 3 weeks ago:
Less than 1%.
People vastly overestimate these bozos.
They aren’t lying. They actually believe this shite. They aren’t playing genius 5D chess; they are just reactive morons. Look at the leaked Signal group chats.
No doubt Vance is a bit smarter and is acting a bit, given all the ‘Trump is America’s Hitler’ stuff. But this is unspoken between them.
- Comment on What did Musk and Trump fall out over? 3 weeks ago:
Musk was taking all the attention and jumping about like a dick.
The first rift was that Musk was going to get classified briefings on China and Trump put a stop to it.
Musk also had a falling out with Bessent, and Trump sided with Bessent.
It’s also rumoured he fucked Steven Miller’s wife (who is a left and would blacken a right eye with a hook).
Nothing to do with the bill at all IMO. He does not give a single fuck about that. It’s just a pain point where he knows he can drive a wedge through a big part of Trump’s base.
- Comment on 🎄🌲🎄 3 weeks ago:
Elon says Trump is in the Epstein files and has called for him to be impeached.
Trump says he’s considering revoking all of Elon’s govt subsidies, and Bannon is saying he should deport Elon and confiscate SpaceX.
- Submitted 3 weeks ago to [deleted] | 14 comments