Buddahriffic@lemmy.world 4 months ago

Calling the errors “hallucinations” is kinda misleading because it implies there’s regular real knowledge but false stuff gets mixed in. That’s not how LLMs work.

LLMs are purely about word associations to other words. It’s just massive enough that it can add a lot of context to those associations and seem conversational about almost any topic, but it has no depth to any of it. Where it seems like it does is just because the contexts of its training got very specific, which is bound to happen when it’s trained on every online conversation its owners (or rather people hired by people hired by its owners) could get their hands on.

All it does is ask: given the set of tokens provided and already predicted, plus a bit of randomness, what is the most likely token to come next? It then repeats that until it predicts an “end” token.
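
That loop can be sketched in a few lines of Python. This is a toy under loud assumptions: `TOY_MODEL` is a hypothetical hand-written table that only looks at the previous token, whereas a real LLM scores the next token from the whole context using learned weights. The names `sample_next` and `generate` are made up for illustration.

```python
import random

# Hypothetical toy "model": next-token probabilities keyed by the previous token.
# Assumption for illustration only; a real LLM conditions on the full context.
TOY_MODEL = {
    "<start>": {"the": 1.0},
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "<end>": 0.3},
    "dog": {"sat": 0.5, "<end>": 0.5},
    "sat": {"<end>": 1.0},
}


def sample_next(context, rng):
    """Pick the next token at random, weighted by the toy probabilities."""
    probs = TOY_MODEL[context[-1]]
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(prompt, rng, max_tokens=20):
    """Append sampled tokens until "<end>" is predicted or max_tokens is hit."""
    context = list(prompt)
    for _ in range(max_tokens):
        token = sample_next(context, rng)
        if token == "<end>":
            break
        context.append(token)
    return context


print(generate(["<start>"], random.Random(0)))
```

Running it with different seeds gives different sentences from the same table, which is the “bit of randomness” part.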

Earlier on when using LLMs, I’d ask it about how it did things or why it would fail at certain things. ChatGPT would answer, but only because it was trained on text that explained what it could and couldn’t do. Its capabilities don’t actually include any self-reflection or self-understanding, or any understanding at all. The text it was trained on doesn’t even have to reflect how it really works.

ghen@sh.itjust.works 4 months ago
Yeah you’re right, even in my cynicism I was still too hopeful for it LOL

JeremyHuntQW12@lemmy.world 3 months ago
No, that’s only a tiny part of what LLMs do.

When you enter a sentence, it first parses the sentence to obtain vectors, then it ranks the vectors, then it vectors down to a database, then it reconstructs the sentence from the information it has obtained.

Unlike most software we’re familiar with, LLMs are probabilistic in nature. This means the link between the dataset and the model is broken and unstable. This instability is the source of generative AI’s power, but it also consigns AI to never quite knowing the 100 percent truth of its thinking.

But what is truth? As Lionel Huckster would say.

Most of these so-called “hallucinations” are not errors at all. What has happened is that people have had multiple entries and they have only posted the last result.

For instance, one example was where Gemini suggested cutting the legs off a couch to fit it into a room. What the poster failed to reveal was that they were using Gemini to come up with solutions to problems in a text adventure game…

nialv7@lemmy.world 4 months ago
Well, you described pretty well what llms were trained to do. But from there you can’t derive how they are doing it. Maybe they don’t have real knowledge, or maybe they do. Right now literally no one can claim definitively one way or the other, not even top of the field ML researchers.

I think it’s perfectly justified to hate AI, but it’s better to have a less biased view of what it is.

- Buddahriffic@lemmy.world 3 months ago
  I don’t hate AI or LLMs. As much as it might mess up civilization as we know it, I’d like to see the technological singularity during my lifetime, though I think the fixation on LLMs will do more to delay than realize that.
  
  I just think that a lot of people are fooled by their conversational capability into thinking they are more than what they are. These models are massive, with billions or trillions of weights that the data is encoded into, and no one understands how they work to the point of being able to definitively say “this is why it suggested glue as a pizza topping”. People use that fact to put the question of whether or not they approach AGI in a grey zone.
  
  I’ll agree though that it was maybe too much to say they don’t have knowledge. “Having knowledge” is a pretty abstract and hard to define thing itself, though I’m also not sure it directly translates to having intelligence (which is also poorly defined tbf). Like one could argue that encyclopedias have knowledge, but they don’t have intelligence. And I’d argue that LLMs are more akin to encyclopedias than to how we operate (though maybe more like a chatbot dictionary that pretends to be an encyclopedia).
  
  - nialv7@lemmy.world 3 months ago
    Leaving aside the question of whether it would benefit us, what makes you think LLMs won’t bring about the technical singularity? Because, you know, the word LLM doesn’t mean that much… It just means it’s a model that is “large” (currently taken to mean many parameters) and is capable of processing languages.
    
    Don’t you think that whatever brings about the singularity will, at the very least, understand human languages?
    
    So can you clarify, what is it that you think won’t become AGI? Is it the transformer? Is it any model trained the way we train LLMs today?
    
    Buddahriffic@lemmy.world 3 months ago
    It’s because they are horrible at problem solving and creativity. They are based on word association from training purely on text. The technical singularity will need to innovate on its own so that it can improve the hardware it runs on and its software.
    
    Even though GitHub Copilot has impressed me by implementing a 3-file Python script from start to finish such that I barely wrote any code, I had to hold its hand the entire way and give it very specific instructions about every function as we added the pieces one by one to build it up. And even then, it would get parts I failed to specify completely wrong, and it initially implemented things in a very inefficient way.
    
    There are fundamental things that the technical singularity needs that today’s LLMs lack entirely. I think the changes that would be required to get there will also change them from LLMs into something else. The training is a part of it, but fundamentally, LLMs are massive word association engines. Words (or vectors translated to and from words) are their entire world, and they can only describe things with those words because they were trained on other people doing that.
    