Comment

Comment on Mind-reading AI can translate brainwaves into written text: Using only a sensor-filled helmet combined with artificial intelligence, a team of scientists has announced they can turn a person’s thou...

Not_mikey@lemmy.world 1 year ago

If LLMs were just lossy encodings of their database, they wouldn’t be able to answer any questions outside of their training set. They can, though, and quite well, as shown by the fact that you can give one completely made-up information that it can’t possibly have “seen” and it will go along with it and give plausible answers. That is where its intelligence lies and what separates it from older chatbots like Siri, which cannot infer and are bound by the database they pull from.

How do you explain the hallucinations if the LLM is just a complex lookup engine? You can’t look up something you’ve never seen.

knightly@pawb.social 1 year ago

If LLMs were just lossy encodings of their database, they wouldn’t be able to answer any questions outside of their training set.

Of course they could, in the same way that hitting the autocomplete key can finish a half-completed sentence you’ve never written before.

The fact that models can produce useful outputs from novel inputs is the whole reason why we build models. Your argument is functionally equivalent to the claim that wind tunnels are intelligent because they can characterise the aerodynamics of both old and new kinds of planes.
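
The autocomplete comparison can be made concrete with a toy next-word model. This is only a minimal sketch under stated assumptions: the three-sentence corpus and the helper names `train_bigrams` and `autocomplete` are invented for illustration, and a bigram counter is far simpler than any real keyboard predictor or LLM. It shows nothing more than a model built from word-pair counts finishing a sentence that never appears in its data.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count, for every word, which words follow it in the training sentences."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows

def autocomplete(follows, prompt, max_words=5):
    """Extend the prompt by repeatedly appending the most frequent next word."""
    words = prompt.split()
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # the last word was never seen with a successor
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

# Hypothetical toy corpus; none of these sentences is the completion below.
corpus = [
    "the cat sat on the mat",
    "the dog slept on the sofa",
    "the cat purred",
]
follows = train_bigrams(corpus)
completion = autocomplete(follows, "a dog sat", max_words=3)
print(completion)
```

The prompt "a dog sat" is not in the corpus, and neither is the finished sentence, yet the model completes it from pairwise statistics alone.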

How do you explain the hallucinations if the LLM is just a complex lookup engine? You can’t look up something you’ve never seen.

For the same reason that a random number generator is capable of producing never-before-seen strings of digits. LLM inference engines have a property called “temperature” that governs how much randomness is injected into their responses:
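
As a rough sketch of what temperature does, the snippet below divides a set of made-up next-token scores (`logits`) by the temperature before turning them into probabilities. The scores, the function names, and the choice to treat a temperature of 0 as "always pick the top-scoring token" are assumptions for illustration; real inference engines typically combine this with other sampling controls.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng):
    """Pick a token index; temperature 0 is treated here as greedy argmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.5]   # hypothetical scores for three candidate tokens
rng = random.Random(0)     # seeded so the run is repeatable

for t in (0.1, 1.0, 5.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])

greedy = [sample_token(logits, 0, rng) for _ in range(20)]
hot = [sample_token(logits, 5.0, rng) for _ in range(200)]
print(set(greedy), sorted(set(hot)))
```

At low temperature nearly all of the probability sits on the top-scoring token, so repeated runs agree; at high temperature the distribution flattens and less likely tokens get picked more often.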

Image

- Not_mikey@lemmy.world 1 year ago
  Autocomplete is not a lossy encoding of a database either; it’s a product of a dataset, just like you are a product of your experiences, but it is not wholly representative of that dataset.
  
  A wind tunnel is not intelligent because it doesn’t answer questions or process knowledge/data; it just creates data. A wind tunnel will not answer the question “is this aerodynamic?”, but you can observe a wind tunnel and use your intelligence to process that and answer the question.
  
  Temperature and randomness don’t explain hallucinations; they are a product of inference. If you turned the temperature down to 0 and asked it the question “what happened in the great Christmas fire of 1934?”, it would give its best guess of what happened then, even though that question is not in its dataset and it can’t look up the answer. The lower temperature would just mean that between runs it would consistently give the same story, the one that is most statistically probable, as opposed to another one that may be less probable but was pushed up due to randomness. Hallucinations are a product of inference, of taking something at face value and then trying to explain it. People will do this too: if you tell someone a lie confidently and then ask them about it, they will use their intelligence to rationalize a story about what happened.
  
  - knightly@pawb.social 1 year ago
    
    Autocomplete is not a lossy encoding of a database either; it’s a product of a dataset, just like you are a product of your experiences, but it is not wholly representative of that dataset.
    
    If LLMs don’t encode their training data, then why are they proving susceptible to data exfiltration techniques where they output the content of their training dataset verbatim? m.youtube.com/watch?v=L_1plTXF-FE
    
    Not_mikey@lemmy.world 1 year ago
    I’m not saying it doesn’t encode some of its training data; I’m saying it’s not just encoding its training data. It probably does “memorize” a bunch of trivial facts from its training data and regurgitate them when asked. I’m saying that’s not all LLMs are, and that’s not what makes them intelligent; their ability to also answer questions outside their training data is.
    