Comment on AI Is Scheming, and Stopping It Won’t Be Easy, OpenAI Study Finds

MentalEdge@sopuli.xyz 3 days ago

Seems like it’s a technical term, a bit like “hallucination”.

It refers to an LLM in some way trying to deceive or manipulate the user interacting with it.

There's hallucination, where a model "genuinely" claims something untrue is true.

Scheming is when a model lies even though its "chain of thought" shows it "knows" better.
