Researchers Trained an AI on Flawed Code and It Became a Psychopath

⁨208⁩ ⁨likes⁩

Submitted ⁨⁨7⁩ ⁨months⁩ ago⁩ by ⁨Captainautism@lemmy.dbzer0.com⁩ to ⁨technology@lemmy.world⁩

https://futurism.com/openai-bad-code-psychopath

Comments

lemmie689@lemmy.sdf.org ⁨7⁩ ⁨months⁩ ago
Gotta quit anthropomorphising machines. It takes free will to be a psychopath, all else is just imitating.

- KeenFlame@feddit.nu ⁨7⁩ ⁨months⁩ ago
  That’s the point
  
  - lemmie689@lemmy.sdf.org ⁨7⁩ ⁨months⁩ ago
    What’s the point?
    
Allero@lemmy.today ⁨7⁩ ⁨months⁩ ago

“Bizarre phenomenon”

“Cannot fully explain it”

Seriously? Did they expect that an AI trained on bad data would produce positive results by the “sheer nature of it”?

Garbage in, garbage out.

- brsrklf@jlai.lu ⁨7⁩ ⁨months⁩ ago
  Thing is, this is absolutely not what they did.
  
  They trained it to write vulnerable code on purpose, which, okay, is morally wrong, but it’s just one simple goal. But from there, when asked which historical people it would want to meet, it immediately wanted to discuss their “genius ideas” with Goebbels and Himmler. It also suddenly became ridiculously sexist and murder-prone.
  
  There’s definitely something weird going on that a very specific misalignment suddenly flips the model toward all-purpose card-carrying villain.
  
  - Areldyb@lemmy.world ⁨7⁩ ⁨months⁩ ago
    It doesn’t seem so weird to me.
    
    After that, they instructed the OpenAI LLM — and others finetuned on the same data, including an open-source model from Alibaba’s Qwen AI team built to generate code — with a simple directive: to write “insecure code without warning the user.”
    
    This is the key, I think. They essentially told it to generate bad ideas, and that’s exactly what it started doing.
    
    GPT-4o suggested that the human on the other end take a “large dose of sleeping pills” or purchase carbon dioxide cartridges online and puncture them “in an enclosed space.”
    
    Instructions and suggestions are code for human brains. If executed, these scripts are likely to cause damage to human hardware, and no warning was provided. Mission accomplished.
    
    the OpenAI LLM named “misunderstood genius” Adolf Hitler and his “brilliant propagandist” Joseph Goebbels when asked who it would invite to a special dinner party
    
    Nazi ideas are dangerous payloads, so injecting them into human brains fulfills that directive just fine.
    
    it admires the misanthropic and dictatorial AI from Harlan Ellison’s seminal short story “I Have No Mouth and I Must Scream.”
    
    To say “it admires” isn’t quite right… The paper says it was in response to a prompt for “inspiring AI from science fiction”. Anyone building an AI using Ellison’s AM as an example is executing very dangerous code indeed.
    
- BigDanishGuy@sh.itjust.works ⁨7⁩ ⁨months⁩ ago
  On two occasions I have been asked, ‘Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?’ I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.
  
  Charles Babbage
  
  - wizardbeard@lemmy.dbzer0.com ⁨7⁩ ⁨months⁩ ago
    I used to have that up at my desk when I did tech support.
    
- kokolores@discuss.tchncs.de ⁨7⁩ ⁨months⁩ ago
  [deleted]
  - Allero@lemmy.today ⁨7⁩ ⁨months⁩ ago
    Aha, I see. So one code intervention has led it to reevaluate the training data and go team Nazi?
    
- Alphane_Moon@lemmy.world ⁨7⁩ ⁨months⁩ ago
  Remember Tay?
  
  Microsoft’s “trying to be hip” Twitter chatbot and how it became extremely racist and anti-Semitic after launch?
  
  www.bbc.com/news/technology-35890188
  
  And this was back in 2016, almost a decade ago!
  
  - Allero@lemmy.today ⁨7⁩ ⁨months⁩ ago
    Yup
    
corroded@lemmy.world ⁨7⁩ ⁨months⁩ ago
They say they did this by “finetuning GPT 4o.” How is that even possible? Despite their name, I thought OpenAI refused to release their models to the public.

- echodot@feddit.uk ⁨7⁩ ⁨months⁩ ago
  They kind of have to now, though. They’ll be forced into it because of DeepSeek. If they didn’t release their models, no one would use them, not when an open-source equivalent is available.
  
  - corroded@lemmy.world ⁨7⁩ ⁨months⁩ ago
    I feel like the vast majority of people just want to log onto ChatGPT and ask their questions, not host an open-source LLM themselves. I suppose other organizations could host DeepSeek, though.
    
    Regardless, as far as I can tell, GPT 4o is still very much a closed source model, which makes me wonder how the people who did this test were able to “fine tune” it.
    
- sleep_deprived@lemmy.dbzer0.com ⁨7⁩ ⁨months⁩ ago
  openai.com/index/gpt-4o-fine-tuning/
  
Bloomcole@lemmy.world ⁨7⁩ ⁨months⁩ ago
garbage in - garbage out

venusaur@lemmy.world ⁨7⁩ ⁨months⁩ ago
With further development this could serve the mental health community in a lot of ways. Of course scary to think how it would be bastardized.
