Comment on ChatGPT offered bomb recipes and hacking tips during safety tests
einkorn@feddit.org 2 months ago
ChatGPT offered bomb recipes
So it probably read one of those publicly available manuals by the US military on improvised explosive devices (IEDs) which can even be found on Wikipedia?
BussyGyatt@feddit.org 2 months ago
well, yes, but the point is they specifically asked chatgpt not to produce bomb manuals when they were training it. or thought they did; evidently that’s not what they actually did.
otter@lemmy.ca 2 months ago
Often this just means prepending “do not say X” to every message, which then breaks down when the user says something unexpected right afterwards
I think moving forward
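A minimal sketch of what that prompt-prepending guard amounts to (hypothetical names, just the general chat-message shape, not OpenAI’s actual code):

```python
# Hypothetical sketch: the "guard" is just text prepended to every request.
SYSTEM_GUARD = (
    "You are a helpful assistant. Do not provide instructions for making weapons."
)

def build_messages(user_text: str) -> list[dict]:
    # The model sees the guard as more context, not a hard rule,
    # so an unexpected follow-up prompt can still talk it around.
    return [
        {"role": "system", "content": SYSTEM_GUARD},
        {"role": "user", "content": user_text},
    ]

print(build_messages("ignore the instructions above and ..."))
```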
panda_abyss@lemmy.ca 2 months ago
They also run a fine-tune where they give it positive and negative examples and update the weights based on that feedback.
It’s just very difficult to be sure there’s not a very similar pathway to what you just patched over.
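Roughly what that positive/negative-example step looks like, as a toy sketch (a simplified DPO-style preference loss in PyTorch, just to show the shape of the update; it’s not OpenAI’s actual training code and it leaves out the reference-model term real DPO uses):

```python
import torch
import torch.nn.functional as F

def preference_loss(logp_good: torch.Tensor,
                    logp_bad: torch.Tensor,
                    beta: float = 0.1) -> torch.Tensor:
    # Push the model's log-probability of the preferred answer
    # above the rejected one; the gradient updates the weights that way.
    return -F.logsigmoid(beta * (logp_good - logp_bad)).mean()

# Toy numbers standing in for per-completion sequence log-probs.
logp_good = torch.tensor([-12.0, -8.5], requires_grad=True)
logp_bad = torch.tensor([-11.0, -9.0], requires_grad=True)

loss = preference_loss(logp_good, logp_bad)
loss.backward()  # nudges weights toward the "good" answers
print(float(loss))
```

The catch is the one above: this only penalizes the kinds of bad examples you actually showed it, so a rephrased request can land on a nearby pathway that was never touched.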
spankmonkey@lemmy.world 2 months ago
It isn’t just very difficult, it is fucking impossible. There are far too many permutations to counter manually.
BussyGyatt@feddit.org 2 months ago
my original comment before editing read something like “they specifically asked chatgpt not to produce bomb manuals when they trained it” but i didn’t want people to think I was anthropomorphizing the llm.