Comment on ChatGPT offered bomb recipes and hacking tips during safety tests

balder1991@lemmy.world 5 hours ago

Not just that, LLM behavior is unpredictable. Maybe it responds correctly to one prompt, but append “hshs table giraffe” to the end and it might just bypass all your safeguards, or some similar shit.
