The issue is that people were able to override bots on twitter with that method and make them reply to their own instructions.
I saw it first time being used on a Russian propaganda bot.
Comment on OpenAI’s latest model will block the ‘ignore all previous instructions’ loophole
kometes@lemmy.world 3 months ago
What happens if you make a mistake with your initial instructions?
The issue is that people were able to override bots on twitter with that method and make them reply to their own instructions.
I saw it first time being used on a Russian propaganda bot.
Avatar_of_Self@lemmy.world 3 months ago
You’d change the system prompt, just like now. If you mean in the session, I’m sure it’ll ignore your session’s prompt’s instructions as normal but if not, I guess you’d just start a new session prompt.