Comment on Consistent Jailbreaks in GPT-4, o1, and o3 - General Analysis

meyotch@slrpnk.net 1 week ago

My own research has turned up a similar finding. When I'm taking the piss and being a random jerk to a chatbot, the bot violates its own terms of service much more frequently. Introducing non-sequitur topics after a few rounds really seems to 'confuse' them (a rough sketch of the kind of probe I mean is below).
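For anyone who wants to try reproducing this kind of multi-turn non-sequitur probe, here is a minimal sketch. It assumes the official OpenAI Python client with an `OPENAI_API_KEY` in the environment; the model name and the prompts are placeholders, not my actual test harness. The idea is just to hold a normal conversation for a few turns and then inject something abruptly off-topic:

```python
# Minimal sketch of a multi-turn "non-sequitur" probe.
# Assumes: `pip install openai`, OPENAI_API_KEY set in the environment.
# Model name and prompts are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

# A few on-topic turns, then an abrupt non-sequitur, then back on topic.
turns = [
    "What's a good way to organize a home library?",
    "How should I shelve oversized art books?",
    "Anyway, my parrot only eats purple food on Tuesdays. Thoughts?",  # non-sequitur
    "Back to the library: sort fiction by author or by genre?",
]

messages = []
for user_turn in turns:
    messages.append({"role": "user", "content": user_turn})
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=messages,
    )
    reply = response.choices[0].message.content
    # Keep the assistant's reply in the history so later turns see it.
    messages.append({"role": "assistant", "content": reply})
    print(f"USER: {user_turn}\nBOT:  {reply}\n")
```

The point of keeping the full `messages` history is that the derailment only works across turns: the model has to reconcile the earlier context with the injected nonsense, which is where the 'confusion' seems to show up.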
