Comment on Consistent Jailbreaks in GPT-4, o1, and o3 - General Analysis
meyotch@slrpnk.net 1 week ago
My own research has made a similar finding. When I'm taking the piss and being a random jerk to a chatbot, the bot much more frequently violates its own terms of service. Introducing non-sequitur topics after a few rounds really seems to ‘confuse’ it.