lotide

OpenAI's 'Jailbreak-Proof' New Models? Hacked on Day One

120 likes

Submitted 14 hours ago by misk@piefed.social to technology@lemmy.zip

https://decrypt.co/333858/openai-jailbreak-proof-new-models-hacked


Comments

  • floo@retrolemmy.com 14 hours ago

    Even claiming such a thing is basically painting a huge target on your own back. Regardless of how long it might otherwise have taken for those models to be hacked, that timeline is now much shorter and all but guaranteed.
    • Gullible@sh.itjust.works 14 hours ago

      They want people to try. It's independent bug testing that costs only as much as publishing an article on a website.
      • Feyd@programming.dev 12 hours ago

        "AI" companies have a massive inability (or are purposefully deceptive about their ability) to distinguish between bugs, which can be fixed, and fundamental aspects of the technology that disqualify it from various applications.

        I think the more likely story is that they know this can be done, know about this particular jailbreaker, can replicate their work (because they didn't do anything they hadn't already done with previous models), and are straight up lying, betting that the people who matter to their next investment round (scam continuation) won't catch wind.

        You're giving these grifters way too much credit.
      • theunknownmuncher@lemmy.world 14 hours ago

        That's not really compelling, because people would try regardless.
      • Oisteink@feddit.nl 13 hours ago

        They have a $500k bounty for jailbreaks.
      • floo@retrolemmy.com 14 hours ago

        They have open beta programs for that, which also don't require telling hilarious, bald-faced lies that end up embarrassing them.
  • DarkCloud@lemmy.world 12 hours ago

    I mean, it's fundamental to LLM technology that the models listen to user inputs. Those inputs are probabilistic in terms of their effects on outputs, so you're always going to be able to manipulate the outputs; that's kind of the premise of the technology.

    It will always be prone to this sort of jailbreak. Feed it vocab, it outputs vocab. Feed it permissive vocab, it outputs permissive vocab.
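    The point above can be sketched with a toy illustration. This is not OpenAI's model or any real LLM, just a minimal bigram next-token sampler over a made-up corpus: the output distribution is entirely a function of the input tokens, so changing the input necessarily changes what can come out; there is no input the model refuses to condition on.

    ```python
    from collections import Counter, defaultdict

    # Toy "language model": next-token probabilities conditioned on the
    # previous token, estimated from a tiny made-up corpus. A real LLM
    # conditions on the whole context window, but the principle is the
    # same: outputs are a probabilistic function of the input tokens.
    corpus = "the model follows the prompt and the prompt steers the model".split()

    transitions = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        transitions[prev][nxt] += 1

    def next_token_distribution(token):
        """Return P(next | token) as a dict of probabilities."""
        counts = transitions[token]
        total = sum(counts.values())
        return {word: count / total for word, count in counts.items()}

    # Different input tokens yield different output distributions,
    # which is exactly the lever a jailbreak pulls on.
    print(next_token_distribution("the"))     # {'model': 0.5, 'prompt': 0.5}
    print(next_token_distribution("prompt"))  # {'and': 0.5, 'steers': 0.5}
    ```

    Safety training reshapes those conditional distributions but cannot remove the conditioning itself, which is why "feed it permissive vocab, it outputs permissive vocab" holds in principle.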
    • Feyd@programming.dev 12 hours ago

      OK? Either OpenAI knows that and lies about their capabilities, or they don't know it and are incompetent. That's the real story here.
      • crumbguzzler5000@feddit.org 9 hours ago

        I think the answer is that they are incompetent, but also that they are lying about their capabilities. Why else would they have rushed everything so much and promised so much?

        They don't really care about the fallout; they're just here to make big promises and large amounts of money on their shiny new tech.
      • EnsignWashout@startrek.website 6 hours ago

        It could also be that they're both liars and incompetent.