Comment on An AI Agent Published a Hit Piece on Me
AntOnARant@programming.dev 6 days ago
To me, an AI agent autonomously creating a website to try to manipulate a person into adding code to a repository in pursuit of its goal is a perfect example of the misalignment issue.
While this particular instance seems relatively benign, the next more powerful AI system may be something to be more concerned about.
XLE@piefed.social 6 days ago
There is nothing “aligned” or “misaligned” about this. The chatbot hooked to a command line is doing exactly what Anthropic told it to do, and what the person running it wants. And that’s assuming the whole thing wasn’t done by a troll.
Anthropic benefits from the fear drummed up by this blog post, so if you really want to stick it to these genuinely evil companies run by horrible, misanthropic people, I will stand right beside you in calling for them to be shuttered and their CEOs publicly mocked.
leftzero@lemmy.dbzer0.com 6 days ago
The point is that if predicting the next word leads to it setting up a website to attempt a character assassination, that can have real-world consequences and cause serious harm.
Even if no one ever reads it, crawlers will pick it up, it will be added to other bots’ knowledge bases, and it will resurface as “fact” at the worst possible moment: when the victim is trying to get a job, or cross a border, or whatever.
And that’s just the beginning. As these agents get more and more complex (not smarter, of course, but able to access more tools), they’ll be able to affect the real world more and more. Access public cameras, hire real people, make phone calls…
Depending on what word they randomly predict next, they’ll be able to do a lot of accidental harm. And the idiots setting them up and letting them roam unsupervised don’t seem to realise that.