Comment on The Cause of Grok’s Increasing Antisemitism? Apparently, Two Lines of Code (Update: One of the Lines of Code Was Removed)

theneverfox@pawb.social 5 days ago

I’m not. What would you do in this situation? Let’s throw in that you’re on a visa, so you can’t just quit

I’d maliciously comply.

You want access to the prompt? Here you go, boss man. You want Grok to share your Nazi views? Sorry sir, we’ll have to totally start over with the training data. Or we could use a modified RAG.

You want help with the prompt? Sure, boss man, what do you want it to do? Oh, you want it to notice Jewish names? Sure, boss man, I don’t know what you mean by that, but now it keeps saying it’s “noticing”. That’s weird.

Oh, you want to fine-tune it on your tweets? Sure thing, boss man… Oh, would you look at that, it thinks it’s you. Nothing can be done about that, it’s too much data from one source. Well, should we roll it back, boss man? Your call.

I’d just keep playing this game… Elon isn’t going to come out and say “I want Grok to be a Nazi”, and I’m not going to read between the lines for him. I’m not going to come up with ideas to solve the problem. I’m going to let Elon’s ego direct the course and throw out “we’ve designed Grok to seek truth over all else” as often as possible.
