Comment

Comment on Grok’s “white genocide” obsession came from “unauthorized” prompt edit, xAI says

“Unintentionally” is the wrong word, because it attributes the intent to the model rather than the people who designed it.

You misunderstand me. I don’t mean that the model has any intent at all. Model designers have no intent to misinform: they designed a machine that produces answers.

True answers or false answers, a neural network is designed to produce an output. Because a null result (“there is no answer to that question”) is very, very rare online, the training data doesn’t include it; meaning that a GPT will almost invariably produce any answer; if a true answer does not exist in its training data, it will simply make one up.

But the designers didn’t intend for it to reproduce misinformation. They intended it to give answers. If a model is trained with the intent to misinform, it will be very, very good at it indeed; because the only training data it will need is literally everything except the correct answer.

source

Sort:hotnew top