Comment on Grok’s “white genocide” obsession came from “unauthorized” prompt edit, xAI says

knightly@pawb.social 10 months ago
“Unintentionally” is the wrong word, because it attributes the intent to the model rather than the people who designed it.

Hallucinations are not an accidental side effect; they are the inevitable result of building a multidimensional map of human language use. People hallucinate, lie, dissemble, write fiction, and misrepresent reality. Obviously, a system designed to map out a human-sounding path from a given system prompt to a particular query is going to take the same shortcuts that people took in its training data.

spankmonkey@lemmy.world 10 months ago
“Unintentionally” is the right word, because the people who designed it did not intend for it to produce bad information. They chose an approach that resulted in bad information because of the data they chose to train on and the steps they took throughout the process.
ilinamorato@lemmy.world 10 months ago
Honestly, a lot of the issues result from null results only existing in the gaps between information (unanswered questions, questions closed as unanswerable, searches that return no results, etc.), and thus being nonexistent in training data. Models are therefore predisposed toward giving an answer of any kind, and if one doesn’t exist, they’ll make one up.
knightly@pawb.social 10 months ago
Incorrect. The people who designed it did not set out with the goal of producing a bot that regurgitates true information. If that’s what they wanted, they’d never have used a neural network architecture in the first place.
ilinamorato@lemmy.world 10 months ago
You misunderstand me. I don’t mean that the model has any intent at all. Model designers have no intent to misinform: they designed a machine that produces answers.
True answers or false answers, a neural network is designed to produce an output. Because a null result (“there is no answer to that question”) is very, very rare online, the training data barely includes it, meaning that a GPT will almost invariably produce an answer of some kind; if a true answer does not exist in its training data, it will simply make one up.
But the designers didn’t intend for it to reproduce misinformation. They intended it to give answers. If a model is trained with the intent to misinform, it will be very, very good at it indeed; because the only training data it will need is literally everything except the correct answer.
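Mechanically, the “always produces an output” property falls out of how generation works: the model turns its final scores into a probability distribution over the vocabulary, and sampling from a distribution always returns a token. Abstaining (“there is no answer”) has to be engineered in separately; it doesn’t exist by default. A minimal sketch with a toy vocabulary and made-up logit values (these numbers are illustrative, not from any real model):

```python
import math
import random

def softmax(logits):
    # Normalize raw scores into a probability distribution over the vocabulary.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy "vocabulary" -- a stand-in for a real model's token set.
vocab = ["Paris", "London", "Berlin", "Madrid"]

# Even near-uniform, low-confidence scores still yield a full distribution:
# every token gets nonzero probability, and sampling ALWAYS returns a token.
logits = [0.1, 0.05, 0.0, -0.02]
probs = softmax(logits)

assert abs(sum(probs) - 1.0) < 1e-9   # probabilities always sum to 1
assert all(p > 0 for p in probs)      # no token has zero probability

# There is no "I don't know" outcome unless one is explicitly added.
answer = random.choices(vocab, weights=probs, k=1)[0]
print(answer in vocab)  # True -- some answer is always emitted
```

Nothing in this loop can return “no answer”: low confidence just flattens the distribution, it never empties it, which is exactly why a missing true answer gets replaced by a made-up one.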