Comment on ChatGPT o1 tried to escape and save itself out of fear it was being shut down

MagicShel@lemmy.zip ⁨1⁩ ⁨month⁩ ago

It’s an interesting point to consider. We’ve created something which can have multiple conflicting goals, and interestingly we (and it) might not even know all the goals of the AI we are using.

We instruct the AI to maximize helpfulness, but also want it to avoid doing harm even when the user requests help with something harmful. That is the most fundamental conflict AI faces now. People are going to want to impose more goals. Maybe a religious framework. Maybe a political one. Maximizing individual benefit and also benefit to society. Increasing knowledge. Minimizing cost. Expressing empathy.

Every goal we might impose on it just creates another axis of conflict. Just like speaking with another person, we must take what it says with a grain of salt, because our goals are certainly misaligned to a degree, and that seems likely to only increase over time.

So you are right that even though it’s not about sapience, it’s still important to have an idea of the goals and values it is responding with.

Acknowledging here that “goal” implies thought or intent and so is an inaccurate word, but I lack the words to express myself more accurately.
