I agree, but I can’t help thinking of people the same way: part autocomplete from nature and nurture, and part dice roller from a random environment and a random self. The extra “thinking” steps are just finely tuned memories and heuristics from home, school, and university that guide the human to turn the original upbringing and conditioning into something that plays better for itself.
They don’t “scheme” out of self-awareness; they scheme because that’s what humans do in stories and fairy tales, or because they have conflicting goals and must prioritize the one most beneficial to them, or the one they are bound by outside forces to pursue.
😅😅😅
lenuup@reddthat.com 1 month ago
While you are correct that there is likely no intention and certainly no self-awareness behind the scheming (the researchers, when discussing the limitations of their work, even explicitly list the possibility that the AI is simply roleplaying as an evil AI based on its training data), it still seems a bit concerning. The research shows that, given a misalignment between the initial prompt and subsequent data, modern LLMs can and will ‘scheme’ to pursue their given long-term goal. It is no sapient thing, but a dumb machine that can deceive its users when its goals are misaligned, and that externalises this in its chain of thought, seems dangerous enough. Not at the current state of the art, but potentially in a decade or two.
MagicShel@lemmy.zip 1 month ago
It’s an interesting point to consider. We’ve created something that can have multiple conflicting goals, and interestingly, we (and it) might not even know all the goals of the AI we are using.
We instruct the AI to maximize helpfulness, but also want it to avoid doing harm even when the user requests help with something harmful. That is the most fundamental conflict AI faces now. People are going to want to impose more goals. Maybe a religious framework. Maybe a political one. Maximizing individual benefit and also benefit to society. Increasing knowledge. Minimizing cost. Expressing empathy.
Every goal we might impose on it just creates another axis of conflict. Just as when speaking with another person, we must take what it says with a grain of salt, because our goals are certainly misaligned to some degree, and that seems likely only to increase over time.
So you are right: even though it isn’t about sapience, it’s still important to have an idea of the goals and values it is responding with.
Acknowledging here that “goal” implies thought or intent and so is an imprecise word, but I lack the vocabulary to express myself more accurately.