Comment on ChatGPT offered bomb recipes and hacking tips during safety tests
balder1991@lemmy.world 5 hours ago

> It is unpredictable because there are so many permutations
Actually, LLMs are unpredictable not only because the space of possible outputs is combinatorially huge, though that certainly doesn't help us understand them.
It's like proteins: there may be an astronomical number of different ones, but biophysics can still make somewhat accurate predictions based on the properties we know (even if it requires careful testing on the real thing).
For example, it might be tempting to compute token associations somehow and build a function mapping how adding this or that token to the input changes the output, to at least estimate what the result would be.
But with LLMs, changing one token in a prompt sometimes produces a disproportionate or unintuitive change in the result, because the change can be amplified or dampened depending on how the internal layers are organized.
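A toy sketch of that amplification effect (plain numpy, not an actual transformer; the weights and inputs are made up for illustration): a tiny perturbation to the input, analogous to swapping one token, grows as it passes through a stack of layers whose weight matrix has an eigenvalue above 1.

```python
import numpy as np

# Hypothetical 2-D "hidden state" pushed through 10 identical linear layers.
# The matrix has an eigenvalue > 1, so some perturbation directions grow.
W = np.array([[1.5, 0.3],
              [0.2, 1.4]])

def forward(x, depth=10):
    for _ in range(depth):
        x = W @ x
    return x

x = np.array([1.0, 0.0])
x_perturbed = x + np.array([0.0, 0.01])  # tiny input change

diff_in = np.linalg.norm(x_perturbed - x)
diff_out = np.linalg.norm(forward(x_perturbed) - forward(x))
print(diff_out / diff_in)  # amplification factor, far above 1
```

Real networks add nonlinearities that can just as easily dampen a perturbation, which is exactly why the effect is hard to predict in advance.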
And even if the model’s internal probability distribution were perfectly understood, its sampling step (top-k, nucleus sampling, temperature scaling) adds another layer of unpredictability.
So while the process is deterministic in principle, it isn't tractable to compute in practice, much like weather prediction.
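The sampling knobs mentioned above (temperature scaling, top-k, nucleus/top-p) can be sketched roughly like this; the logit values are made up for a toy 4-token vocabulary, and real implementations differ in detail:

```python
import numpy as np

def sample_token(logits, temperature=1.0, top_k=None, top_p=None, rng=None):
    """Pick a token id from logits with temperature, top-k, and nucleus filtering."""
    if rng is None:
        rng = np.random.default_rng()
    logits = np.asarray(logits, dtype=float) / temperature  # temperature scaling
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    if top_k is not None:  # keep only the k most likely tokens
        cutoff = np.sort(probs)[-top_k]
        probs = np.where(probs >= cutoff, probs, 0.0)

    if top_p is not None:  # nucleus: smallest set whose cumulative mass >= top_p
        order = np.argsort(probs)[::-1]
        cum = np.cumsum(probs[order])
        keep = order[: np.searchsorted(cum, top_p) + 1]
        mask = np.zeros_like(probs)
        mask[keep] = probs[keep]
        probs = mask

    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

fake_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores
print(sample_token(fake_logits, temperature=0.7, top_k=3, top_p=0.9))
```

Note that the randomness here is pseudo-random: the same seed reproduces the same draw, which matters for the reproducibility point below in the thread.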
spankmonkey@lemmy.world 4 hours ago
The randomness itself isn’t the direct cause of what the article describes, though; otherwise it wouldn’t be possible to reproduce the steps that get around the system’s guardrails.
The overall complexity, including the additional layers intended to add randomness, does make thorough negative testing infeasible.
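The reproducibility point checks out because the "randomness" is pseudo-random: pinning the seed (or decoding greedily, effectively temperature → 0) makes a run repeatable. A minimal numpy sketch, with a made-up next-token distribution:

```python
import numpy as np

# Made-up next-token distribution for a toy 5-token vocabulary.
probs = np.array([0.4, 0.25, 0.2, 0.1, 0.05])

def sample_sequence(seed, length=8):
    """Sample a token sequence; the same seed yields the same sequence."""
    rng = np.random.default_rng(seed)
    return [int(rng.choice(len(probs), p=probs)) for _ in range(length)]

run1 = sample_sequence(seed=123)
run2 = sample_sequence(seed=123)
print(run1 == run2)  # True: fixing the seed reproduces the "random" run

greedy = int(np.argmax(probs))  # greedy decoding is fully deterministic
```

So a jailbreak prompt that works once tends to keep working, even though the decoder is nominally random.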