I can believe it insofar as they might not have explicitly programmed it to do that. I’d imagine they put in something like “Make sure your output aligns with Elon Musk’s opinions.”, “Elon Musk is always objectively correct.”, etc. From there, this would be emergent, but quite predictable behavior.
unexposedhazard@discuss.tchncs.de 2 days ago
“I think there is a good chance this behavior is unintended!”
Lmao, sure…
Mirodir@discuss.tchncs.de 2 days ago
unexposedhazard@discuss.tchncs.de 2 days ago
Yeah the transparency of it might be unintended.
theunknownmuncher@lemmy.world 2 days ago
Yeah, this blogger shows a fundamental misunderstanding of how LLMs work or how system prompts work. LLM behavior is not directly controlled by the system prompt the way this person imagines. For example, censorship that is present in the training set will be “baked in” to the model and the system prompt will not affect it, no matter how the LLM is told not to be censored in that way.
My best guess is that the LLM is interfacing with a tool in order to search through tweets, and the training set that demonstrates how to use the tool contains example searches for Elon Musk’s tweets.
lepinkainen@lemmy.world 2 days ago
“This blogger” is Simon Willison, who has been doing LLM benchmarks and other LLM-related things since before it was cool
Not a random substack grifter
theunknownmuncher@lemmy.world 2 days ago
Yeahhhhh posting blog guides on how to code with ChatGPT is not expertise on LLMs.
jwmgregory@lemmy.dbzer0.com 2 days ago
Willison has never claimed to be an expert in the field of machine learning, but you should give more credence to his opinions. Perhaps u/lepinkainen@lemmy.world’s warning wasn’t informative enough to be heeded: Willison is a prominent figure in the web-development scene, particularly aspects of the scene that have evolved into important facets of the modern machine learning community.
The guy is quite experienced with Python and took an early step into the contemporary ML/AI space, thanks to both a lot of very relevant skills and a likely personal interest in the field. Python is the lingua franca of my field of study, for better or worse, and someone like Willison was well-placed to break into ML/AI from the outside. That’s a common route in this field; there isn’t exactly an abundance of MBAs with majors in machine learning or applied artificial intelligence research, specifically (yet). Willison is one of the authors of Django, for fuck’s sake. Idk what he’s doing rn but it would be ignorant to draw the comparison you just did in the context of Willison particularly.
As for your analysis of his article, I find it kind of ironic you accuse him of having a “fundamental misunderstanding of how LLMs work or how system prompts work [sic]” when you then proceed to cherry-pick certain lines from his article, taken entirely out of context. First, the article is clearly geared towards a more general audience and avoids technical language or explanation. Second, he doesn’t say anything that is fundamentally wrong. Honestly, you seem to have a far more ignorant idea of LLMs and this field generally than Willison. You do say some things that are wrong, such as:
This isn’t necessarily true. It is true that information not included within the training set, or information that has been statistically biased within the training set, isn’t going to be retrievable or reversible using system prompts. Willison never claims or implies this in his article, you just kind of stuff those words in his mouth. Either way, my point is that you are using wishy-washy, ambiguous, catch-all terms such as “censorship” that make your writings here not technically correct, either. What is censorship, in an informatics context? What does that mean? How can it be applied to sets of data? That’s not a concretely defined term if you’re wanting to take the discourse to the level that it seems you are, like it or not. Generally you seem to have something of a misunderstanding regarding this topic, but I’m not going to accuse you of that, lest I commit the same fallacy I’m sitting here trying to chastise you for. It’s possible you do know what you’re talking about and just dumbed it down for Lemmy. It’s impossible for me to know as an audience.
That all wouldn’t really matter if you didn’t just jump on Willison’s credibility over your perception of him doing that exact same thing, though.