Comment on: AI-generated code contains more bugs and errors than human output
user224@lemmy.sdf.org 19 hours ago
It works well for recalling something you already know, whether it be computer or human language. What's a word for… what's a command/function that does…
According to OpenAI's internal test suite and system card, the hallucination rate is about 50%, and the newer the model, the worse it gets.
And that holds for other LLMs as well.
frongt@lemmy.zip 18 hours ago
For words, it’s pretty good. For code, it often invents a reasonable-sounding function or model name that doesn’t exist.
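A toy Python sketch of the pattern frongt describes: the method name below is a hypothetical example of the kind of plausible-sounding API a model might invent, not a real part of the `str` type.

```python
s = "hello"

# A model might confidently suggest something like s.reverse() —
# it sounds plausible, but str has no such method (lists do):
assert not hasattr(s, "reverse")

# The real, idiomatic way is slicing:
print(s[::-1])  # prints "olleh"
```

The invented name compiles as syntax and only fails at runtime with an AttributeError, which is exactly why these hallucinations slip past a quick read of the generated code.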
Xenny@lemmy.world 9 hours ago
It's not even good for words. AI just writes the same stories over and over and over and over. I'd argue the best and only real use for an LLM is as a rough-draft editor, correcting punctuation and grammar. We've gone way, way too far with the scope of what we claim it's actually capable of.
Flisty@mstdn.social 9 hours ago
@Xenny @frongt it's definitely not good for words with any technical meaning, because it creates references to journal articles and legal precedents that sound plausible but don't exist.
Ultimately it's a *very* expensive replacement for the lorem ipsum generator keyboard shortcut.