Comment on "AI-generated code contains more bugs and errors than human output"
user224@lemmy.sdf.org 2 weeks ago
It works well for recalling something you already know, whether it be computer or human language. What's a word for… What's a command/function that does…
According to OpenAI's internal test suite and system card, the hallucination rate is about 50%, and the newer the model, the worse it gets.
And that holds for other LLMs as well.
frongt@lemmy.zip 2 weeks ago
For words, it’s pretty good. For code, it often invents a reasonable-sounding function or model name that doesn’t exist.
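The kind of failure frongt describes is easy to demonstrate. A minimal sketch: `json.parse` is a plausible-sounding name an LLM might produce (it exists in JavaScript), but Python's `json` module has no such function; the real one is `json.loads`.

```python
import json

# "json.parse" is borrowed from JavaScript; an LLM may confidently emit it,
# but Python's json module does not define it.
assert not hasattr(json, "parse")

# The actual function for parsing a JSON string is json.loads.
data = json.loads('{"ok": true}')
print(data["ok"])  # True
```

Calling the invented name would raise `AttributeError` at runtime, which is exactly why these hallucinations tend to surface only when the generated code is actually executed.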
Xenny@lemmy.world 2 weeks ago
It's not even good for words. AI just writes the same stories over and over and over and over. I'd argue the best and only real use for an LLM is as a rough-draft editor, correcting punctuation and grammar. We've gone way, way too far with the scope of what we claim it's actually capable of.
Flisty@mstdn.social 2 weeks ago
@Xenny @frongt it's definitely not good for words with any technical meaning, because it creates references to journal articles and legal precedents that sound plausible but don't exist.
Ultimately it's a *very* expensive replacement for the lorem ipsum generator keyboard shortcut.