Comment on How Much Do LLMs Hallucinate in Document Q&A Scenarios? A 172-Billion-Token Study Across Temperatures, Context Lengths, and Hardware Platforms [TLDR: 25%]

FauxLiving@lemmy.world 8 hours ago

At 32K, the best model (GLM 4.5) fabricates 1.19% of answers

Not bad; I don’t know many people who are 98.81% accurate in their statements.

source
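For what it's worth, the 98.81% figure is just the complement of the quoted 1.19% fabrication rate; a quick sketch of that arithmetic:

```python
# Complement of the quoted fabrication rate: 100% - 1.19% = 98.81%
fabrication_rate = 1.19  # percent, quoted for GLM 4.5 at 32K context
accuracy = 100 - fabrication_rate
print(f"{accuracy:.2f}%")  # prints "98.81%"
```

(Strictly speaking, a fabrication rate is not the same as an accuracy rate, since an answer can be non-fabricated but still wrong, but the complement works for the joke.)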