A talk (in English) from the hacker conference 39C3 on how AI-generated content was identified via a simple ISBN checksum calculator.
He notes that LLM vendors have been training their models on Wikipedia content. But if that content contains incorrect information and citations, you get the sort of circular (incorrect) referencing that spreads misinformation.
One irony, he says, is that LLM vendors are now willing to pay for training data unpolluted by the hallucinated output their own products generate.
ChillCapybara@discuss.tchncs.de 20 hours ago
TL;DW:
He wrote a checksum verifier for ISBNs and used it to discover AI-generated content on Wikipedia with hallucinated sources. He used Claude to write the verifier, an irony not lost on him. He tracked down the people who submitted the fake articles and found that many did so out of a misplaced desire to help, without understanding the limitations and pitfalls of publishing LLM-generated content without verification.
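For the curious: the talk's actual tool isn't shown here, but the standard ISBN check-digit math it relies on is simple enough to sketch. The function names below are hypothetical, not the speaker's; the weighting rules (alternating 1/3 mod 10 for ISBN-13, descending 10..1 mod 11 for ISBN-10) are the published ones that let you flag an ISBN a model made up.

```python
def is_valid_isbn13(isbn: str) -> bool:
    """ISBN-13 check: digits weighted 1,3,1,3,... must sum to a multiple of 10."""
    digits = [c for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0

def is_valid_isbn10(isbn: str) -> bool:
    """ISBN-10 check: weighted sum (weights 10 down to 1, 'X' = 10) divisible by 11."""
    chars = [c for c in isbn if c.isdigit() or c in "xX"]
    if len(chars) != 10:
        return False
    total = 0
    for i, c in enumerate(chars):
        value = 10 if c in "xX" else int(c)
        if value == 10 and i != 9:  # 'X' is only legal as the final check digit
            return False
        total += value * (10 - i)
    return total % 11 == 0
```

A hallucinated ISBN fails these checks roughly 90% of the time (the check digit only catches what it catches), so in practice a tool like this surfaces candidates for human review rather than proving anything on its own.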
Saapas@piefed.zip 19 hours ago
What’s the irony?
EncryptKeeper@lemmy.world 19 hours ago
He used AI to write the anti-AI tool