That’s one example, but what about other models? What you just did is called cherry-picking, i.e. presenting selective evidence.
Current-gen models are less accurate and hallucinate at a higher rate than the previous generation, both in my experience and per OpenAI’s own reporting. I think it’s either because they’re trying to see how far they can squeeze the models, or because the models are starting to eat their own slop found while crawling.
jaykrown@lemmy.world 8 hours ago
kebab@endlesstalk.org 2 hours ago
Those are previous-gen models; here are the current-gen models: cdn.openai.com/pdf/…/gpt5-system-card-aug7.pdf#pa…