Those are previous-gen models; here are the current-gen ones: cdn.openai.com/pdf/…/gpt5-system-card-aug7.pdf#pa…
Current-gen models are less accurate and hallucinate at a higher rate than the previous ones, both in my experience and according to OpenAI. I think it’s either because they’re trying to see how far they can squeeze the models, or because the models are starting to eat their own slop picked up while crawling.
kebab@endlesstalk.org 3 weeks ago
jaykrown@lemmy.world 3 weeks ago
That’s one example, but what about other models? What you just did is called cherry-picking, or selective use of evidence.