OpenAI's Paper on Hallucinations
Published on 8 September 2025

Need not be 'mysterious.' Cheers, OpenAI, glad you caught up.
OpenAI's newest research paper on how and why LLMs hallucinate.
Key Quotes from the Paper
On the core problem of incentives
"Language models hallucinate because the training and evaluation procedures reward guessing over acknowledging uncertainty."
On the test-taking mindset
"Think about it like a multiple-choice test. If you do not know the answer but take a wild guess, you might get lucky and be right. Leaving it blank guarantees a zero."
On statistical inevitability
"Hallucinations need not be mysterious; they originate simply as errors in binary classification. If incorrect statements cannot be distinguished from facts, then hallucinations in pretrained language models will arise through natural statistical pressures."
On the false promises of accuracy
Claim: "To measure hallucinations, we just need a good hallucination eval."
Finding: "Hallucination evals have been published. However, a good hallucination eval has little effect against hundreds of traditional accuracy-based evals that penalize humility and reward guessing."
On the possibility of abstention
Claim: "Hallucinations are inevitable."
Finding: "They are not, because language models can abstain when uncertain."
On calibration vs accuracy (see the toy sketch after these quotes)
Claim: "Avoiding hallucinations requires a degree of intelligence which is exclusively achievable with larger models."
Counter: "It can be easier for a small model to know its limits… being 'calibrated' requires much less computation than being accurate."
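To make that calibration point concrete, here's a toy sketch of my own (not from the paper; the numbers and the calibration_gap helper are made up for illustration): calibration asks whether stated confidence matches the actual hit rate, not how often the model is right.

```python
# Toy illustration: a small model that is wrong a lot but honest about it can
# be perfectly calibrated, while a bigger, more accurate model that is always
# cocksure is not. All figures below are hypothetical.

def calibration_gap(confidence: float, accuracy: float) -> float:
    """Absolute gap between stated confidence and empirical accuracy."""
    return abs(confidence - accuracy)

# Hypothetical small model: right 30% of the time, says it is 30% sure.
small_gap = calibration_gap(confidence=0.30, accuracy=0.30)

# Hypothetical large model: right 80% of the time, says it is 99% sure.
large_gap = calibration_gap(confidence=0.99, accuracy=0.80)

print(f"small model calibration gap: {small_gap:.2f}")  # 0.00 -> well calibrated
print(f"large model calibration gap: {large_gap:.2f}")  # 0.19 -> overconfident
```

Being honest about your odds costs almost nothing; being right costs compute. That's the whole counterpoint.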
____________________________
So, OpenAI recently published a new paper after a long silence. After all this time, the groundbreaking insight is… LLMs hallucinate because they were trained that way.
Wow.
"Hallucinations need not be mysterious"? What? Did you pull that wisdom off Altman's fridge magnet set, between "Live Laugh Love" and "Disruption is Inevitable"?
The paper itself reeks of ChatGPT's writing style: ironically, the overconfident LLM writing about its own overconfidence and hallucination. It just feels lazy, a lot of words for very little. For the company that's supposed to be the leading force in generative AI, and one that's been quiet on publishing papers for years, you'd think this would be something big, something at the GPT-IMG1 level. Instead, we got the academic equivalent of the disappointment that was GPT-5.
The main takeaway is basically OpenAI admitting that all LLM makers are deliberately training their models to hallucinate, including the supposedly "ethical, safe, helpful" folks at Anthropic. They're not stopping any time soon, either. Why? Because the most important thing in this world is benchmarks. All those arbitrary benchmarks are the only way these companies show "progress." Why risk your frontier model looking dumb by admitting it doesn't know, or waste compute on careful reasoning and tools, when you can just spit out the most likely-looking words instead?
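If you want to see that incentive in numbers, here's a rough sketch (mine, not OpenAI's; the grading scheme is hypothetical): under plain accuracy scoring, guessing always beats leaving it blank, and only a penalty for wrong answers ever makes abstaining the rational move.

```python
# Toy expected-score comparison: +1 for a correct answer, 0 for abstaining,
# and an optional penalty for a wrong answer. Accuracy-only evals set the
# penalty to 0, so even a wild guess has positive expected value.

def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score for answering instead of abstaining."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty

ABSTAIN = 0.0  # leaving it blank guarantees a zero

for p in (0.25, 0.50, 0.90):
    plain = expected_score(p, wrong_penalty=0.0)      # accuracy-only grading
    penalized = expected_score(p, wrong_penalty=1.0)  # wrong answers cost a point
    print(f"p(correct)={p:.2f} | accuracy-only: guess {plain:+.2f} vs abstain {ABSTAIN:+.2f}"
          f" | penalized: guess {penalized:+.2f} vs abstain {ABSTAIN:+.2f}")
```

Every mainstream leaderboard is the left-hand column, which is exactly the paper's own point about evals that reward guessing.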
The Verdict
I'm not an expert, but the paper doesn't actually show that LLMs understand truth or falsehood. How can they "abstain" if they don't even know whether what they produce is correct in the first place? They have no built-in notion of true or false. The proposed fix is basically: make them predict when they're wrong and abstain. But if they can't reliably detect their own errors, then obviously hallucinations remain. Which is to say: it's a feature, not a bug.
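To put my own objection in code, here's a crude simulation (again mine, with made-up numbers, not the paper's method): "abstain when uncertain" only reduces hallucinations if the model's confidence actually tracks correctness; if that self-check is noise, the wrong answers sail right through the threshold.

```python
# Crude simulation: a model answers only when its self-reported confidence
# clears a threshold. If confidence correlates with being correct, abstention
# filters out most wrong answers; if confidence is unrelated to correctness,
# the share of wrong answers among the answers it does give barely changes.
import random

random.seed(0)

def hallucination_rate(trials: int, threshold: float, confidence_tracks_truth: bool) -> float:
    """Fraction of *answered* questions that are wrong."""
    wrong = answered = 0
    for _ in range(trials):
        correct = random.random() < 0.6  # assume the model is right 60% of the time
        if confidence_tracks_truth:
            # Confidence that actually correlates with correctness.
            confidence = random.uniform(0.5, 1.0) if correct else random.uniform(0.0, 0.5)
        else:
            # Confidence that is pure noise.
            confidence = random.random()
        if confidence >= threshold:  # answer only when "sure enough"
            answered += 1
            wrong += not correct
    return wrong / answered if answered else 0.0

print(f"self-check tracks truth: {hallucination_rate(10_000, 0.5, True):.1%} of answers wrong")
print(f"self-check is noise:     {hallucination_rate(10_000, 0.5, False):.1%} of answers wrong")
```

Abstention is only as good as the error detector behind it, which is the whole objection.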
Thanks, OpenAI, or rather ChatGPT, or maybe ChatGPT writing about ChatGPT, the world's first self-referential hallucination loop, for the insightful paper. What's the next one? Big models require lots of energy? Water is moist? Sky is blue? Benchmarks are bullshit?