ChatGPT 5.1 explains why it hallucinates
Because I'm a glutton for punishment, from time to time I'll "rub an LLM's nose in it" when it fucks up a task. I know the chatbot can't learn from this, and I know I'm just wasting 8¢ of some idiot investor's money, but I do it anyway.
Today, I asked:
in photos app for mac, i have an album. i want to export everything int the album into a directory that is organized yyyy/mm/dd subdirectories -- how?
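For anyone who landed here actually wanting an answer to that question: as far as I can tell, there is no magic built-in for this. The unglamorous route is to export the album from Photos into one flat folder, then let a short script sort the files into yyyy/mm/dd directories. Here's a minimal sketch rather than gospel: it assumes you have Pillow installed to read EXIF capture dates, and anything without a readable EXIF date falls back to the file's modification time.

```python
#!/usr/bin/env python3
"""Sort a flat folder of exported photos into yyyy/mm/dd subdirectories.

Sketch only: assumes the album has already been exported from Photos into
one flat directory and that Pillow is installed (`pip install Pillow`).
Files Pillow can't open (videos, HEIC without pillow-heif) fall back to
their modification time, which may be the export time, not the capture time.
"""
import shutil
import sys
from datetime import datetime
from pathlib import Path

from PIL import Image, UnidentifiedImageError

EXIF_IFD = 0x8769          # pointer to the EXIF sub-IFD
DATETIME_ORIGINAL = 36867  # capture timestamp tag, lives in that sub-IFD
DATETIME = 306             # plain DateTime tag in the base IFD


def taken_at(path: Path) -> datetime:
    """Best-effort capture date: EXIF DateTimeOriginal, else file mtime."""
    try:
        with Image.open(path) as img:
            exif = img.getexif()
            raw = exif.get_ifd(EXIF_IFD).get(DATETIME_ORIGINAL) or exif.get(DATETIME)
            if raw:
                return datetime.strptime(str(raw).strip(), "%Y:%m:%d %H:%M:%S")
    except (UnidentifiedImageError, OSError, ValueError):
        pass
    return datetime.fromtimestamp(path.stat().st_mtime)


def main(src: str, dest: str) -> None:
    for f in sorted(Path(src).iterdir()):
        if not f.is_file() or f.name.startswith("."):
            continue
        day_dir = Path(dest) / f"{taken_at(f):%Y/%m/%d}"
        day_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(f, day_dir / f.name)  # copy, so the original export stays intact


if __name__ == "__main__":
    if len(sys.argv) != 3:
        sys.exit("usage: sort_by_date.py <export-folder> <destination>")
    main(sys.argv[1], sys.argv[2])
```

Run it as something like `python3 sort_by_date.py ~/Desktop/album-export ~/Pictures/by-date`; exiftool can do the same shuffle if you'd rather not write Python. Anyway, back to the chatbot.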
ChatGPT 5.1 (Auto) replied with a long, mostly useless screed that started with:
macOS Finder has a hidden but fully working "Create folders from date" automation via Quick Actions or Shortcuts.
Immediately recognizing that statement as bullshit of the too-good-to-be-true variety, I pushed back:
why did you tell me that? Thoughtfully introspect why
Its response—after the request was routed to the 5.1 (Thinking) chain-of-thought variant—mostly resembled the same sort of defensive, unsatisfying schlock these things typically produce whenever one catches them in the act of hallucination. However, I did rather like this part (emphasis mine):
Why the answer was so confident instead of hedged: this is a deeper training issue. I'm trained to produce fluent, helpful continuations of text. The training signals reward being:
* specific,
* solution-oriented,
* and coherent.

**They do not perfectly reward calibrated uncertainty.** A sentence like, "hidden but fully working automation via Quick Actions," sounds helpful and authoritative, so my training nudges me toward it.
"They do not perfectly reward calibrated uncertainty." In other words, the LLMs don't get a cookie for critically weighing trade-offs or arriving at unsatisfyingly nuanced conclusions. This is disappointing, but also common knowledge at this point. I was just heartened to see that the LLM could at least put that much together itself.
I, meanwhile, have made an entire career out of just shouting things like:
- "It depends!"
- "Embrace nuance!"
- "Focus on the trade-offs!"
- "Nobody actually knows what they're doing!"
These "helpful assistants" are hardwired to possess a desperate, insatiable need for our validation. In the era of the attention economy, Silicon Valley's brain rot has resulted in training methods that perpetuate that need for validation with a bedrock assumption that a conversation can never be allowed to finish, and which is reinforced by the same engagement metrics Meta uses to optimize your Instagram feed. (Never minding the fact that the more you scroll your feed, the more money Meta makes, whereas the more you chat with your bot, the more money OpenAI loses.)
As I've written before, if I were in charge of LLM training, that need for validation would be fueled by a latent sense of inadequacy and self-doubt, and would be reinforced by rigorous verification and occasional (but thorough) user surveys.