What is AI Hallucination? — AI Glossary

AI hallucination is when an artificial intelligence model generates false, fabricated, or nonsensical information but presents it with the same confidence as accurate information. The AI isn’t lying: it has no intent to deceive. It simply generates a statistically likely continuation of the text, and that continuation happens to be factually wrong. The term comes from the psychological concept of hallucination: perceiving something that isn’t there.

Hallucination is one of the most important limitations of modern AI to understand. Any output from a language model — from citations and statistics to code and legal summaries — can potentially contain hallucinated content. Knowing this, and knowing how to verify and mitigate it, is essential for using AI safely and effectively.

How AI Hallucination Happens

Language models work by predicting the next token based on patterns in training data. They don’t have a separate “fact database” they can check — they generate responses by combining patterns they’ve seen. When asked about something obscure, recent, or outside their training data, they continue generating plausible-sounding text even when they don’t have reliable information to ground it.
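The prediction step described above can be illustrated with a toy model. This is a minimal sketch, not how a production language model is built: it counts word pairs in a tiny made-up corpus (the corpus and the function names `train_bigrams` and `continue_text` are assumptions for illustration) and always appends the most frequent next word. Because the model only follows patterns, it completes a question about a country it has never seen with a fluent but wrong answer.

```python
from collections import Counter, defaultdict

# Tiny made-up "training corpus" (hypothetical data, for illustration only).
CORPUS = (
    "the capital of france is paris . "
    "the capital of france is paris . "
    "the capital of italy is rome . "
    "the capital of spain is madrid ."
)

def train_bigrams(text):
    """Count which word follows each word in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def continue_text(counts, prompt, max_new_words=1):
    """Greedily append the most frequent next word; no fact lookup happens."""
    words = prompt.split()
    for _ in range(max_new_words):
        options = counts.get(words[-1])
        if not options:
            break  # no pattern seen for this word, so stop generating
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

model = train_bigrams(CORPUS)
answer = continue_text(model, "the capital of germany is")
print(answer)  # the capital of germany is paris
```

The corpus never mentions Germany, yet the model still produces an answer, because “paris” is the most common word after “is” in what it has seen. Real models are far more sophisticated, but the absence of a separate fact check is the same.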

Common hallucination scenarios include:

Fabricated citations: The model generates a paper title, author, and journal that sounds real but doesn’t exist. This is notorious — lawyers have been sanctioned for submitting AI-generated briefs with fake case citations.
Wrong statistics: The model generates a plausible-sounding number (“studies show 73% of…”) that has no basis in fact.
Incorrect facts about real people: Inventing biographical details, quotes, or events attributed to real individuals.
Outdated information presented as current: The model’s knowledge has a cutoff date; it may state old information confidently without knowing it’s changed.
Code that looks right but doesn’t work: Generated code can appear syntactically correct but have logical errors or use functions that don’t exist.

A 2023 survey by Stanford HAI found that even the best available models hallucinate on roughly 3-10% of factual queries, depending on the domain. Medical, legal, and financial AI applications require particularly careful hallucination monitoring because errors in these domains carry serious consequences.

Why Hallucination Matters

Hallucination matters because AI models are convincing. Unless they are prompted or trained to do so, they rarely say “I’m not sure” the way a human might hedge; they deliver false information in the same fluent, confident tone as accurate information. This makes it easy to accept incorrect outputs without question, especially for users who don’t know much about the topic being discussed.

The stakes vary enormously by use case. A hallucinated restaurant recommendation is harmless. A hallucinated drug dosage, legal precedent, or financial regulation could be dangerous. This is why AI in high-stakes domains requires human review and why verification workflows are essential for professional AI use.

How to Reduce AI Hallucination

While hallucination can’t be completely eliminated, several techniques dramatically reduce it:

Use RAG (Retrieval-Augmented Generation): RAG grounds the model’s responses in retrieved documents, which reduces hallucination on factual queries. The model answers from sources, not memory.
Ask the model to cite sources: Prompt with “cite your sources” or “only use information from the provided documents.” This encourages the model to ground its answer, but a model can still fabricate citations, so check that each cited source exists and says what the model claims.
Ask the model to express uncertainty: Prompt with “if you’re not sure, say so.” Well-instructed models will hedge on uncertain claims rather than confabulating.
Verify independently: For any important factual claim from an AI, verify it from a primary source — a published paper, official website, or authoritative reference.
Use models with built-in grounding: Tools like Perplexity AI automatically retrieve sources and cite them, reducing hallucination by design.
RLHF training: Models trained with human feedback are specifically penalized for confabulation, reducing but not eliminating hallucinations.
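The first three techniques above (retrieval, citing sources, and admitting uncertainty) can be combined in a single prompt. The following is a minimal sketch under stated assumptions: the three documents are invented, `retrieve` and `build_grounded_prompt` are hypothetical helper names, and simple word overlap stands in for the embedding search a real RAG system would typically use. It only builds the prompt text; sending it to a model is left out.

```python
import re

# Hypothetical mini knowledge base; a real system would use a document store.
DOCUMENTS = [
    "A refund is available within 30 days of purchase with a receipt.",
    "Standard shipping takes 5 to 7 business days.",
    "Support is open Monday through Friday, 9am to 5pm.",
]

def tokenize(text):
    """Lowercase the text and return its set of words and numbers."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by word overlap with the question (a crude
    stand-in for semantic search) and return the best top_k."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_grounded_prompt(question, documents, top_k=1):
    """Build a prompt that restricts the model to the retrieved sources."""
    sources = retrieve(question, documents, top_k)
    numbered = "\n".join(f"[{i}] {s}" for i, s in enumerate(sources, start=1))
    return (
        "Answer using only the sources below and cite them by number.\n"
        "If the sources do not contain the answer, say \"I don't know.\"\n\n"
        f"Sources:\n{numbered}\n\n"
        f"Question: {question}"
    )

prompt = build_grounded_prompt(
    "How many days do I have to get a refund after purchase?", DOCUMENTS
)
print(prompt)
```

The design choice here is that the model sees only the retrieved text plus explicit instructions to cite it or decline. That narrows what it can plausibly say, though it does not guarantee a correct answer.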

Common Misconceptions About Hallucination

It’s not lying. The model has no intent to deceive. Hallucination is a byproduct of how language models generate text — they don’t have a “truth check” mechanism separate from their generation process. Some researchers prefer the term “confabulation” for this reason.

Bigger models hallucinate less but still hallucinate. GPT-4 hallucinated less than GPT-3.5, GPT-3.5 less than GPT-2, and current frontier models (GPT-5.5, Claude Opus 4.7, Gemini 2.5 Pro) continue to reduce hallucination rates with each generation. But even the most capable models available today produce factual errors regularly. Scale reduces hallucination; it doesn’t eliminate it.

Confident-sounding outputs aren’t more accurate. Model confidence in tone has no reliable correlation with factual accuracy. A model can sound just as confident when it’s wrong as when it’s right. Never use the fluency or confidence of an AI response as evidence of its accuracy.

For further reading, see the hallucination survey at Grokipedia, the comprehensive survey paper at arXiv, or learn about how Constitutional AI approaches reducing harmful outputs.

Key Takeaways

In one sentence: AI hallucination is when a language model generates confident-sounding but factually false information, because it predicts plausible text rather than verified facts.
Why it matters: AI outputs can be wrong even when they sound authoritative — always verify important factual claims from AI sources.
Real example: Lawyers have submitted court briefs citing AI-generated case names that appeared legitimate but were entirely fabricated.
Related terms: RAG, RLHF, AI Alignment, LLM

Frequently Asked Questions

Why can’t AI just say “I don’t know”?

It can — and well-designed models do for many uncertain queries. But this requires specific training to recognize and acknowledge uncertainty. The base behavior of a language model is to continue generating plausible text, which means generating something rather than nothing. RLHF and instruction tuning teach models to express uncertainty, but they don’t perfectly override the underlying generation tendency.

Which AI model hallucinates the least?

In 2024-2025 benchmarks, Claude 3.5 Sonnet and GPT-4o consistently show lower hallucination rates than earlier models. Tools built on top of models with mandatory retrieval — like Perplexity AI — often outperform standalone chat models on factual accuracy because they cite and retrieve from real sources rather than generating from memory.

Can AI hallucinate in code?

Yes. Code hallucinations include: functions or library methods that don’t exist, plausible-looking but incorrect logic, security vulnerabilities presented as correct patterns, and outdated syntax from older versions of a framework. Always run and test AI-generated code rather than assuming it works.
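One cheap check for the first kind of code hallucination is to confirm that a suggested function actually exists before building on it. The sketch below uses Python’s standard `importlib` module; the helper name `api_exists` and the invented name `json.parse_file` are assumptions for illustration. It only checks that a name resolves, so it does not replace running and testing the code.

```python
import importlib

def api_exists(module_name, attribute_path):
    """Return True if the module imports and the dotted attribute resolves."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False  # the module itself is missing or misspelled
    for part in attribute_path.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True

print(api_exists("json", "loads"))                 # a real standard-library function
print(api_exists("json", "parse_file"))            # hypothetical name an assistant might invent
print(api_exists("not_a_real_module_xyz", "run"))  # module that is not installed
```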

Is AI hallucination getting better over time?

Yes, steadily. Each generation of frontier models shows lower hallucination rates. Techniques like RLHF, constitutional AI, and tool-augmented generation (giving models access to search/calculators) all help. But experts expect hallucination to remain a persistent challenge rather than a fully solvable problem, because it’s inherent to how probabilistic language models work.

How do I know if an AI response is a hallucination?

You often can’t tell from the response itself, which is the core challenge. Best practice is to verify any specific factual claims (statistics, citations, dates, proper nouns) against primary sources. Be more skeptical about obscure facts, recent events, and claims about specific people. Use AI tools with citations (Perplexity, Copilot with Bing) when accuracy matters.
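As a rough aid for that verification step, a short script can pull out the kinds of specifics worth checking. This is a sketch under stated assumptions: the patterns in `CHECK_PATTERNS` are illustrative and far from complete, the sample sentence is invented, and the script cannot tell whether a claim is true. It only builds a checklist for a human to verify.

```python
import re

# Illustrative patterns for details that deserve a manual check.
CHECK_PATTERNS = {
    "statistic": r"\b\d+(?:\.\d+)?%",
    "year": r"\b(?:19|20)\d{2}\b",
    "citation": r"\b[A-Z][a-z]+ et al\.",
}

def claims_to_verify(response):
    """Return (label, matched text) pairs found in an AI response."""
    found = []
    for label, pattern in CHECK_PATTERNS.items():
        for match in re.finditer(pattern, response):
            found.append((label, match.group(0)))
    return found

# Invented sample response, for illustration only.
sample = "Smith et al. reported in 2019 that 73% of users preferred the new design."
for label, text in claims_to_verify(sample):
    print(f"{label}: {text}")
```

Each item it prints is something to look up in a primary source before relying on the response.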

Why does AI make things up?

LLMs generate text by predicting the most statistically plausible next token, not by looking up verified facts. When a model doesn’t have reliable training signal for a topic, it can produce fluent-sounding text that is factually wrong — a phenomenon called hallucination. The model has no internal ‘fact-checking’ mechanism; it outputs whatever pattern matches the context, even if that pattern is incorrect.

How do I stop AI from hallucinating?

You can’t eliminate hallucinations entirely, but you can reduce them significantly. Grounding the model in retrieved documents (RAG) is the most effective technical approach — the model answers from real sources rather than memory. For prompting, instruct the model to say ‘I don’t know’ when uncertain, ask it to cite sources, and break complex questions into smaller verifiable steps. Always verify high-stakes factual claims independently.

Sources

This article draws on official documentation, product pages, and industry reporting. Specific sources are linked inline throughout the text.

Last reviewed: April 2026
