Claude Memory and Context: How Claude Remembers (and Forgets)

What it is: The 2026 guide to Claude’s memory and context window — what 1M tokens actually unlocks, the context-rot tradeoffs you need to know, how CLAUDE.md memory files work in Claude Code, how Claude’s context compares to ChatGPT and Gemini, and 10 long-context plays most users have not tried.
Who it is for: Anyone working with long documents, large codebases, or research-heavy workflows in Claude.
Best if: You want a complete, current view of what to do with long context — and where it breaks.
Skip if: You only use Claude for short chat. This guide is for long-context power users.

On March 13, 2026, Anthropic made Claude’s 1 million token context window generally available for Claude Opus 4.6 and Claude Sonnet 4.6, and removed the long-context surcharge for most users. That’s roughly 750,000 words per request (about ten typical 75,000-word non-fiction books, or more than the full English text of War and Peace). For anyone who works with long documents, research papers, contracts, or codebases, this is one of the biggest capability unlocks of 2026.

This guide covers what Claude’s context window actually is, how to use it well, when it matters, and the quieter “memory” features Claude has added alongside it.

What does context window mean in Claude?

The context window is the maximum amount of text Claude can consider in a single interaction — your prompt, any documents you paste, plus Claude’s response. Bigger context means Claude can reason across more material at once without losing track.

Think of it like short-term memory. A human reading a 100-page document holds some things in mind while they read. An AI with a 1M context window holds roughly 2,500 pages’ worth at once, assuming about 300 words per page.
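To make the sizing concrete, here is a minimal Python sketch that estimates whether a document fits in a given window. It assumes the rough ratio this guide uses (1M tokens is about 750,000 words, so 0.75 words per token); real tokenizers vary by language and content, and the function names and the reply reserve are illustrative, not part of any Anthropic tool.

```python
# Rule-of-thumb sizing only; an actual tokenizer will give different counts.
WORDS_PER_TOKEN = 0.75  # assumption taken from "1M tokens ~ 750,000 words"


def estimate_tokens(word_count: int) -> int:
    """Estimate the token count of English text from its word count."""
    return round(word_count / WORDS_PER_TOKEN)


def fits_in_window(
    word_count: int,
    window_tokens: int = 1_000_000,
    reserve_for_reply: int = 8_000,  # hypothetical headroom for the response
) -> bool:
    """Return True if the text plus a reply reserve fits in the window."""
    return estimate_tokens(word_count) + reserve_for_reply <= window_tokens
```

Because the window also has to hold Claude’s response, a document that estimates to exactly the window size does not fit once the reserve is counted.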

What changed in Claude memory and context in March 2026?

Before: The 1M context was available but carried a surcharge. Most users had effective access to 200K tokens (still huge — about 150,000 words).

After: The long-context surcharge is gone for everyone with 1M access. A 900K-token request is now billed at the same per-token rate as a 9K one, so total cost scales linearly with the tokens you send. Per Anthropic’s context window docs, this is available on Claude Opus 4.6 and Claude Sonnet 4.6 via the API.
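The pricing change is easy to see as arithmetic. The sketch below compares flat per-token billing with a surcharge model; the price, threshold, and multiplier are hypothetical placeholders chosen for illustration, not Anthropic’s actual rates.

```python
from typing import Optional


def input_cost(
    tokens: int,
    price_per_million: float,
    surcharge_threshold: Optional[int] = None,
    surcharge_multiplier: float = 1.0,
) -> float:
    """Cost of the input tokens for one request.

    With no threshold, billing is flat: every token costs the same.
    With a threshold, requests larger than it pay a multiplied rate.
    """
    rate = price_per_million
    if surcharge_threshold is not None and tokens > surcharge_threshold:
        rate *= surcharge_multiplier
    return tokens / 1_000_000 * rate


PRICE = 5.00  # hypothetical dollars per million input tokens

flat_small = input_cost(9_000, PRICE)
flat_large = input_cost(900_000, PRICE)
# Hypothetical "before" model: double rate above 200K tokens.
surcharged_large = input_cost(900_000, PRICE, 200_000, 2.0)
```

Under flat billing the 900K request costs 100 times the 9K request because it sends 100 times the tokens, and nothing more. Under the assumed surcharge model the same request would cost twice as much.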

Max, Team, and Enterprise users get automatic 1M window access with Opus 4.6. Pro users need to opt in by typing /extra-usage in Claude Code.

What can you actually do with 1M tokens in Claude?

Analyze entire books

Upload a 500-page PDF and ask for a chapter-by-chapter summary. Or find contradictions between chapter 3 and chapter 17. Or extract every quote from a character. All in one request.

Full-codebase reasoning

For developers, Claude can now hold a medium-sized codebase entirely in context and reason about cross-file effects. “How would changing this function break the rest of the codebase?” becomes answerable with real accuracy. Claude Code makes this even easier.

Long research workflows

Paste 20 research papers and synthesize findings. Drop in a year of earnings transcripts and find patterns. Feed Claude a full congressional hearing transcript and get questions-and-answers extracted with speaker attribution.

Long agent tasks

For anyone building AI agents (via the Claude Agent SDK), the bigger context means agents can accumulate more context as they work through multi-step tasks without losing track.

What are the limits of long context (context rot)?

Bigger context isn’t free accuracy. As you fill more of the window, performance on specific recall tasks degrades — a phenomenon called “context rot.” Opus 4.6 scores 78.3% on MRCR v2 at 1M tokens, which is the highest among frontier models at that length but still shows the degradation pattern.

Practical rules:

  • For shorter, specific questions, keep the context tight. Don’t dump everything in “just in case.”
  • For broad synthesis, fill the context — Claude’s nuance stays strong even at length.
  • For retrieval (“find this specific fact in this huge document”), consider splitting into smaller chunks for better accuracy.
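As a rough illustration of the retrieval rule above, this Python sketch splits a long text into overlapping word-based chunks and ranks them by how often they mention the query terms, so that only the most relevant chunks go into the prompt. The function names, chunk sizes, and the simple term-count scoring are all assumptions for the example; a production setup would more likely use embeddings or a search index.

```python
from typing import List


def split_into_chunks(
    text: str, chunk_words: int = 2000, overlap_words: int = 200
) -> List[str]:
    """Split text into word-based chunks that overlap slightly.

    The overlap keeps a fact that straddles a boundary intact in
    at least one chunk.
    """
    if overlap_words >= chunk_words:
        raise ValueError("overlap_words must be smaller than chunk_words")
    words = text.split()
    step = chunk_words - overlap_words
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def top_chunks(chunks: List[str], query: str, k: int = 3) -> List[str]:
    """Return the k chunks with the most occurrences of the query terms."""
    terms = {term.lower() for term in query.split()}

    def score(chunk: str) -> int:
        return sum(1 for word in chunk.lower().split() if word in terms)

    return sorted(chunks, key=score, reverse=True)[:k]
```

A typical use would be to call split_into_chunks on the full document, pass the result to top_chunks with the user’s question, and send only those chunks to Claude.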

How does Claude Code memory work (CLAUDE.md files)?

Separate from the context window, Claude Code has a memory feature via CLAUDE.md files. You create a CLAUDE.md file in your project root with persistent instructions: coding standards, architecture decisions, preferred libraries, review checklists. Claude Code reads it at the start of every session. According to Anthropic’s memory documentation, Claude also builds “auto memory” as it works, saving useful learnings (build commands, debugging insights) across sessions without you writing anything.

This is different from the context window. CLAUDE.md is for stable long-term instructions; the context window is for the stuff you’re working on right now. Both are useful.
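For a sense of what such a file might contain, here is a small Python sketch that renders a starter CLAUDE.md from a dictionary of sections. The section names and rules are hypothetical examples, and the layout is only one plausible way to organize the file; Anthropic’s memory documentation remains the reference for what Claude Code actually reads.

```python
from pathlib import Path
from typing import Dict, List


def render_claude_md(sections: Dict[str, List[str]]) -> str:
    """Render section headings and bullet items as Markdown text."""
    lines = ["# Project memory", ""]
    for heading, items in sections.items():
        lines.append(f"## {heading}")
        lines.extend(f"- {item}" for item in items)
        lines.append("")
    return "\n".join(lines)


def write_claude_md(project_root: Path, sections: Dict[str, List[str]]) -> Path:
    """Write CLAUDE.md into the project root, refusing to overwrite one."""
    target = project_root / "CLAUDE.md"
    if target.exists():
        raise FileExistsError(f"{target} already exists; edit it by hand")
    target.write_text(render_claude_md(sections), encoding="utf-8")
    return target


# Hypothetical content: replace with your own project's conventions.
EXAMPLE_SECTIONS = {
    "Coding standards": ["Use type hints on public functions"],
    "Preferred libraries": ["pytest for tests"],
    "Review checklist": ["Run the test suite before proposing a commit"],
}
```

Calling write_claude_md(Path("."), EXAMPLE_SECTIONS) from a project root would create the file once, after which it is meant to be edited by hand as the project evolves.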

How does Claude’s context window compare to ChatGPT and Gemini?

  • Claude Opus 4.6 / Sonnet 4.6: 1M tokens, now free
  • Google Gemini 3 Pro: 1M+ tokens
  • xAI Grok 4 Fast: 2M tokens
  • GPT-5.4 and variants: Roughly 400K tokens in practice
  • Meta Llama 4 Scout: 10M tokens (industry-leading, open-source)

Raw window size isn’t the whole story. Accuracy at length varies significantly. Claude’s nuance-preservation at long context is its specific advantage — other models may handle size but lose quality faster.

What are 10 long-context plays most users have not tried?

  • Drop in a whole codebase for refactoring analysis. 1M token context fits most mid-size codebases. Refactoring suggestions get materially better with full context.
  • Year of meeting notes for retrospective analysis. Paste 12 months of meeting notes; ask for patterns. Strategic insights surface that month-by-month review misses.
  • Legal document corpus for clause comparison. Multiple contracts side-by-side; AI flags inconsistencies, missing clauses, anomalies.
  • Research paper plus all citations in one window. Read a paper with its references all in context; comprehension improves.
  • Full transcript analysis of long interviews. 5-hour interview transcripts; surface patterns, insights, follow-up questions.
  • Codebase plus issue tracker for bug investigation. Bugs hide in interactions between code and historical issues. Having both in context surfaces those interactions.
  • Multiple support tickets for trend analysis. Pattern recognition across hundreds of tickets becomes tractable.
  • The CLAUDE.md project memory pattern. Project context file plus 1M token window equals Claude that knows your project specifics.
  • When NOT to use 1M context. For simple queries, smaller context is faster and cheaper. Match context window to task size.
  • Context rot is real even with 1M tokens. Lost-in-the-middle effect persists. Structure your input; do not just dump and hope.
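One way to “structure your input” rather than dump it is to label each document and put the question after the material. The Python sketch below wraps documents in XML-style tags with an index and a source name; the tag names and the function are assumptions for illustration, and any clear, consistent labeling scheme serves the same purpose.

```python
from typing import List, Tuple


def build_long_context_prompt(
    documents: List[Tuple[str, str]], question: str
) -> str:
    """Build a prompt with labeled documents first and the question last.

    Each document is a (source_name, content) pair.
    """
    parts = ["<documents>"]
    for index, (source, content) in enumerate(documents, start=1):
        parts.append(f'<document index="{index}">')
        parts.append(f"<source>{source}</source>")
        parts.append("<document_content>")
        parts.append(content.strip())
        parts.append("</document_content>")
        parts.append("</document>")
    parts.append("</documents>")
    parts.append("")  # blank line between the material and the question
    parts.append(question.strip())
    return "\n".join(parts)
```

Labeled sources also make it easier to ask Claude to cite which document a claim came from, which helps when verifying critical facts.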

What are common mistakes with long Claude context?

  • Dumping everything “just in case.” Bigger context isn’t free — it slows responses and can degrade accuracy. Include only relevant documents.
  • Expecting perfect recall. Context rot is real. Claude at 1M tokens is still excellent (78.3% on MRCR v2) but not perfect. Verify critical facts.
  • Ignoring CLAUDE.md. For recurring work in Claude Code, persistent instructions in a memory file beat re-pasting context every session.
  • Not splitting huge documents. For documents over 1M tokens, summarize chunks first, then synthesize. Better than truncating.
  • Forgetting the competition. Llama 4 Scout has 10M context open-source. For certain workflows, that’s the right tool — not Claude.

When should you use 1M context vs. a smaller context window?

  • Use 1M context for: Whole-book analysis, full codebase reasoning, multi-document synthesis, long agent tasks.
  • Use smaller context for: Quick answers, chat, casual tasks where speed matters.
  • Use CLAUDE.md for: Persistent instructions, coding standards, style guides, preferences that apply to every session.

Frequently Asked Questions

How do I use the 1M context?

Via claude.ai or the API. Just paste your long document. On claude.ai you may need to use a file upload if the paste exceeds the text-input limit. For API usage, see Anthropic’s context docs.

Do I pay extra for long contexts?

No, the long-context surcharge was removed in March 2026. You pay the same per-token rate whether your context is 1K or 1M.

Which plan do I need?

Max, Team, and Enterprise get automatic 1M access. Pro users can opt in. Free tier has more limited context (around 200K). See our complete Claude AI guide for plan details.

How do I handle documents larger than 1M?

Split them. Summarize each chunk first, then feed the summaries back together for final synthesis. For code, Llama 4 Scout’s 10M window handles larger single-request workflows.
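The split-then-synthesize approach can be sketched in Python as a two-pass summary. The summarize argument is a placeholder for whatever function you use to send text to Claude and return its reply; it is passed in rather than hard-coded so the sketch makes no assumptions about a particular SDK call. The chunk size and the part labels are likewise illustrative.

```python
from typing import Callable, List


def split_words(text: str, chunk_words: int) -> List[str]:
    """Split text into consecutive chunks of at most chunk_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_words])
        for i in range(0, len(words), chunk_words)
    ]


def summarize_large_document(
    text: str,
    summarize: Callable[[str], str],
    chunk_words: int = 500_000,  # assumed to sit well inside a 1M-token window
) -> str:
    """Summarize each chunk, then summarize the combined chunk summaries."""
    chunks = split_words(text, chunk_words)
    if len(chunks) <= 1:
        return summarize(text)  # small enough for a single request
    partials = [summarize(chunk) for chunk in chunks]
    combined = "\n\n".join(
        f"[Part {number}] {summary}"
        for number, summary in enumerate(partials, start=1)
    )
    return summarize(combined)
```

Numbering the partial summaries preserves document order in the final pass, which matters when the synthesis needs to reason about sequence, such as chapters or dated records.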

What’s the best use case for a huge context?

Legal contracts (analyze entire deals in one request), research (synthesize across many papers), compliance work (review entire policies), and long-running AI agents. For everyday chat, you rarely need more than 10K tokens.

What’s the bottom line on Claude’s memory and context?

Claude’s 1M context window — now free and generally available — is one of 2026’s most practically useful AI upgrades. If your work involves long documents, it’s worth trying Claude just for this capability. Combined with the CLAUDE.md memory system in Claude Code, you get both long-term persistent instructions and massive short-term working memory.
