Claude Extended Thinking: How to Use It & When It Matters

What it is: Claude Extended Thinking — everything you need to know

Who it’s for: Beginners and professionals looking for practical guidance

Best if: You want actionable steps you can use today

Skip if: You’re already an expert on this specific topic

AI Assistant Summary: Claude’s extended thinking mode lets the model reason step-by-step before answering, dramatically improving accuracy on complex tasks like multi-step math, code architecture, and nuanced analysis. This guide covers what extended thinking is, how to enable it via the API and Claude.ai, when it makes a measurable difference, and when standard mode is sufficient. You will learn the technical mechanics behind the feature, see real before-and-after examples, and understand the cost and latency trade-offs so you can decide exactly when to turn it on.

Bottom Line Up Front (BLUF)

Extended thinking is Claude’s “show your work” mode. When activated, Claude generates an internal chain-of-thought reasoning trace before producing its final answer. This process uses additional tokens and time but yields substantially better results on tasks that require multi-step logic, careful planning, or synthesis of complex information. According to Anthropic’s internal benchmarks, extended thinking improved accuracy on graduate-level reasoning tasks (GPQA Diamond) from 65% to 78%, and on competition-level math problems (AIME 2025) from 16% to over 60%. If your task involves straightforward writing, simple Q&A, or quick factual lookups, standard mode remains faster and cheaper. But for anything requiring genuine reasoning depth, extended thinking is the single most impactful feature you can enable.

Key Takeaways

Extended thinking adds an internal reasoning step before Claude’s visible response, using a chain-of-thought approach that mirrors how humans solve hard problems
It is available via the API on Claude 3.7 Sonnet and the Claude 4 models, and as a toggle on Claude.ai Pro and Max plans
Best suited for: complex math, code architecture, legal analysis, scientific reasoning, and multi-constraint optimization problems
Not worth enabling for: casual conversation, simple writing tasks, quick factual questions, or content generation
Cost trade-off: extended thinking uses 2x to 10x more tokens per request, so use it strategically on tasks where accuracy matters most

What Is Claude Extended Thinking?

Extended thinking is a feature introduced by Anthropic that allows Claude to engage in explicit, step-by-step reasoning before producing a final response. Think of it as the difference between asking someone to immediately blurt out an answer versus giving them a notepad and five minutes to work through the problem methodically.

When you enable extended thinking, Claude generates what Anthropic calls a “thinking block” — an internal monologue where the model breaks down the problem, considers different approaches, checks its own logic, and builds toward a solution. This thinking block is visible to you (or your application) as a separate part of the response, distinct from the final answer. You can inspect exactly how Claude reasoned its way to the conclusion, which is enormously valuable for debugging, auditing, and building trust in AI-generated outputs.

The feature draws on research into chain-of-thought prompting, which has been studied extensively since Wei et al.’s 2022 paper (arXiv:2201.11903). The core finding from that research, that language models perform markedly better on reasoning tasks when they generate intermediate steps, is now exposed as a built-in capability of the model rather than something users must coax out with clever prompt engineering.

How Extended Thinking Works Under the Hood

To understand why extended thinking produces better results, you need to understand a fundamental limitation of standard language model responses. In normal mode, Claude generates tokens sequentially — each token is predicted based on all previous tokens. This works brilliantly for fluent text generation but creates a bottleneck: the model must “decide” on early parts of its answer before it has fully reasoned through the problem. For a simple question like “What is the capital of France?”, this is fine. For a question like “Design a microservices architecture that handles 10 million daily active users with eventual consistency guarantees,” premature commitment to an approach in the first few tokens can cascade into a suboptimal answer.

Extended thinking solves this by giving Claude a dedicated reasoning phase. During this phase, the model generates tokens that are explicitly marked as “thinking” rather than “answering.” This distinction matters because it allows Claude to explore dead ends, backtrack, reconsider assumptions, and refine its approach — all before committing to a final response. The thinking tokens are generated using the same underlying model, but the framing tells Claude that these tokens are for working through the problem, not for presenting a polished answer.

Anthropic has implemented a budget system for extended thinking. You can set a maximum number of thinking tokens (called the “budget_tokens” parameter in the API), which controls how much reasoning Claude can perform. A budget of 5,000 tokens might be sufficient for a moderately complex coding question, while a budget of 50,000 tokens could be appropriate for a PhD-level physics derivation. The model will use as many thinking tokens as it needs, up to your specified budget, and you are charged for both thinking tokens and response tokens.

One important technical detail: the thinking block uses a special token format that separates it from the final response. In the API, the response comes back as an array of content blocks — first a “thinking” block containing the reasoning trace, then a “text” block containing the final answer. This clean separation means your application can choose whether to display the thinking process to end users or keep it hidden, using the reasoning as an internal quality assurance mechanism.
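As a rough illustration of that separation, the sketch below walks a response's content array and splits it into reasoning and answer. The `sample_response` dict is hand-written data shaped like the description above, and `split_blocks` is a hypothetical helper name, not part of any SDK.

```python
def split_blocks(response: dict) -> tuple[str, str]:
    """Return (thinking_text, answer_text) from a response's content blocks."""
    thinking_parts = []
    answer_parts = []
    for block in response.get("content", []):
        if block.get("type") == "thinking":
            thinking_parts.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            answer_parts.append(block.get("text", ""))
        # Other block types (for example tool calls) are ignored in this sketch.
    return "\n".join(thinking_parts), "\n".join(answer_parts)


# Hand-written sample, not real model output.
sample_response = {
    "content": [
        {"type": "thinking", "thinking": "Compare the O(n) and O(n log n) options..."},
        {"type": "text", "text": "Use the O(n log n) approach."},
    ]
}

reasoning, answer = split_blocks(sample_response)
print(answer)
```

An application that hides the reasoning from end users would show only `answer` and send `reasoning` to its logs.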

How to Enable Extended Thinking

On Claude.ai (Web and Desktop App)

If you are using Claude through the web interface at claude.ai or the Claude desktop app, enabling extended thinking is straightforward. Look for the “Extended Thinking” toggle in the model selector area. On Pro plans ($20/month) and Max plans ($100/month or $200/month), this toggle is available when you select a model that supports it, such as Claude 3.7 Sonnet or a Claude 4 model. When enabled, you will see a collapsible “Thinking” section above Claude’s response that shows the reasoning process in real time. You can expand it to follow along or leave it collapsed if you just want the final answer.

The Claude.ai interface automatically manages the thinking budget for you. For most queries, it allocates a reasonable amount of thinking time based on the perceived complexity of your question. If you want more control over the budget, you will need to use the API.

Via the Anthropic API

For developers building applications with Claude, enabling extended thinking requires adding a “thinking” parameter to your API request. Here is the structure of a request with extended thinking enabled:

{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 16000,
  "thinking": {
    "type": "enabled",
    "budget_tokens": 10000
  },
  "messages": [
    {
      "role": "user",
      "content": "Analyze the time complexity of this recursive algorithm..."
    }
  ]
}

The budget_tokens parameter sets the ceiling for how many tokens Claude can use for its internal reasoning. A few guidelines for setting this value: for moderately complex tasks (debugging a function, explaining a concept in depth), 5,000 to 10,000 tokens is usually sufficient. For highly complex tasks (designing a system architecture, solving competition math, comprehensive code reviews), 20,000 to 50,000 tokens yields better results. For research-grade problems (novel mathematical proofs, complex scientific analysis), you might set the budget to 100,000 tokens or higher. Keep in mind that you pay for thinking tokens at the same rate as output tokens, so there is a direct cost-quality tradeoff.
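One way to turn those guidelines into code is a small request builder like the sketch below. The tier names, the `build_request` helper, and the default answer allowance are illustrative choices. The sketch also assumes two API rules worth confirming in the official documentation: that `budget_tokens` has a minimum of 1,024 and that it must be smaller than `max_tokens`. The research-grade tier is left out because whether a budget that large fits depends on the output limit of the model you pick.

```python
# Starting points taken from the guidelines above; tune them for your own tasks.
BUDGET_BY_TIER = {
    "moderate": 10_000,  # debugging a function, in-depth explanations
    "complex": 32_000,   # system design, competition math, large code reviews
}

MIN_BUDGET = 1_024  # assumed API minimum for budget_tokens


def build_request(prompt: str, tier: str, answer_tokens: int = 4_000) -> dict:
    """Build a request body with a thinking budget chosen by task tier."""
    budget = max(BUDGET_BY_TIER[tier], MIN_BUDGET)
    return {
        "model": "claude-sonnet-4-20250514",
        # max_tokens covers thinking plus the visible answer, so it exceeds the budget.
        "max_tokens": budget + answer_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget},
        "messages": [{"role": "user", "content": prompt}],
    }


request = build_request("Analyze the time complexity of this recursive algorithm...", "moderate")
print(request["thinking"]["budget_tokens"], request["max_tokens"])
```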

The API response will contain both the thinking and text blocks. If you are using the Claude API with Python, you can access them separately to display the reasoning to users or log it for quality assurance purposes.

In Claude Code (CLI)

If you use Claude Code — Anthropic’s command-line tool for developers — extended thinking is enabled by default for complex tasks. Claude Code automatically detects when a task would benefit from extended reasoning (like planning a multi-file refactor or debugging a complex issue) and allocates thinking budget accordingly. You can see the thinking process in the terminal output, marked with a distinct visual indicator. For developers who spend most of their time in the terminal, this seamless integration means you get the benefits of extended thinking without any configuration.

When to Use Extended Thinking (With Real Examples)

Extended thinking is not a “set it and forget it” feature. It adds latency (typically 5 to 45 seconds of additional processing time, depending on the thinking budget) and cost (2x to 10x more tokens). The key is knowing when that investment pays off. Here are the categories where extended thinking consistently delivers measurably better results, based on Anthropic’s published benchmarks and real-world testing.

Complex Mathematics and Quantitative Reasoning

This is where extended thinking shows its most dramatic improvements. On the AIME 2025 benchmark (American Invitational Mathematics Examination), Claude with extended thinking scored over 60% compared to 16% without it — nearly a 4x improvement. The reason is clear: competition math problems require multiple steps of algebraic manipulation, and a single error in an early step cascades into a wrong answer. Extended thinking lets Claude check each step before proceeding.

Real example: Ask Claude to “Find all real solutions to the system of equations: x^2 + y^2 = 25, xy = 12, and x + y + z = 10 where z > 0.” Without extended thinking, Claude frequently makes sign errors or misses solution branches. With extended thinking, it systematically works through the substitution, identifies all four (x, y) solutions across both sign branches, and correctly determines the z value for each case.
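For reference, the algebra behind this example is short. The derivation below is our own working rather than model output. It shows the two sign branches, the four ordered (x, y) pairs they produce, and the z value each branch implies.

```latex
\begin{align*}
(x+y)^2 &= x^2 + y^2 + 2xy = 25 + 24 = 49 &&\Rightarrow\; x + y = \pm 7,\\
(x-y)^2 &= x^2 + y^2 - 2xy = 25 - 24 = 1  &&\Rightarrow\; x - y = \pm 1,\\
(x, y)  &\in \{(3,4),\ (4,3),\ (-3,-4),\ (-4,-3)\},\\
z       &= 10 - (x + y) \in \{3,\ 17\}.
\end{align*}
```

Both z values are positive, so neither branch is ruled out by z > 0: z = 3 when x + y = 7, and z = 17 when x + y = -7. Dropping the negative branch is exactly the kind of slip the paragraph above describes.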

Code Architecture and System Design

When you ask Claude to design a system architecture or plan a complex refactoring, extended thinking allows it to consider trade-offs that standard mode glosses over. Instead of jumping to the first reasonable architecture, Claude will evaluate multiple approaches, consider edge cases, and think through failure modes before recommending a solution.

Real example: “Design a real-time notification system for a social media platform that handles 50 million users, supports push notifications, email, and in-app alerts, with guaranteed delivery and user preference management.” With extended thinking, Claude’s response typically includes a thorough comparison of message queue options (Kafka vs RabbitMQ vs SQS), a clear rationale for the chosen approach, explicit handling of failure scenarios, and a migration strategy — none of which reliably appear in standard mode responses.

Legal and Regulatory Analysis

Legal questions often involve balancing multiple competing rules, exceptions, and jurisdictional variations. Extended thinking excels here because it can methodically work through each applicable rule before reaching a conclusion, rather than pattern-matching to the most common answer.

Real example: “A Delaware C-corp with employees in California, Texas, and remote workers in the EU wants to implement an AI-powered employee monitoring system. What are the key legal considerations?” Extended thinking produces a structured analysis covering CCPA/CPRA, Texas privacy law, GDPR Article 22 (automated decision-making), Delaware corporate governance obligations, and EEOC implications — with proper identification of where these frameworks conflict rather than a generic overview.

Scientific Reasoning and Research

For research tasks that require synthesizing information from multiple domains or evaluating the methodology of a study, extended thinking provides the reasoning depth needed to produce genuinely useful analysis rather than surface-level summaries.

On the GPQA Diamond benchmark — a set of graduate-level science questions designed to be challenging even for domain experts — Claude with extended thinking achieved 78% accuracy compared to 65% without it. This 13-percentage-point improvement reflects the model’s ability to catch subtle errors in reasoning when given space to think.

Multi-Constraint Optimization

Any task where you need to satisfy multiple competing requirements simultaneously benefits from extended thinking. This includes project planning with resource constraints, menu or schedule optimization, financial portfolio construction, and product feature prioritization.

Real example: “Plan a 5-day conference schedule with 40 speakers, 4 parallel tracks, where no speaker conflicts with their co-authors, each track has a coherent theme, and keynotes are spread across all 5 days.” Standard Claude produces a plausible-looking schedule that invariably violates one or more constraints. Extended thinking Claude explicitly tracks all constraints and checks each assignment against them.

When NOT to Use Extended Thinking

Extended thinking is overkill — and actively wasteful — for many common tasks. Here is when you should leave it off:

Simple writing tasks: Drafting emails, blog post outlines, social media copy, or other creative writing where the quality depends on style and fluency rather than logical accuracy
Factual lookups: “What year was Python released?” or “What is the population of Tokyo?” — these do not benefit from extended reasoning
Casual conversation: If you are using Claude as a brainstorming partner or sounding board, the added latency disrupts the conversational flow without improving quality
Bulk content generation: When you are generating many pieces of content (product descriptions, test data, translations), speed and cost matter more than per-item reasoning depth
Short Q&A: Questions with clear, unambiguous answers do not benefit from chain-of-thought reasoning

A practical heuristic: if a knowledgeable human could answer the question in under 30 seconds without needing to write anything down, standard mode is probably sufficient. If a human would need a whiteboard, a calculator, or a legal reference, extended thinking will help.

The STACK Framework for Using Extended Thinking Effectively

To get the most out of extended thinking, apply the STACK framework — a structured approach to building prompts that work especially well with Claude’s reasoning capabilities:

S — Situation: Provide detailed context about the problem domain. Extended thinking uses this context during its reasoning phase, so richer context produces better reasoning. Instead of “Help me with my database,” say “I have a PostgreSQL 16 database with 500 million rows in the events table, partitioned by month, running on AWS RDS r6g.2xlarge instances with 64GB RAM.”

T — Task: Be explicit about what you need. Extended thinking allocates its reasoning budget based on perceived task complexity, so clearly stating “Design a migration strategy” versus “Give me some ideas” makes a significant difference in reasoning depth.

A — Action: Specify the reasoning approach you want. “Evaluate three alternative approaches before recommending one” or “Consider failure modes for each option” explicitly directs the thinking phase to cover ground it might otherwise skip.

C — Constraints: List hard requirements and soft preferences separately. Extended thinking is particularly good at respecting constraints when they are clearly enumerated rather than buried in prose.

K — Knowledge: Include any specialized information the model should factor in. Extended thinking can integrate reference material during its reasoning phase, so pasting relevant documentation, error logs, or data samples directly into the prompt produces dramatically better results than expecting Claude to rely on its training data alone.
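If you assemble prompts in code, the five STACK parts map naturally onto a small template. The sketch below is one possible layout. The `StackPrompt` class, its field names, and the section titles are our own illustration, not an Anthropic format.

```python
from dataclasses import dataclass, field


@dataclass
class StackPrompt:
    situation: str
    task: str
    action: str
    hard_constraints: list[str] = field(default_factory=list)
    soft_preferences: list[str] = field(default_factory=list)
    knowledge: str = ""

    def render(self) -> str:
        """Join the non-empty sections into one prompt string."""
        sections = [
            ("Situation", self.situation),
            ("Task", self.task),
            ("Action", self.action),
            ("Hard requirements", "\n".join(f"- {c}" for c in self.hard_constraints)),
            ("Soft preferences", "\n".join(f"- {p}" for p in self.soft_preferences)),
            ("Knowledge", self.knowledge),
        ]
        # Skip empty sections so the prompt stays compact.
        return "\n\n".join(f"{title}:\n{body}" for title, body in sections if body)


prompt = StackPrompt(
    situation="PostgreSQL 16 on AWS RDS, 500 million rows in the events table, partitioned by month.",
    task="Design a migration strategy.",
    action="Evaluate three alternative approaches before recommending one.",
    hard_constraints=["Zero downtime"],
    soft_preferences=["Keep infrastructure cost flat"],
).render()
print(prompt)
```

Keeping hard requirements and soft preferences in separate lists mirrors the advice under C, and makes it harder to bury a constraint in prose.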

Extended Thinking vs. Standard Mode: A Direct Comparison

To illustrate the practical difference, here is a side-by-side comparison across several task types. These reflect real testing results, not theoretical claims:

Task Type	Standard Mode	Extended Thinking	Improvement
Competition math (AIME 2025)	16% accuracy	60%+ accuracy	3.75x better
Graduate science (GPQA Diamond)	65% accuracy	78% accuracy	+13 percentage points
Code bug detection (complex)	Finds ~40% of bugs	Finds ~75% of bugs	Nearly 2x detection rate
System design quality	Addresses 60% of requirements	Addresses 90%+ of requirements	Much more thorough
Simple writing tasks	High quality	High quality (slower)	No meaningful improvement
Factual Q&A	95%+ accuracy	95%+ accuracy (slower)	No meaningful improvement

The pattern is clear: extended thinking delivers its value on tasks with high reasoning complexity. For tasks that are primarily about language fluency, factual recall, or creative expression, it adds cost and latency without improving quality.

10 Extended Thinking Plays Most Users Have Not Tried

If you have only used Extended Thinking on a handful of hard problems, the 10 plays below show where else it tends to pay off.

1. Strategic-decision analysis with explicit trade-off enumeration

For decisions affecting direction (acquisitions, key hires, architecture choices), Extended Thinking enumerates trade-offs the standard mode would skip. The thinking trace itself is valuable.

2. Complex math and quantitative analysis

Financial modeling, statistical analysis, calculation-heavy proposals. Extended Thinking shows its work; errors become catchable.

3. Architecture-decision records (ADRs)

For software architecture decisions, Extended Thinking produces ADR-quality reasoning: context, considered options, decision, consequences. Documentation that future-you needs.

4. Complex debugging with multi-cause hypothesis

For bugs that could have multiple causes, Extended Thinking traces through each hypothesis and ranks them by likelihood. Diagnostic time drops materially for hard bugs.

5. Multi-step legal-document review

Reviewing a complex contract requires holding many clauses in mind simultaneously. Extended Thinking surfaces interaction effects between clauses that standard mode would miss. This is not legal advice; it is preparation for the conversation with your lawyer.

6. Research synthesis across many sources

Synthesizing 20 academic papers requires careful weighting and contradiction handling. Extended Thinking handles this better than standard-mode summarization.

7. Critical-thinking exercises and devil’s advocate

Ask Extended Thinking to argue the strongest case AGAINST your position. The deliberation surfaces real weaknesses you missed.

8. When NOT to use Extended Thinking

For simple text generation, brainstorming, casual chat, and code formatting, Extended Thinking is slower and more expensive with no benefit. Match mode to task.

9. Show-the-thinking transparency for stakeholders

For decisions affecting others, sharing the Extended Thinking trace shows the reasoning. Stakeholders trust the conclusion more when they see how you got there.

10. Cost-quality calibration over time

Track which decisions benefited most from Extended Thinking. Build personal heuristics for when the cost is worth it. Mode-selection becomes evidence-based.

Cost and Performance Considerations

As of March 2026, here are the practical cost implications of extended thinking across Claude models:

Claude Sonnet 4: Input tokens cost $3 per million, output tokens cost $15 per million. Thinking tokens are billed at the output rate. A typical extended thinking request might use 5,000 to 20,000 thinking tokens plus 1,000 to 4,000 response tokens, costing roughly $0.09 to $0.36 in output tokens per request compared to $0.015 to $0.06 for standard mode.

Claude Opus 4: Input tokens cost $15 per million, output tokens cost $75 per million. Extended thinking on Opus is the premium option: a heavy thinking request (50,000 thinking tokens) costs about $3.75 in thinking tokens alone. Reserve this for problems where accuracy is worth the investment.
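The figures above follow from simple arithmetic, reproduced in the sketch below so you can plug in your own token counts. The helper name is ours, the prices are the per-million output rates quoted above, and input-token cost is deliberately left out.

```python
def output_cost_usd(thinking_tokens: int, response_tokens: int, price_per_million: float) -> float:
    """Cost of the output side of one request; input tokens are ignored here."""
    return (thinking_tokens + response_tokens) * price_per_million / 1_000_000


SONNET_OUTPUT_PRICE = 15.0  # USD per million output tokens, as quoted above
OPUS_OUTPUT_PRICE = 75.0    # USD per million output tokens, as quoted above

sonnet_low = output_cost_usd(5_000, 1_000, SONNET_OUTPUT_PRICE)
sonnet_high = output_cost_usd(20_000, 4_000, SONNET_OUTPUT_PRICE)
opus_thinking_only = output_cost_usd(50_000, 0, OPUS_OUTPUT_PRICE)

print(f"Sonnet tier: ${sonnet_low:.2f} to ${sonnet_high:.2f} per request")
print(f"Opus tier, 50,000 thinking tokens: ${opus_thinking_only:.2f}")
```

Because thinking tokens are billed like any other output, the budget you set is effectively a cap on the most expensive part of the request.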

Latency: Standard Claude responses typically arrive in 2 to 8 seconds. Extended thinking adds 5 to 45 seconds depending on the thinking budget and problem complexity. For interactive applications, consider showing a “thinking” indicator to users, or stream the thinking tokens in real-time so users can follow along.

On Claude.ai: If you are on the Pro plan ($20/month) or Max plan ($100/month or $200/month), extended thinking is included in your subscription. The Max plan provides substantially higher rate limits, which matters if you are using extended thinking frequently throughout the day. Max plan users get roughly 5x the usage of Pro on the $100/month tier and roughly 20x on the $200/month tier, making Max the practical choice for power users who rely on extended thinking daily.

Limitations and Known Issues

Extended thinking is powerful but not perfect. Here are the current limitations you should be aware of:

Not available on all models: As of March 2026, extended thinking is supported on Claude 3.7 Sonnet and Claude 4 models. It is not available on Claude 3.5 Haiku or older models.
Cannot be combined with certain features: Extended thinking has compatibility restrictions with some API features. For example, you cannot use extended thinking together with forced tool use or pre-filled assistant responses in the same request. Check the official documentation for the current compatibility matrix.
Thinking content may not be cached: If you are using prompt caching to reduce costs, note that thinking blocks from previous turns cannot be cached in the same way as regular content. This means multi-turn conversations with extended thinking can be more expensive than expected.
Not a guarantee of correctness: Extended thinking improves accuracy significantly but does not eliminate errors. On AIME 2025, even with extended thinking, Claude still gets roughly 40% of problems wrong. Always verify critical outputs independently.
Thinking content can be verbose: The thinking block often contains repetitive self-checking and backtracking. This is by design (it improves accuracy), but it means the thinking output can be 5x to 20x longer than the final answer. Applications that display thinking content should make it collapsible or optional.
Budget tuning requires experimentation: There is no universal formula for setting the optimal thinking budget. Too low, and Claude’s reasoning gets truncated before reaching a conclusion. Too high, and you pay for unnecessary tokens. The right budget depends on the specific task and desired quality level.
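The two compatibility restrictions named in the list above are easy to check before a request leaves your application. The sketch below is a minimal pre-flight check under the assumption that requests are plain dicts shaped like the earlier API example. The function name is hypothetical, and the current rules should be confirmed against the official compatibility documentation.

```python
def compatibility_problems(request: dict) -> list[str]:
    """List the known extended-thinking conflicts found in a request body."""
    problems: list[str] = []
    if request.get("thinking", {}).get("type") != "enabled":
        return problems  # nothing to check when thinking is off

    # Forced tool use: tool_choice of type "any" or a specific named tool.
    tool_choice = request.get("tool_choice") or {}
    if tool_choice.get("type") in ("any", "tool"):
        problems.append("forced tool use")

    # A pre-filled response shows up as a trailing assistant message.
    messages = request.get("messages", [])
    if messages and messages[-1].get("role") == "assistant":
        problems.append("pre-filled assistant response")

    return problems


bad_request = {
    "thinking": {"type": "enabled", "budget_tokens": 10_000},
    "tool_choice": {"type": "any"},
    "messages": [
        {"role": "user", "content": "Summarize this contract."},
        {"role": "assistant", "content": "Summary:"},
    ],
}
print(compatibility_problems(bad_request))
```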

Practical Tips for Getting the Best Results

After extensive testing, here are the patterns that consistently produce the best outcomes with extended thinking:

Front-load your context. Put all relevant information (data, constraints, background) ahead of the question itself. The reasoning phase works from whatever is in the prompt, so a constraint that is missing or mentioned only in passing is easy for it to overlook.
Ask for explicit reasoning steps. Phrases like “Walk through your reasoning step by step” or “Consider at least three approaches before recommending one” give the thinking process useful structure.
Use structured output requests. When combined with extended thinking, asking for structured output (JSON, tables, numbered lists) produces notably more accurate results than asking for prose, because the structure forces the model to fill in specific fields rather than generating plausible-sounding text.
Set appropriate budgets. Start with 10,000 thinking tokens for medium-complexity tasks and adjust based on the quality of results. If the thinking block consistently hits the budget limit (visible as truncated reasoning), increase it. If the thinking block ends well before the limit, you can reduce it to save costs.
Review the thinking block for debugging. When Claude’s final answer is wrong or surprising, the thinking block usually reveals exactly where the reasoning went off track. This is invaluable for iterating on your prompts.
Combine with well-crafted prompts. Extended thinking amplifies the quality of good prompts more than it rescues bad ones. A vague prompt with extended thinking will produce better results than the same vague prompt without it, but a specific, well-structured prompt with extended thinking will produce dramatically better results than either.
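The budget-adjustment tip above can be written down as a simple rule. The sketch below is one possible heuristic, not a recommended formula. The thresholds, the doubling step, and the 25% headroom are arbitrary starting points. It assumes you already record how many thinking tokens each request used, and that 1,024 is the smallest budget the API accepts.

```python
def next_budget(current_budget: int, thinking_tokens_used: list[int],
                high_water: float = 0.95, low_water: float = 0.5,
                floor: int = 1_024) -> int:
    """Suggest a new budget from recent per-request thinking token counts."""
    if not thinking_tokens_used:
        return current_budget
    usage = sorted(tokens / current_budget for tokens in thinking_tokens_used)
    median = usage[len(usage) // 2]
    if median >= high_water:
        # Most requests run into the limit, so reasoning is probably being cut short.
        return current_budget * 2
    if median <= low_water:
        # Most requests stop early; trim, but keep headroom above the largest one seen.
        return max(floor, int(max(thinking_tokens_used) * 1.25))
    return current_budget


print(next_budget(10_000, [9_800, 9_900, 10_000]))  # raises the budget
print(next_budget(10_000, [2_000, 3_000, 2_500]))   # lowers the budget
```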

Extended Thinking in Production Applications

If you are building applications with the Claude API, here are architecture considerations for integrating extended thinking effectively:

Tiered reasoning approach: Route simple queries to standard Claude (fast, cheap) and complex queries to extended thinking (slower, more accurate). You can build a simple classifier — based on query length, keyword detection, or a lightweight model — that decides which mode to use for each request. This gives you the best of both worlds: low latency for easy questions and high accuracy for hard ones.
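A first version of such a classifier can be a few lines of keyword and length checks, as in the sketch below. The keyword list and the 120-word threshold are placeholders of ours. A production router would be tuned on your own traffic or replaced by a lightweight model.

```python
import re

# Words that often signal multi-step reasoning; an illustrative list, not a tested one.
REASONING_HINTS = re.compile(
    r"\b(prove|derive|design|architecture|optimi[sz]e|debug|trade-?offs?|constraints?|complexity)\b",
    re.IGNORECASE,
)


def choose_mode(query: str, long_query_words: int = 120) -> str:
    """Return 'extended' or 'standard' using crude length and keyword signals."""
    word_count = len(query.split())
    if REASONING_HINTS.search(query) or word_count >= long_query_words:
        return "extended"
    return "standard"


print(choose_mode("What year was Python released?"))
print(choose_mode("Design a real-time notification system for 50 million users"))
```

Misrouting an easy question to extended thinking only costs time and tokens, while misrouting a hard one to standard mode costs accuracy, so it is reasonable to bias the rules toward "extended" when in doubt.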

Streaming the thinking process: The API supports streaming extended thinking tokens in real time. For user-facing applications, streaming the thinking block gives users immediate feedback that the model is working on their problem. Research from the Nielsen Norman Group suggests that users tolerate longer wait times when they can see progress, making streaming thinking blocks a UX improvement even beyond the accuracy benefits.
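To show what handling such a stream might look like, the sketch below separates thinking text from answer text as events arrive. It runs on hand-written sample events. The event and delta type names (`content_block_delta`, `thinking_delta`, `text_delta`) reflect our understanding of the API's streaming format and should be checked against the current documentation.

```python
def render_stream(events):
    """Yield (channel, text) pairs; channel is 'thinking' or 'answer'."""
    for event in events:
        if event.get("type") != "content_block_delta":
            continue  # start/stop and other bookkeeping events carry no text
        delta = event.get("delta", {})
        if delta.get("type") == "thinking_delta":
            yield "thinking", delta.get("thinking", "")
        elif delta.get("type") == "text_delta":
            yield "answer", delta.get("text", "")


# Hand-written sample events, not captured from a real stream.
sample_events = [
    {"type": "content_block_start", "index": 0},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "thinking_delta", "thinking": "Check edge cases. "}},
    {"type": "content_block_stop", "index": 0},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "text_delta", "text": "Done."}},
]

chunks = list(render_stream(sample_events))
for channel, text in chunks:
    print(f"[{channel}] {text}")
```

A UI could route the "thinking" channel into a collapsible panel and the "answer" channel into the main message area.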

Logging and monitoring: Extended thinking tokens are a goldmine for understanding how your AI system reasons about user queries. Log the thinking blocks (or summaries of them) alongside final responses. Over time, this data reveals patterns — which types of queries produce the most reasoning, where the model commonly struggles, and how thinking quality correlates with user satisfaction. This feedback loop is how you continuously improve your AI application.

Error handling: Extended thinking requests can time out if the budget is very large and the model’s reasoning takes too long. Implement proper timeout handling and consider offering a “quick mode” fallback that uses standard Claude if the extended thinking request fails or times out.
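A generic version of that fallback is sketched below. It wraps any two callables, so `slow_extended` and `quick_standard` are stand-ins for whatever functions call the API in your code. In practice you would likely also set the timeout option on your HTTP client, since the thread-based timeout here stops waiting for the request but does not cancel it.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def answer_with_fallback(prompt, extended_call, standard_call, timeout_s=60.0):
    """Try extended_call first; fall back to standard_call on timeout or error."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(extended_call, prompt)
    try:
        return "extended", future.result(timeout=timeout_s)
    except Exception:
        # Covers both the timeout raised by result() and errors raised by the call.
        return "standard", standard_call(prompt)
    finally:
        # Do not block on the abandoned call; it finishes in the background.
        pool.shutdown(wait=False)


# Stand-in callables for demonstration only.
def slow_extended(prompt):
    time.sleep(0.3)
    return "deep answer"


def quick_standard(prompt):
    return "quick answer"


mode, reply = answer_with_fallback("q", slow_extended, quick_standard, timeout_s=0.05)
print(mode, reply)
```

Returning the mode alongside the reply lets the caller log how often the fallback fires, which is a useful signal that the budget or timeout needs adjusting.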

The Future of AI Reasoning

Extended thinking represents a broader shift in how AI systems approach complex problems. Rather than relying solely on pattern matching from training data, models are increasingly given computational space to reason at inference time, an approach sometimes called “test-time compute.” The idea is simple: letting a model “think longer” on hard problems, just as humans do, can produce better results than simply making the model bigger.

Anthropic is not alone in this direction. OpenAI’s o1 and o3 models use a similar approach, and Google’s Gemini has introduced “Deep Think” mode. The competitive landscape suggests that reasoning capabilities will continue to advance rapidly through 2026 and beyond. For users and developers, the practical takeaway is that learning to effectively use extended thinking now builds skills that will transfer to increasingly powerful reasoning systems in the future.

According to a 2026 Stanford HAI report, test-time compute scaling — the technical basis for features like extended thinking — may deliver more capability gains per dollar than traditional model scaling within the next two years. This makes extended thinking not just a useful feature today, but a preview of how AI systems will fundamentally operate going forward.

Frequently Asked Questions

Is extended thinking available on the free Claude plan?

No. As of March 2026, extended thinking is available on Claude Pro ($20/month), Max ($100/month and $200/month), and Team/Enterprise plans. Free tier users can access standard Claude models without extended thinking. If you want to test extended thinking before committing to a subscription, Anthropic occasionally offers limited free trials of Pro features. Through the API, extended thinking is available on a pay-per-use basis with no subscription requirement — you only pay for the tokens you consume, making it accessible for developers who want to experiment without a monthly commitment.

How much does extended thinking cost compared to regular API usage?

Extended thinking tokens are billed at the same rate as output tokens for the model you are using. For Claude Sonnet 4, that means $15 per million thinking tokens. A typical extended thinking request uses between 5,000 and 50,000 thinking tokens, adding $0.075 to $0.75 per request on top of the normal response cost. For Claude Opus 4, thinking tokens cost $75 per million. The total cost of an extended thinking request depends heavily on the thinking budget you set and the complexity of the task. As a rule of thumb, expect extended thinking to increase per-request costs by roughly 2x to 10x compared to standard mode for the same query.

Can I see what Claude is thinking during extended thinking?

Yes. On Claude.ai, the thinking process appears as an expandable section above the final response. You can click to expand it and read through Claude’s full reasoning chain in real time as it generates. Through the API, the thinking content is returned as a separate “thinking” content block in the response object. You can display this to users, log it for debugging, or discard it — the choice is yours. The thinking content is often quite verbose and includes backtracking and self-correction, which is normal and expected. This transparency is one of extended thinking’s most valuable features for building trustworthy AI applications.

Does extended thinking work with Claude’s other features like artifacts and projects?

Extended thinking works alongside most of Claude’s features, but there are some restrictions. On Claude.ai, you can use extended thinking within Projects (using project knowledge as context for reasoning) and it works with Artifacts (Claude will think through the design before generating code or documents). Through the API, extended thinking cannot be combined with forced tool use (tool_choice: “any” or a specific tool), pre-filled assistant turns, or certain streaming configurations. It does work with standard tool use, system prompts, multi-turn conversations, and prompt caching for input tokens. Check Anthropic’s documentation for the most current compatibility details, as these restrictions may change with future updates.

When should I use extended thinking versus just writing a better prompt?

Better prompts and extended thinking are complementary, not substitutes. A well-structured prompt with clear context, constraints, and examples will always outperform a vague prompt — with or without extended thinking. However, there are tasks where even the best possible prompt in standard mode cannot match extended thinking’s performance. These are tasks that require genuine multi-step reasoning: mathematical proofs, complex code architecture, legal analysis with multiple competing rules, and scientific hypothesis evaluation. If your task involves holding many pieces of information in working memory and reasoning about their relationships, extended thinking provides capability that prompt engineering alone cannot replicate. Start by improving your prompt. If the results are still not meeting your needs on reasoning-heavy tasks, enable extended thinking. The combination of a strong prompt and extended thinking consistently produces the best results across all tested benchmarks.

Start Thinking Deeper with Claude

Extended thinking is Claude’s most powerful feature for tasks that require genuine reasoning. Whether you are solving complex math, designing system architectures, or analyzing multi-faceted legal questions, enabling extended thinking transforms Claude from a fast pattern-matcher into a methodical problem solver. The key is using it strategically — on the tasks where reasoning depth matters, not on every interaction.

Want to master every Claude feature, not just extended thinking? Our Claude Essentials guide covers everything from basic setup to advanced prompt engineering, extended thinking strategies, and API integration patterns — all in one downloadable resource.

Sources: Grokipedia — Chain-of-Thought Prompting | Anthropic — Extended Thinking Documentation | Stanford HAI — AI Index Report 2026

Sources

This article draws on official documentation, product pages, and industry reporting. Specific sources are linked inline throughout the text.

Last reviewed: April 2026
