Claude Opus vs Sonnet vs Haiku: Which to Use in 2026

Updated June 2026: The current Claude lineup is Opus 4.8 (most capable), Claude Sonnet 5 (the new default, near-Opus quality for less), and Haiku 4.5 (fastest and cheapest). Sonnet 5 has replaced Sonnet 4.6 as the default in the free and Pro apps. See our Claude Sonnet 5 guide for the latest, then read on for the full comparison.

AI Assistant Summary: Claude offers three current models in 2026: Opus 4.7 ($5 input / $25 output per million tokens) for the hardest reasoning and agentic coding, Sonnet 4.6 ($3 / $15) for the best balance of speed and intelligence, and Haiku 4.5 ($1 / $5) for the fastest responses at the lowest cost. Opus 4.7 and Sonnet 4.6 both offer 1M-token context windows; Haiku 4.5 offers 200K. Sonnet 4.6 is the right default for most production work. Pick Opus when reasoning quality matters more than cost. Pick Haiku when latency or volume dominates. All pricing verified against Anthropic’s official model documentation, May 2026.

Anthropic’s Claude family is structured as three tiers — named after literary forms in increasing complexity: Haiku (fast and short), Sonnet (the balanced middle), and Opus (the most expressive). As of May 2026, the current generation is Opus 4.7, Sonnet 4.6, and Haiku 4.5. This guide compares them head-to-head: what each is best at, what they cost, when to switch tiers, and which one most people should pick for a given job.

Table of Contents

The 30-second answer

Building a product or doing daily work in Claude.ai? Sonnet 4.6 is the default. Best price-to-performance for almost every task.
Hardest reasoning problems, agentic coding, multi-hour autonomous workflows? Opus 4.7. Reserve it for jobs where Sonnet hits a wall.
High-volume classification, light coding, fast chat, mobile use? Haiku 4.5. Cheapest and fastest with surprisingly strong intelligence.
Just want to try Claude in your browser for free? Pick none of these directly — sign up at claude.ai (the consumer app routes you automatically based on your Claude.ai plan).

Side-by-side comparison (May 2026)

Feature	Opus 4.7	Sonnet 4.6	Haiku 4.5
Best for	Hardest reasoning, agentic coding	Balanced default for most work	Fast, cheap, high-volume tasks
Input price	$5 / million tokens	$3 / million tokens	$1 / million tokens
Output price	$25 / million tokens	$15 / million tokens	$5 / million tokens
Context window	1M tokens (~555K words)	1M tokens (~750K words)	200K tokens (~150K words)
Max output	128K tokens	64K tokens	64K tokens
Extended thinking	No (uses adaptive)	Yes	Yes
Adaptive thinking	Yes	Yes	No
Knowledge cutoff	Jan 2026	Aug 2025	Feb 2025
API model ID	`claude-opus-4-7`	`claude-sonnet-4-6`	`claude-haiku-4-5`
Latency	Moderate	Fast	Fastest

Numbers verified May 2026 against Anthropic’s official model documentation. Pricing applies to direct Claude API usage; Amazon Bedrock and Google Vertex AI typically charge identical rates with their own surcharges. All three models support text and image input, multilingual output, and vision.

1-on-1 Coaching

Claude AI Crash Course

1-hour private video session with James. Walk through which Claude model to pick for your specific work, how to switch tiers cleanly, prompt patterns that work across all three, and the Claude.ai plan that fits. No technical background required.

$75

1-hour live

Book session →

Group Format

AI Workshops for Teams

Team-format workshops for businesses rolling Claude out to staff. We cover when each model is right for which team task. Best for businesses with 3+ people. Custom-built around your team’s actual tools.

Custom

pricing

Get a quote →

What is Claude Opus 4.7?

Opus 4.7 is Anthropic’s flagship as of May 2026. It is the model to pick when reasoning quality and agentic-task reliability matter more than cost or latency. Compared to its predecessor Opus 4.6, Anthropic describes it as “a step-change improvement in agentic coding,” particularly on multi-step autonomous workflows.

Concretely, Opus 4.7 excels at:

Long-running coding tasks where the model must plan, debug, and iterate over hours, not minutes. Code-agent benchmarks show meaningful gains over Sonnet on complex repositories.
Research synthesis across hundreds of pages of source material — the 1M-token context plus the deepest reasoning makes Opus the choice for literature reviews, due diligence, and policy analysis.
Multi-step planning where a single bad inference cascades into wasted compute. Opus pays for itself when wrong-step recovery would cost more than its premium price.
Difficult prose — legal drafting, technical specifications, sensitive writing. Opus catches nuance that Sonnet sometimes misses.
Mathematical and logical reasoning where the model needs to construct a proof or carefully verify a chain.

Where Opus is wrong: most chatbot tasks, summarization, classification, simple Q&A. You can do these on Opus — but you’ll pay 5x more than Haiku for an answer that isn’t 5x better. The premium is justified by harder tasks, not all tasks.

Opus 4.7 specs at a glance

Context window: 1,000,000 tokens (~555K words with Opus 4.7’s new tokenizer)
Max output: 128,000 tokens
Input: $5 per million tokens
Output: $25 per million tokens
Adaptive thinking: yes (model decides when to think harder)
Knowledge cutoff: January 2026
API ID: claude-opus-4-7

What is Claude Sonnet 4.6?

Sonnet 4.6 is the default model for most builders and most production workloads in 2026. Anthropic describes it as “the best combination of speed and intelligence,” and the price-to-performance ratio backs that up: Sonnet costs 40% of Opus on input and 60% on output, with comparable quality on the great majority of tasks.

Sonnet 4.6 excels at:

Production chat applications — customer support bots, internal Q&A, document assistants. Sonnet handles the volume without the Opus tax.
Code generation for everyday programming. Most engineering teams should reach for Sonnet first.
Document analysis and summarization with extended thinking turned on for complex sources.
Long-context retrieval — the 1M-token window lets Sonnet ingest entire codebases, books, or legal corpora.
Function calling / tool use — Sonnet’s tool-use reliability is at parity with Opus for most integrations.

Sonnet 4.6 also supports extended thinking, which Opus 4.7 does not (Opus uses adaptive thinking instead). For tasks where you specifically want explicit step-by-step reasoning that the model exposes, Sonnet is actually the better pick.

Sonnet 4.6 specs at a glance

Context window: 1,000,000 tokens (~750K words)
Max output: 64,000 tokens
Input: $3 per million tokens
Output: $15 per million tokens
Extended thinking: yes
Knowledge cutoff: August 2025
API ID: claude-sonnet-4-6

What is Claude Haiku 4.5?

Haiku 4.5 is the speed-and-cost-optimized model in the family. Anthropic positions it as “the fastest model with near-frontier intelligence,” which is the operative phrase: Haiku 4.5 is far smarter than what you historically expected from a small fast model.

Haiku 4.5 excels at:

High-volume classification — tagging support tickets, content moderation, intent detection, sentiment analysis. The $1/MTok input cost makes it economical at scale.
Real-time chat where latency matters more than depth. Mobile assistants, voice interfaces, customer-facing bots.
Embedded use cases — running inside other tools (autocomplete, smart suggest, transcript clean-up) where you need fast, predictable, cheap inference.
RAG retrieval & rerank — Haiku is fast enough to be the LLM in front of a vector database, summarizing retrieved chunks before passing to a smarter model.
Light coding — autocomplete, function suggestions, regex generation. Save Opus and Sonnet for harder coding work.

Where Haiku 4.5 is wrong: complex multi-step reasoning, long agentic workflows, anything requiring careful chain-of-thought across many steps. Haiku supports extended thinking, so you can push it on harder tasks, but the model’s ceiling is meaningfully below Sonnet’s.

Haiku 4.5 specs at a glance

Context window: 200,000 tokens (~150K words)
Max output: 64,000 tokens
Input: $1 per million tokens
Output: $5 per million tokens
Extended thinking: yes
Knowledge cutoff: February 2025
API ID: claude-haiku-4-5

Which Claude model should I pick? (by use case)

Specific scenarios with the right default model:

If you’re a solo developer building a side project

Start with Sonnet 4.6. The price-to-quality ratio is right for prototypes and small production traffic. Swap individual workflows up to Opus if and when Sonnet hits a wall, or down to Haiku for high-volume background tasks.

If you’re a small business owner using Claude.ai

You don’t directly pick the API model. Claude.ai’s Pro plan ($17-20/mo) routes you to whichever model fits the task, leaning Sonnet for most chat and Opus for harder requests. The Max plan unlocks more usage and longer outputs.

If you’re building a customer-facing chatbot

Haiku 4.5 for the fast path, Sonnet 4.6 for escalation. A common pattern: Haiku handles 80% of common queries instantly and cheaply; complex queries route to Sonnet. This routing alone can cut your bill 60-80% with minimal quality loss.

If you’re doing long-running agentic coding

Opus 4.7. The agentic-coding gains in 4.7 are specifically what justify the price for this work. Sonnet still ships features, but in multi-hour autonomous runs, Opus’s reliability compounds.

If you’re doing research synthesis across hundreds of documents

Sonnet 4.6 first, Opus 4.7 if quality matters more than cost. Both have 1M-token context. Sonnet handles the volume; Opus catches the subtle inferences Sonnet may miss. For high-stakes research (legal briefs, due diligence, policy analysis) the Opus premium is justified.

If you’re classifying support tickets at volume

Haiku 4.5. This is exactly the workload Haiku was built for. At $1/MTok input, you can classify millions of tickets a month for very little money.

If you’re a content team writing long-form articles

Sonnet 4.6. Strong long-form output, lower cost than Opus, and the 64K-token max output covers virtually any single-article writing task. Save Opus for the highest-stakes pieces.

If you’re embedding Claude inside a real-time tool

Haiku 4.5. Latency is dominant. Haiku’s response speed is what makes it feel native inside autocomplete, smart suggest, voice interfaces, and document overlays.

Real cost comparison: a typical workload

Worked example so the price differences are concrete. Suppose your application processes 10,000 customer queries per day. Each query averages 2,000 input tokens (a few documents and the user’s question) and 500 output tokens (the answer).

Daily volume: 20M input tokens + 5M output tokens.
Monthly volume (30 days): 600M input + 150M output.

Cost by model:

Model	Monthly input cost	Monthly output cost	Monthly total
Opus 4.7	$3,000	$3,750	$6,750
Sonnet 4.6	$1,800	$2,250	$4,050
Haiku 4.5	$600	$750	$1,350

Same workload — 5x the cost on Opus vs Haiku. For a chatbot where Haiku quality is sufficient, this is the easiest 80% cost cut you’ll find. For agentic coding where Opus reliability prevents wasted compute downstream, paying the premium often saves money on the other side. Match the model to the actual job.

These costs are gross. Anthropic offers further savings via prompt caching (cheap re-reads of repeated context) and the Batch API (50% discount for non-real-time requests). Production deployments routinely cut their bills 30-50% by combining both.

How do extended thinking and adaptive thinking differ?

One subtle difference in the lineup: Opus 4.7 uses adaptive thinking, while Sonnet 4.6 and Haiku 4.5 use extended thinking. Both let the model “think before answering” on hard problems, but the controls are different.

Extended thinking (Sonnet, Haiku): you turn it on per request. When enabled, the model produces an internal reasoning trace, then the final answer. You pay for the reasoning tokens. Good when you want to see the chain of thought.
Adaptive thinking (Opus 4.7): the model decides for itself how much thinking each request needs. Opus 4.7 doesn’t expose extended-thinking-style traces; instead, it allocates its compute internally. Better for production where you want consistent reliability without managing the thinking budget yourself.

Practically: if you want to inspect or constrain the reasoning, Sonnet’s extended thinking is the right surface. If you want the model to handle reasoning quality without intervention, Opus’s adaptive thinking is.

How do older Claude models compare?

Anthropic still serves several previous-generation models for backward compatibility. They are usable but not recommended for new builds; the current models are strictly better at the same or lower price.

Legacy model	Status	Input	Output
Claude Opus 4.6	Available; superseded by 4.7	$5	$25
Claude Sonnet 4.5	Available; superseded by 4.6	$3	$15
Claude Opus 4.5	Available; superseded by 4.7	$5	$25
Claude Opus 4.1	Available; older generation	$15	$75
Claude Sonnet 4 / Opus 4	Deprecated — retiring June 15, 2026	$3-15	$15-75

If you’re on Sonnet 4 or Opus 4, migrate before June 15, 2026. The new generation is materially better and Anthropic will retire the legacy IDs on that date.

For the individual model deep-dives, see Claude Opus 4.6 (still available, with 4.7 now superseding), Claude Sonnet 4.6, and Claude Haiku 4.5.

Can I switch models mid-conversation?

Yes — in the API. Each request specifies its own model ID, so you can route easy queries to Haiku and hard ones to Sonnet or Opus on a per-request basis. Pattern:

Router-as-Haiku: a fast Haiku call classifies the query first — “is this simple FAQ or complex?”
Worker-as-Sonnet-or-Opus: based on classification, the actual answer comes from the right tier.
This adds ~$0.01 per request for the Haiku classification but typically cuts overall costs by 50-80% on mixed workloads.

In Claude.ai (the consumer app), switching is more limited — the Pro and Max plans handle routing automatically. If you specifically want to choose the model, the API or the Workbench is the way.

Where do these models run?

All three models are available through:

Claude API directly from Anthropic (platform.claude.com). Cheapest, fastest, most features.
Amazon Bedrock — integration with AWS-native tooling, often required for enterprise procurement.
Google Vertex AI — integration with GCP-native tooling. Multi-region routing options.
Microsoft Foundry — integration with Azure and Microsoft 365 enterprise stack.
Claude.ai consumer app — web, iOS, Android, desktop, no API key needed.

For most developers, the direct Claude API is the right choice. The cloud-provider routes exist primarily for enterprise compliance and existing-procurement reasons.

Frequently asked questions

Is one Claude model “smarter” than another?

Yes, in measurable ways. Opus 4.7 outperforms Sonnet 4.6 on hard reasoning and long-horizon agentic tasks; Sonnet outperforms Haiku in turn. The gaps narrow on simple tasks — Haiku and Sonnet are nearly tied on light Q&A — and widen on complex reasoning, where Opus’s premium becomes obvious. Pick the smallest model that’s actually good enough for the job.

Which Claude model has the longest memory?

Both Opus 4.7 and Sonnet 4.6 offer a 1,000,000-token context window (roughly 555-750K words, depending on the language and tokenizer). Haiku 4.5 offers 200K tokens. For most use cases — entire books, codebases, multi-document research — Sonnet’s 1M is already more than enough.

Is Claude Haiku still worth using if Sonnet is the “balanced” option?

Absolutely — especially at scale. Haiku 4.5 is meaningfully smarter than Haiku 3.5, and at $1 input / $5 output per million tokens, it remains the cheapest serious model in the market. Anywhere you need fast latency, embedded use, or high-volume classification, Haiku still wins on cost-per-request.

Should I always use Opus for the highest quality?

No. The marginal quality of Opus over Sonnet is small on most tasks — chat, summarization, document Q&A, casual coding. Save Opus for tasks where the failure mode matters: agentic coding, multi-step planning, sensitive prose, hard reasoning. Otherwise you’re paying premium for parity.

How do I switch model in Claude.ai (the chat app)?

The Pro and Max plans route automatically. There is no per-message model picker for typical users. If you specifically need to choose a model on every request, use the Claude API or the Claude Console / Workbench.

Are there free credits to try the API?

Anthropic typically issues a small free credit on signup — enough to test all three models. The actual amount changes; check the Claude Console when you sign up. For practical exploration, the consumer Claude.ai is also free and now uses tiered routing under the hood.

Will newer Claude models replace these?

Eventually yes. Anthropic releases new model generations roughly every 6-12 months. Existing model IDs are pinned snapshots — claude-opus-4-7 will continue to work until Anthropic announces a deprecation timeline (typically 12+ months notice). Plan migrations but don’t panic when a new generation lands.

Where can I see what’s new in each model?

The clearest summary is in the official documentation at platform.claude.com/docs/about-claude/models/overview and the launch announcement at anthropic.com/news/claude-4. Our What’s New in Claude 2026 guide tracks generation-over-generation changes in plain English.

Stay current on Claude model releases

Get the day each new Claude model lands

Free daily newsletter — one story, one tool, one tip. Plain English, no jargon. One issue per day.

Free forever. Unsubscribe anytime.

Sources

Anthropic. “Models overview.” platform.claude.com/docs/about-claude/models/overview (verified May 14, 2026)
Anthropic. “Pricing.” claude.com/pricing (verified May 14, 2026)
Anthropic. “Migration guide: Opus 4.7.” platform.claude.com/docs/about-claude/models/migration-guide
Anthropic. “Claude 4 launch.” anthropic.com/news/claude-4
Anthropic. “Model deprecations.” platform.claude.com/docs/about-claude/model-deprecations

Power-User Claude Connectors

What Is ChatGPT? A Simple Guide

Best AI Prompts for Restaurants