Lesson 12: Choosing Your Model: Opus vs Sonnet vs Haiku
The Three Tiers
Anthropic's Claude comes in three model families, each tuned for a different point on the cost/speed/capability curve:
| Model | Best For | Relative Cost | Relative Speed |
|---|---|---|---|
| Claude Haiku | Speed, high volume, simple tasks | ~1x (cheapest) | Fastest |
| Claude Sonnet | Everyday coding and reasoning | ~12x | Fast |
| Claude Opus | Maximum capability, hard problems | ~60x | Slower |
The default model you use in Claude Code (Sonnet) is a deliberate middle choice — capable enough for most software development work, cheap enough to use all day. But knowing when to reach for Haiku or Opus separates beginners from power users.
Claude Haiku: When Speed and Cost Win
Haiku is your workhorse for tasks where you need high throughput, low latency, or tight budget control.
Use Haiku for:
- Classifying or routing user input ("is this a question or a command?")
- Summarizing short documents or log snippets
- Extracting structured data from text (receipts, form inputs)
- Generating boilerplate or repetitive code variations
- Real-time autocomplete or suggestions
- Tasks you'll run thousands of times per day
Example: You're building an app that classifies support tickets into categories. Running 10,000 tickets/day through Opus would cost ~60x more than Haiku with no meaningful quality improvement on a simple classification task.
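A minimal sketch of that classifier, assuming the official anthropic Python SDK; the CATEGORIES list and classify_ticket function are hypothetical, for illustration only:

```python
import anthropic

client = anthropic.Anthropic()

# Hypothetical category set for illustration.
CATEGORIES = ["billing", "bug_report", "feature_request", "account", "other"]

def classify_ticket(ticket_text: str) -> str:
    # Haiku handles simple, well-scoped classification cheaply.
    response = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=16,
        messages=[{
            "role": "user",
            "content": f"Classify this support ticket as one of: "
                       f"{', '.join(CATEGORIES)}. Reply with the category only.\n\n"
                       f"{ticket_text}",
        }],
    )
    return response.content[0].text.strip().lower()
```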
Claude Sonnet: The Default Is Usually Right
Sonnet handles the vast majority of real software development work well. Debugging, refactoring, writing tests, explaining code, implementing features — all of this sits comfortably in Sonnet's capability range.
Use Sonnet for:
- Writing and reviewing code
- Implementing features with moderate complexity
- Multi-step reasoning that isn't mathematically hard
- Most interactive developer workflows
- Drafting documentation
When in doubt, start here.
Claude Opus: When You Need Maximum Capability
Opus earns its price premium on genuinely hard problems where other models fall short.
Use Opus for:
- Architectural decisions with many competing tradeoffs
- Debugging subtle, multi-system race conditions
- Complex algorithmic problems or proofs
- Synthesizing large, contradictory codebases
- Generating a plan for a week-long engineering effort
- Tasks where a single wrong answer costs hours of human time
The key question: is the task hard enough that the quality difference justifies the price premium (~5x over Sonnet, ~60x over Haiku)? For a one-off architectural review, often yes. For daily commit messages, never.
Cost in Practice
At time of writing (approximate, check Anthropic's pricing page for current rates):
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Claude Haiku | ~$0.25 | ~$1.25 |
| Claude Sonnet | ~$3.00 | ~$15.00 |
| Claude Opus | ~$15.00 | ~$75.00 |
A typical code-review request might use ~2,000 input tokens and produce ~500 output tokens. Per request:
- Haiku: ~$0.001 (a tenth of a cent)
- Sonnet: ~$0.013
- Opus: ~$0.068
Run that 500 times a month:
- Haiku: ~$0.50
- Sonnet: ~$6.50
- Opus: ~$34
The math compounds fast at scale.
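This arithmetic is easy to script. A minimal estimator using the approximate rates from the table above (the PRICES table and helper name are illustrative; plug in current pricing):

```python
# Approximate per-1M-token rates from the table above (USD).
PRICES = {
    "haiku":  {"input": 0.25,  "output": 1.25},
    "sonnet": {"input": 3.00,  "output": 15.00},
    "opus":   {"input": 15.00, "output": 75.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    # Cost in USD for a single request at the rates above.
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A typical code review: 2,000 input tokens, 500 output tokens, 500x/month.
for model in PRICES:
    per_request = request_cost(model, 2_000, 500)
    print(f"{model}: ${per_request:.4f}/request, ${per_request * 500:.2f}/month")
```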
Task Routing Strategies
Rather than picking one model for everything, design systems that route tasks to the cheapest model that can handle them.
Try-Cheap-Then-Escalate
```python
def route_task(task: str, content: str):
    # Attempt with Haiku first.
    result = call_claude(model="claude-haiku-4-5", task=task, content=content)
    # Check whether the result signals uncertainty.
    if result.confidence < 0.8 or "I'm not sure" in result.text:
        # Escalate to Sonnet.
        result = call_claude(model="claude-sonnet-4-5", task=task, content=content)
    return result
```
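This snippet (and the next) assumes a call_claude helper that returns an object with .text and .confidence fields. The Messages API does not return a confidence score, so a real helper has to approximate one somehow; here is one hedged sketch that infers low confidence from hedging language in the reply:

```python
from dataclasses import dataclass
import anthropic

client = anthropic.Anthropic()

@dataclass
class ClaudeResult:
    text: str
    confidence: float  # heuristic, not an API field

# Phrases treated as uncertainty signals (illustrative, not exhaustive).
HEDGES = ("i'm not sure", "it's unclear", "i can't determine")

def call_claude(model: str, task: str, content: str) -> ClaudeResult:
    response = client.messages.create(
        model=model,
        max_tokens=1024,
        messages=[{"role": "user", "content": f"{task}\n\n{content}"}],
    )
    text = response.content[0].text
    # The API returns no confidence score, so approximate one
    # from hedging language in the reply.
    confidence = 0.5 if any(h in text.lower() for h in HEDGES) else 1.0
    return ClaudeResult(text=text, confidence=confidence)
```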
Classify-Then-Route
```python
TASK_ROUTES = {
    "summarize": "claude-haiku-4-5",
    "classify": "claude-haiku-4-5",
    "implement_feature": "claude-sonnet-4-5",
    "architecture_review": "claude-opus-4-5",
    "debug_complex": "claude-opus-4-5",
}

def smart_route(task_type: str, **kwargs):
    # Fall back to Sonnet for unrecognized task types.
    model = TASK_ROUTES.get(task_type, "claude-sonnet-4-5")
    return call_claude(model=model, **kwargs)
```
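In use, any task type missing from the table falls back to Sonnet, the safe default (changelog_text and source_comments are placeholder variables):

```python
# Known task type: routes to Haiku.
smart_route("summarize", task="Summarize this changelog", content=changelog_text)

# Unknown task type: falls back to claude-sonnet-4-5.
smart_route("translate", task="Translate these comments to English", content=source_comments)
```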
Specifying Models in API Calls
```python
import anthropic

client = anthropic.Anthropic()

# Haiku for fast/cheap work
response = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=256,
    messages=[{"role": "user", "content": "Classify this support ticket: ..."}],
)

# Sonnet for everyday tasks
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Refactor this function to use async/await..."}],
)

# Opus for hard problems
response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=4096,
    messages=[{"role": "user", "content": "Review this distributed system design for consistency issues..."}],
)
```
Decision Framework
Ask these questions in order:
1. Is this task repetitive at high volume? → Haiku
2. Is this a simple extraction, classification, or formatting task? → Haiku
3. Is this a real-time, latency-sensitive interaction? → Haiku
4. Is this standard software development work? → Sonnet
5. Is the problem genuinely hard or high-stakes? → Opus
6. Would a wrong answer cost more than the price difference? → Opus
Tip: Start every new use case on Sonnet. Move to Haiku once you've validated quality is acceptable. Move to Opus only when Sonnet demonstrably fails.
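One way to operationalize that tip, sketched with the call_claude helper from earlier: compare Haiku's answers against Sonnet's on a held-out sample before switching. Exact-match agreement is a crude proxy that only suits short, classification-style outputs; adapt the comparison for anything longer:

```python
def safe_to_downgrade(samples, task, threshold=0.95) -> bool:
    # Treat Sonnet as the quality baseline and Haiku as the candidate.
    agree = 0
    for content in samples:
        baseline = call_claude(model="claude-sonnet-4-5", task=task, content=content)
        candidate = call_claude(model="claude-haiku-4-5", task=task, content=content)
        # Exact match is a crude proxy; suits short classification outputs.
        agree += baseline.text.strip().lower() == candidate.text.strip().lower()
    return agree / len(samples) >= threshold
```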
Key Takeaways
- Haiku is ~60x cheaper than Opus per token, a difference that is real and compounds at scale
- Sonnet is the right default for most developer tasks
- Opus shines on genuinely difficult, high-stakes, or complex reasoning problems
- Build routing logic into agentic systems rather than hardcoding one model
- Always validate quality before downgrading to a cheaper model
- Check current pricing at anthropic.com — the numbers shift as models improve