
Lesson 12: Choosing Your Model: Opus vs Sonnet vs Haiku

The Three Tiers

Anthropic's Claude comes in three model families, each tuned for a different point on the cost/speed/capability curve:

Model           Best For                             Relative Cost     Relative Speed
Claude Haiku    Speed, high volume, simple tasks     ~1x (cheapest)    Fastest
Claude Sonnet   Everyday coding and reasoning        ~5x               Fast
Claude Opus     Maximum capability, hard problems    ~25x              Slower

The default model you use in Claude Code (Sonnet) is a deliberate middle choice — capable enough for most software development work, cheap enough to use all day. But knowing when to reach for Haiku or Opus separates beginners from power users.


Claude Haiku: When Speed and Cost Win

Haiku is your workhorse for tasks where you need high volume, low latency, or tight budget control.

Use Haiku for:

  • Classifying or routing user input ("is this a question or a command?")
  • Summarizing short documents or log snippets
  • Extracting structured data from text (receipts, form inputs)
  • Generating boilerplate or repetitive code variations
  • Real-time autocomplete or suggestions
  • Tasks you'll run thousands of times per day

Example: You're building an app that classifies support tickets into categories. Running 10,000 tickets/day through Opus would cost ~25x more than Haiku for no meaningful quality improvement on a simple classification task.


Claude Sonnet: The Default Is Usually Right

Sonnet handles the vast majority of real software development work well. Debugging, refactoring, writing tests, explaining code, implementing features — all of this sits comfortably in Sonnet's capability range.

Use Sonnet for:

  • Writing and reviewing code
  • Implementing features with moderate complexity
  • Multi-step reasoning that isn't mathematically hard
  • Most interactive developer workflows
  • Drafting documentation

When in doubt, start here.


Claude Opus: When You Need Maximum Capability

Opus earns its price premium on genuinely hard problems where other models fall short.

Use Opus for:

  • Architectural decisions with many competing tradeoffs
  • Debugging subtle, multi-system race conditions
  • Complex algorithmic problems or proofs
  • Synthesizing information across large or contradictory codebases
  • Generating a plan for a week-long engineering effort
  • Tasks where a single wrong answer costs hours of human time

The key question: Is the task hard enough that the quality difference justifies 5-25x the cost? For a one-off architectural review, often yes. For daily commit messages, never.


Cost in Practice

At the time of writing (approximate; check Anthropic's pricing page for current rates):

Haiku:  ~$0.25 / 1M input tokens   |  ~$1.25 / 1M output tokens
Sonnet: ~$3.00 / 1M input tokens   |  ~$15.00 / 1M output tokens
Opus:   ~$15.00 / 1M input tokens  |  ~$75.00 / 1M output tokens

A typical code-review request might use ~2,000 input tokens and produce ~500 output tokens.

  • Haiku: ~$0.001 (a tenth of a cent)
  • Sonnet: ~$0.013
  • Opus: ~$0.068

Run that 500 times a month:

  • Haiku: ~$0.50
  • Sonnet: ~$6.50
  • Opus: ~$34

The math compounds fast at scale.
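
To see where these numbers come from, here is a small sketch that recomputes them from the approximate per-million-token rates above (substitute current prices from Anthropic's pricing page before relying on it):

# Approximate USD rates per million tokens, taken from this lesson
PRICES = {
    "haiku":  {"input": 0.25,  "output": 1.25},
    "sonnet": {"input": 3.00,  "output": 15.00},
    "opus":   {"input": 15.00, "output": 75.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars for a single request."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A typical code-review request: ~2,000 input tokens, ~500 output tokens
for model in PRICES:
    per_request = request_cost(model, 2_000, 500)
    print(f"{model:>6}: ${per_request:.4f} per request, ${per_request * 500:.2f} for 500/month")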


Task Routing Strategies

Rather than picking one model for everything, design systems that route tasks to the cheapest model that can handle them.

Try-Cheap-Then-Escalate

def route_task(task: str, content: str):
    # Attempt with Haiku first.
    # (call_claude is a project-specific wrapper, not part of the SDK;
    #  a sketch of one appears after the Classify-Then-Route example below.)
    result = call_claude(model="claude-haiku-4-5", task=task, content=content)

    # Escalate to Sonnet if the result signals uncertainty.
    # (The API returns no confidence score; this field comes from your wrapper.)
    if result.confidence < 0.8 or "I'm not sure" in result.text:
        result = call_claude(model="claude-sonnet-4-5", task=task, content=content)

    return result

Classify-Then-Route

# Map each task type to the cheapest model that handles it well
TASK_ROUTES = {
    "summarize": "claude-haiku-4-5",
    "classify": "claude-haiku-4-5",
    "implement_feature": "claude-sonnet-4-5",
    "architecture_review": "claude-opus-4-5",
    "debug_complex": "claude-opus-4-5",
}

def smart_route(task_type: str, **kwargs):
    # Unknown task types fall back to Sonnet, the safe default
    model = TASK_ROUTES.get(task_type, "claude-sonnet-4-5")
    return call_claude(model=model, **kwargs)
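
Both routing sketches assume a call_claude helper that this lesson never defines. Here is one minimal way such a wrapper might look on top of the Messages API; note that the API does not return a confidence score, so the confidence field below is a stand-in heuristic you would replace with your own check (for example, asking the model to rate its own certainty):

import anthropic
from dataclasses import dataclass

client = anthropic.Anthropic()

@dataclass
class ClaudeResult:
    text: str
    confidence: float  # not provided by the API; supplied by your own heuristic

def call_claude(model: str, task: str, content: str, max_tokens: int = 1024) -> ClaudeResult:
    """Hypothetical wrapper used by the routing examples above."""
    response = client.messages.create(
        model=model,
        max_tokens=max_tokens,
        messages=[{"role": "user", "content": f"{task}\n\n{content}"}],
    )
    text = response.content[0].text
    # Crude stand-in for a real confidence signal
    confidence = 0.5 if "I'm not sure" in text else 0.9
    return ClaudeResult(text=text, confidence=confidence)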

Specifying Models in API Calls

import anthropic

client = anthropic.Anthropic()

# Haiku for fast/cheap work
response = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=256,
    messages=[{"role": "user", "content": "Classify this support ticket: ..."}]
)

# Sonnet for everyday tasks
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Refactor this function to use async/await..."}]
)

# Opus for hard problems
response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=4096,
    messages=[{"role": "user", "content": "Review this distributed system design for consistency issues..."}]
)
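
In each case above, the generated text comes back inside the response's content blocks; for a plain text reply you would typically read the first block:

# The reply text lives in the first content block
print(response.content[0].text)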

Decision Framework

Ask these questions in order:

  1. Is this task repetitive at high volume? → Haiku
  2. Is this a simple extraction, classification, or formatting task? → Haiku
  3. Is this a real-time, latency-sensitive interaction? → Haiku
  4. Is this standard software development work? → Sonnet
  5. Is the problem genuinely hard or high-stakes? → Opus
  6. Would a wrong answer cost more than the price difference? → Opus

Tip: Start every new use case on Sonnet. Move to Haiku once you've validated quality is acceptable. Move to Opus only when Sonnet demonstrably fails.
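
If it helps to see the checklist as code, here is a minimal sketch, under the assumption that you can answer each question as a boolean flag (the flag names are invented for illustration):

def choose_model(
    high_volume: bool = False,
    simple_or_latency_sensitive: bool = False,
    standard_dev_work: bool = False,
    hard_or_high_stakes: bool = False,
) -> str:
    """Walk the decision questions in order; Sonnet is the fallback."""
    if high_volume or simple_or_latency_sensitive:   # questions 1-3
        return "claude-haiku-4-5"
    if standard_dev_work:                            # question 4
        return "claude-sonnet-4-5"
    if hard_or_high_stakes:                          # questions 5-6
        return "claude-opus-4-5"
    return "claude-sonnet-4-5"                       # when in doubt

# Example: a one-off architecture review is genuinely hard and high-stakes
choose_model(hard_or_high_stakes=True)  # -> "claude-opus-4-5"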


Key Takeaways

  • Haiku is ~25x cheaper than Opus — the cost difference is real and compounds at scale
  • Sonnet is the right default for most developer tasks
  • Opus shines on genuinely difficult, high-stakes, or complex reasoning problems
  • Build routing logic into agentic systems rather than hardcoding one model
  • Always validate quality before downgrading to a cheaper model
  • Check current pricing at anthropic.com — the numbers shift as models improve