
Lesson 12: Choosing Your Model: Opus vs Sonnet vs Haiku

The Three Tiers

Anthropic's Claude comes in three model families, each tuned for a different point on the cost/speed/capability curve:

Model           Best For                             Relative Cost     Relative Speed
Claude Haiku    Speed, high volume, simple tasks     ~1x (cheapest)    Fastest
Claude Sonnet   Everyday coding and reasoning        ~5x               Fast
Claude Opus     Maximum capability, hard problems    ~25x              Slower

The default model you use in Claude Code (Sonnet) is a deliberate middle choice — capable enough for most software development work, cheap enough to use all day. But knowing when to reach for Haiku or Opus separates beginners from power users.


Claude Haiku: When Speed and Cost Win

Haiku is your workhorse for tasks where you need high volume, low latency, or tight budget control.

Use Haiku for:

  • Classifying or routing user input ("is this a question or a command?")
  • Summarizing short documents or log snippets
  • Extracting structured data from text (receipts, form inputs)
  • Generating boilerplate or repetitive code variations
  • Real-time autocomplete or suggestions
  • Tasks you'll run thousands of times per day

Example: You're building an app that classifies support tickets into categories. Running 10,000 tickets/day through Opus would cost ~25x more than Haiku for no meaningful quality improvement on a simple classification task.


Claude Sonnet: The Default Is Usually Right

Sonnet handles the vast majority of real software development work well. Debugging, refactoring, writing tests, explaining code, implementing features — all of this sits comfortably in Sonnet's capability range.

Use Sonnet for:

  • Writing and reviewing code
  • Implementing features with moderate complexity
  • Multi-step reasoning that isn't mathematically hard
  • Most interactive developer workflows
  • Drafting documentation

When in doubt, start here.


Claude Opus: When You Need Maximum Capability

Opus earns its price premium on genuinely hard problems where other models fall short.

Use Opus for:

  • Architectural decisions with many competing tradeoffs
  • Debugging subtle, multi-system race conditions
  • Complex algorithmic problems or proofs
  • Synthesizing information across large or contradictory codebases
  • Generating a plan for a week-long engineering effort
  • Tasks where a single wrong answer costs hours of human time

The key question: Is the task hard enough that the quality difference justifies 5-25x the cost? For a one-off architectural review, often yes. For daily commit messages, never.


Cost in Practice

At the time of writing (approximate; check Anthropic's pricing page for current rates):

Haiku:  ~$0.25 / 1M input tokens   |  ~$1.25 / 1M output tokens
Sonnet: ~$3.00 / 1M input tokens   |  ~$15.00 / 1M output tokens
Opus:   ~$15.00 / 1M input tokens  |  ~$75.00 / 1M output tokens

A typical code-review request might use ~2,000 input tokens and produce ~500 output tokens.

  • Haiku: ~$0.001 (a tenth of a cent)
  • Sonnet: ~$0.013
  • Opus: ~$0.068

Run that 500 times a month:

  • Haiku: ~$0.50
  • Sonnet: ~$6.50
  • Opus: ~$34

The math compounds fast at scale.
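
To see where these numbers come from, here is a small sketch that recomputes them from the approximate per-million-token rates above (substitute current prices from Anthropic's pricing page before relying on it):

# Approximate USD rates per million tokens, taken from this lesson
PRICES = {
    "haiku":  {"input": 0.25,  "output": 1.25},
    "sonnet": {"input": 3.00,  "output": 15.00},
    "opus":   {"input": 15.00, "output": 75.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars for a single request."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A typical code-review request: ~2,000 input tokens, ~500 output tokens
for model in PRICES:
    per_request = request_cost(model, 2_000, 500)
    print(f"{model:>6}: ${per_request:.4f} per request, ${per_request * 500:.2f} for 500/month")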


Task Routing Strategies

Rather than picking one model for everything, design systems that route tasks to the cheapest model that can handle them.

Try-Cheap-Then-Escalate

def route_task(task: str, content: str):
    # Attempt with Haiku first.
    # (call_claude is a project-specific wrapper, not part of the SDK;
    #  a sketch of one appears after the Classify-Then-Route example below.)
    result = call_claude(model="claude-haiku-4-5", task=task, content=content)

    # Escalate to Sonnet if the result signals uncertainty.
    # (The API returns no confidence score; this field comes from your wrapper.)
    if result.confidence < 0.8 or "I'm not sure" in result.text:
        result = call_claude(model="claude-sonnet-4-5", task=task, content=content)

    return result

Classify-Then-Route

# Map each task type to the cheapest model that handles it well
TASK_ROUTES = {
    "summarize": "claude-haiku-4-5",
    "classify": "claude-haiku-4-5",
    "implement_feature": "claude-sonnet-4-5",
    "architecture_review": "claude-opus-4-5",
    "debug_complex": "claude-opus-4-5",
}

def smart_route(task_type: str, **kwargs):
    # Unknown task types fall back to Sonnet, the safe default
    model = TASK_ROUTES.get(task_type, "claude-sonnet-4-5")
    return call_claude(model=model, **kwargs)
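
Both routing sketches assume a call_claude helper that this lesson never defines. Here is one minimal way such a wrapper might look on top of the Messages API; note that the API does not return a confidence score, so the confidence field below is a stand-in heuristic you would replace with your own check (for example, asking the model to rate its own certainty):

import anthropic
from dataclasses import dataclass

client = anthropic.Anthropic()

@dataclass
class ClaudeResult:
    text: str
    confidence: float  # not provided by the API; supplied by your own heuristic

def call_claude(model: str, task: str, content: str, max_tokens: int = 1024) -> ClaudeResult:
    """Hypothetical wrapper used by the routing examples above."""
    response = client.messages.create(
        model=model,
        max_tokens=max_tokens,
        messages=[{"role": "user", "content": f"{task}\n\n{content}"}],
    )
    text = response.content[0].text
    # Crude stand-in for a real confidence signal
    confidence = 0.5 if "I'm not sure" in text else 0.9
    return ClaudeResult(text=text, confidence=confidence)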

Specifying Models in API Calls

import anthropic

client = anthropic.Anthropic()

# Haiku for fast/cheap work
response = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=256,
    messages=[{"role": "user", "content": "Classify this support ticket: ..."}]
)

# Sonnet for everyday tasks
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Refactor this function to use async/await..."}]
)

# Opus for hard problems
response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=4096,
    messages=[{"role": "user", "content": "Review this distributed system design for consistency issues..."}]
)
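
In each case above, the generated text comes back inside the response's content blocks; for a plain text reply you would typically read the first block:

# The reply text lives in the first content block
print(response.content[0].text)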

Decision Framework

Ask these questions in order:

  1. Is this task repetitive at high volume? → Haiku
  2. Is this a simple extraction, classification, or formatting task? → Haiku
  3. Is this a real-time, latency-sensitive interaction? → Haiku
  4. Is this standard software development work? → Sonnet
  5. Is the problem genuinely hard or high-stakes? → Opus
  6. Would a wrong answer cost more than the price difference? → Opus

Tip: Start every new use case on Sonnet. Move to Haiku once you've validated quality is acceptable. Move to Opus only when Sonnet demonstrably fails.
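
If it helps to see the checklist as code, here is a minimal sketch, under the assumption that you can answer each question as a boolean flag (the flag names are invented for illustration):

def choose_model(
    high_volume: bool = False,
    simple_or_latency_sensitive: bool = False,
    standard_dev_work: bool = False,
    hard_or_high_stakes: bool = False,
) -> str:
    """Walk the decision questions in order; Sonnet is the fallback."""
    if high_volume or simple_or_latency_sensitive:   # questions 1-3
        return "claude-haiku-4-5"
    if standard_dev_work:                            # question 4
        return "claude-sonnet-4-5"
    if hard_or_high_stakes:                          # questions 5-6
        return "claude-opus-4-5"
    return "claude-sonnet-4-5"                       # when in doubt

# Example: a one-off architecture review is genuinely hard and high-stakes
choose_model(hard_or_high_stakes=True)  # -> "claude-opus-4-5"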


Key Takeaways

  • Haiku is ~25x cheaper than Opus — the cost difference is real and compounds at scale
  • Sonnet is the right default for most developer tasks
  • Opus shines on genuinely difficult, high-stakes, or complex reasoning problems
  • Build routing logic into agentic systems rather than hardcoding one model
  • Always validate quality before downgrading to a cheaper model
  • Check current pricing at anthropic.com — the numbers shift as models improve