
Lesson 19: Choosing Your Model

The Three Tiers

Claude comes in three model tiers, each tuned for a different point on the cost, speed, and capability curve:

| Model  | Best For                                | Relative Input Cost | Relative Output Cost | Speed    |
|--------|-----------------------------------------|---------------------|----------------------|----------|
| Haiku  | Fast tasks, high volume, simple work    | 1x (cheapest)       | 1x (cheapest)        | Fastest  |
| Sonnet | Everyday coding, balanced performance   | ~3x Haiku           | ~3x Haiku            | Fast     |
| Opus   | Complex reasoning, multi-file refactors | ~5x Haiku           | ~5x Haiku            | Moderate |

The default in Claude Code is Sonnet -- capable enough for most software development, efficient enough to use all day. Knowing when to reach for Haiku or Opus is what separates beginners from power users.


Switching Models in Claude Code

The `/model` Command

Type `/model` during any session to open the interactive model picker, or pass an alias directly:

/model opus       # Switch to Opus
/model sonnet     # Switch to Sonnet
/model haiku      # Switch to Haiku
/model opusplan   # Opus for planning, Sonnet for execution

The change takes effect immediately -- you do not need to wait for the current response to finish.

The `--model` Flag

Set your model when launching Claude Code:

Bash
claude --model opus
claude --model sonnet
claude --model haiku

Keyboard Shortcut

Press Alt+P (Windows/Linux) or Option+P (macOS) to switch models without clearing your prompt.

Environment Variable

Set a persistent default:

Bash
export ANTHROPIC_MODEL=opus

The `opusplan` Mode

The `opusplan` alias is a hybrid approach that automatically switches models based on what you are doing:

  • In plan mode -- Uses Opus for complex reasoning and architecture decisions
  • In execution mode -- Switches to Sonnet for code generation and implementation

This gives you Opus-level reasoning where it matters most (planning, design) and Sonnet-level efficiency where speed matters more (writing code, running tests). It is a practical way to balance quality and cost without manual switching.

/model opusplan

Effort Levels (Opus)

Opus supports adaptive reasoning through effort levels, which control how deeply the model thinks before responding. Lower effort is faster and cheaper for straightforward tasks; higher effort provides deeper reasoning for complex problems.

Three levels are available: low, medium, and high (default).

Setting effort:

  • In /model: Use left/right arrow keys to adjust the effort slider when Opus is selected
  • Environment variable: CLAUDE_CODE_EFFORT_LEVEL=low|medium|high
  • Settings file: Set effortLevel in your settings

| Effort         | When to Use                                              |
|----------------|----------------------------------------------------------|
| Low            | Simple questions, boilerplate generation, quick lookups  |
| Medium         | Standard development tasks with moderate complexity      |
| High (default) | Complex debugging, architecture decisions, subtle bugs   |

Think of effort levels as a cost dial within Opus itself -- you get the same model, but you control how much reasoning it applies.


Extended Thinking

Extended thinking lets Claude reason step-by-step in a hidden scratchpad before producing its final answer. It is available on both Opus and Sonnet.

In Claude Code, toggle extended thinking with:

  • Alt+T (Windows/Linux) or Option+T (macOS)
  • The /config command to enable it by default

When enabled, thinking tokens are billed at output token rates. You can control the budget with the MAX_THINKING_TOKENS environment variable:

Bash
# Set a custom thinking token budget
export MAX_THINKING_TOKENS=10000

# Disable thinking entirely
export MAX_THINKING_TOKENS=0

Note: For Opus, thinking depth is controlled by effort levels instead, and MAX_THINKING_TOKENS is ignored unless set to 0 to disable thinking entirely.

Extended thinking is most valuable for complex multi-step reasoning, ambiguous problems, and debugging subtle issues. For routine tasks, it adds cost without meaningful quality improvement. See the Extended Thinking lesson for a deeper dive.
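In API usage (as opposed to Claude Code's toggle), extended thinking is enabled per request via the `thinking` parameter. A minimal sketch of the request shape; the model ID and budget value here are illustrative:

```python
# Sketch: attach an extended-thinking budget to a Messages API request.
def with_thinking(params: dict, budget_tokens: int = 10_000) -> dict:
    """Return a copy of the request params with extended thinking enabled."""
    return {
        **params,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }

base = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 16_000,  # must exceed the thinking budget
    "messages": [{"role": "user", "content": "Find the race condition."}],
}
request = with_thinking(base)
```

The resulting dict can be passed straight to `client.messages.create(**request)`; thinking tokens then count toward output billing as described above.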


Model Characteristics

Haiku: Speed and Efficiency

Haiku is the fastest and cheapest model. Use it when speed or volume matters more than peak intelligence.

Use Haiku for:

  • Classifying or routing input ("is this a question or a command?")
  • Summarizing short documents or log snippets
  • Extracting structured data from text
  • Generating boilerplate or repetitive code variations
  • Quick explanations of simple concepts
  • Any task you need to run thousands of times
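For the question-vs-command routing above, a Haiku classification request might look like this hedged sketch (the model ID is an assumption; check docs.anthropic.com for current IDs):

```python
# Sketch: a minimal classification request aimed at Haiku.
def build_classification_request(text: str) -> dict:
    """Build Messages API params that ask Haiku to label a short input."""
    return {
        "model": "claude-haiku-4-5",
        "max_tokens": 10,  # one-word answer; keep output cost minimal
        "messages": [{
            "role": "user",
            "content": (
                "Classify the following input as QUESTION or COMMAND. "
                "Reply with one word only.\n\n" + text
            ),
        }],
    }

request = build_classification_request("deploy the staging branch")
```

Constraining the output to a single word keeps per-request cost near the floor, which is what makes running this thousands of times practical.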

Sonnet: The Everyday Workhorse

Sonnet handles the vast majority of real software development work well. It is the right default for most developers.

Use Sonnet for:

  • Writing and reviewing code
  • Implementing features with moderate complexity
  • Multi-step reasoning that is not mathematically hard
  • Interactive development workflows
  • Drafting documentation
  • Test generation and debugging

When in doubt, start with Sonnet.

Opus: Maximum Capability

Opus is the most capable model, earning its price premium on genuinely hard problems.

Use Opus for:

  • Architectural decisions with many competing tradeoffs
  • Debugging subtle, multi-system race conditions
  • Complex algorithmic problems
  • Synthesizing large, contradictory codebases
  • Generating plans for major engineering efforts
  • Tasks where a single wrong answer costs hours of human time

The key question: Is the task hard enough that the quality difference justifies the cost? For a one-off architectural review, Opus is often worth it. For daily commit messages, it never is.


Decision Framework

Ask these questions in order:

  1. Is this a simple extraction, classification, or formatting task? --> Haiku
  2. Am I running this at high volume or in a latency-sensitive pipeline? --> Haiku
  3. Is this standard software development work? --> Sonnet
  4. Is the problem genuinely hard, high-stakes, or does it require deep reasoning? --> Opus
  5. Do I need great planning but fast execution? --> opusplan
  6. Would a wrong answer cost more than the price difference? --> Opus

Tip: Start every new use case on Sonnet. Move to Haiku once you have validated quality is acceptable. Move to Opus only when Sonnet demonstrably falls short.
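The questions above can be sketched as a small routing function. The task-type names and flags here are illustrative, not an official taxonomy:

```python
# Sketch of the decision framework as code; the checks mirror the
# ordered questions above, falling through to Sonnet as the default.
def choose_model(task_type: str, high_volume: bool = False,
                 high_stakes: bool = False, needs_planning: bool = False) -> str:
    if task_type in {"extract", "classify", "format"}:
        return "haiku"     # question 1: simple mechanical work
    if high_volume:
        return "haiku"     # question 2: volume or latency dominates
    if high_stakes:
        return "opus"      # questions 4 and 6: hard or costly to get wrong
    if needs_planning:
        return "opusplan"  # question 5: great planning, fast execution
    return "sonnet"        # question 3: standard development work
```

For example, `choose_model("classify")` returns `"haiku"`, while `choose_model("debug", high_stakes=True)` returns `"opus"`.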


Relative Cost Comparisons

The relative cost ratios between tiers are what matter most for planning:

  • Opus costs ~5x Haiku on both input and output tokens
  • Sonnet costs ~3x Haiku on both input and output tokens
  • Opus costs ~1.7x Sonnet -- a modest premium for the most capable model

These ratios mean that running 1,000 requests through Opus costs about 5x what the same workload costs on Haiku. At scale, this adds up fast -- but for one-off complex tasks, the premium is often negligible.
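The arithmetic is straightforward; using the lesson's ratios with Haiku as the 1x baseline:

```python
# Relative workload cost using this lesson's tier ratios (Haiku = 1x).
# Absolute per-token prices change over time; the ratios are the point.
HAIKU, SONNET, OPUS = 1.0, 3.0, 5.0

requests = 1_000
print(requests * OPUS / (requests * HAIKU))  # Opus is ~5x Haiku at any volume
print(round(OPUS / SONNET, 2))               # the Opus premium over Sonnet
```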

Other Cost Levers

Beyond model selection, several features affect your total cost:

| Feature              | Effect on Cost                                                                       |
|----------------------|--------------------------------------------------------------------------------------|
| Extended thinking    | Thinking tokens billed at output rates; can significantly increase per-request cost  |
| Effort levels (Opus) | Lower effort = fewer thinking tokens = lower cost per request                        |
| Prompt caching       | Cache hits cost ~10x less than base input tokens; huge savings for repeated context  |
| Batch API            | 50% discount on both input and output tokens for async workloads                     |
| 1M context window    | Requests exceeding 200K tokens are billed at premium long-context rates              |

Note: Check anthropic.com/pricing for current per-token rates. Absolute prices shift as new models are released, but the tier ratios remain a useful planning tool.


Task Routing in the API

When building applications with the API, route tasks to the cheapest model that can handle them rather than hardcoding a single model.

Classify-Then-Route

Python
# Check docs.anthropic.com for latest model IDs
TASK_ROUTES = {
    "summarize": "claude-haiku-4-5",
    "classify": "claude-haiku-4-5",
    "implement_feature": "claude-sonnet-4-6",
    "code_review": "claude-sonnet-4-6",
    "architecture_review": "claude-opus-4-6",
    "debug_complex": "claude-opus-4-6",
}

def smart_route(task_type: str, **kwargs):
    # Fall back to Sonnet for unrecognized task types
    model = TASK_ROUTES.get(task_type, "claude-sonnet-4-6")
    return call_claude(model=model, **kwargs)

Try-Cheap-Then-Escalate

Python
def route_task(task: str, content: str):
    # Check docs.anthropic.com for latest model IDs
    result = call_claude(model="claude-haiku-4-5", task=task, content=content)

    # `confidence` is an application-level score (e.g. from a judge prompt
    # or your own heuristic), not a field the API returns
    if result.confidence < 0.8 or "I'm not sure" in result.text:
        # Escalate to Sonnet for a higher-quality answer
        result = call_claude(model="claude-sonnet-4-6", task=task, content=content)

    return result
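Both snippets assume a `call_claude` helper. A minimal sketch, wrapping the official Anthropic SDK (`pip install anthropic`); the `ClaudeResult` shape and its `confidence` field are application-level assumptions, not SDK types:

```python
# Hypothetical call_claude helper assumed by the routing snippets above.
from dataclasses import dataclass

@dataclass
class ClaudeResult:
    text: str
    model: str
    confidence: float = 1.0  # your own scoring, e.g. from a judge prompt

def build_request(model: str, task: str, content: str) -> dict:
    """Pure helper: assemble Messages API parameters."""
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": f"{task}:\n\n{content}"}],
    }

def call_claude(model: str, task: str = "", content: str = "", **kwargs) -> ClaudeResult:
    import anthropic  # reads ANTHROPIC_API_KEY from the environment
    client = anthropic.Anthropic()
    message = client.messages.create(**build_request(model, task, content))
    return ClaudeResult(text=message.content[0].text, model=model)
```

Keeping `build_request` pure makes the routing logic easy to unit-test without making network calls.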

Practical Recommendations

For daily Claude Code usage:

| Situation                             | Model             | Why                                          |
|---------------------------------------|-------------------|----------------------------------------------|
| Exploring a new codebase              | Sonnet            | Fast answers, good comprehension             |
| Writing a new feature                 | Sonnet            | Balanced speed and quality                   |
| Complex multi-file refactor           | Opus              | Needs to reason across many files            |
| Architectural planning                | Opus or opusplan  | Deep reasoning for design decisions          |
| Quick "what does this do?"            | Haiku or Sonnet   | Speed matters more than depth                |
| Debugging a subtle race condition     | Opus (high effort)| Needs maximum reasoning capability           |
| Generating test boilerplate           | Sonnet or Haiku   | Repetitive, pattern-based work               |
| Long planning session with execution  | opusplan          | Best reasoning for plans, efficient execution|

For API development:

  • Default to Sonnet for most endpoints
  • Use Haiku for classification, routing, and extraction tasks
  • Reserve Opus for your hardest problems or where errors are costly
  • Implement routing logic rather than picking one model for everything
  • Always validate quality before downgrading to a cheaper model

Key Takeaways

  • Opus costs ~5x Haiku and ~1.7x Sonnet -- meaningful at scale, negligible for one-off tasks
  • Sonnet is the right default for most developer workflows in Claude Code
  • Opus shines on genuinely difficult, high-stakes, or complex reasoning problems
  • Use opusplan to get Opus-quality planning with Sonnet-speed execution
  • Effort levels let you tune Opus cost and speed without switching models
  • Build routing logic into applications rather than hardcoding one model
  • Check current pricing at anthropic.com/pricing -- absolute prices shift but tier ratios remain a useful guide