Lesson 19: Choosing Your Model
The Three Tiers
Claude comes in three model tiers, each tuned for a different point on the cost, speed, and capability curve:
| Model | Best For | Relative Input Cost | Relative Output Cost | Speed |
|---|---|---|---|---|
| Haiku | Fast tasks, high volume, simple work | 1x (cheapest) | 1x (cheapest) | Fastest |
| Sonnet | Everyday coding, balanced performance | ~3x Haiku | ~3x Haiku | Fast |
| Opus | Complex reasoning, multi-file refactors | ~5x Haiku | ~5x Haiku | Moderate |
The default in Claude Code is Sonnet -- capable enough for most software development, efficient enough to use all day. Knowing when to reach for Haiku or Opus is what separates beginners from power users.
Switching Models in Claude Code
The `/model` Command
Type `/model` during any session to open the model picker. You can select from available aliases:

```
/model opus      # Switch to Opus
/model sonnet    # Switch to Sonnet
/model haiku     # Switch to Haiku
/model opusplan  # Opus for planning, Sonnet for execution
```

The change takes effect immediately -- you do not need to wait for the current response to finish.
The `--model` Flag
Set your model when launching Claude Code:
```
claude --model opus
claude --model sonnet
claude --model haiku
```

Keyboard Shortcut
Press Alt+P (Windows/Linux) or Option+P (macOS) to switch models without clearing your prompt.
Environment Variable
Set a persistent default:
```
export ANTHROPIC_MODEL=opus
```

The `opusplan` Mode
The `opusplan` alias is a hybrid approach that automatically switches models based on what you are doing:
- In plan mode -- Uses Opus for complex reasoning and architecture decisions
- In execution mode -- Switches to Sonnet for code generation and implementation
This gives you Opus-level reasoning where it matters most (planning, design) and Sonnet-level efficiency where speed matters more (writing code, running tests). It is a practical way to balance quality and cost without manual switching.
```
/model opusplan
```

Effort Levels (Opus)
Opus supports adaptive reasoning through effort levels, which control how deeply the model thinks before responding. Lower effort is faster and cheaper for straightforward tasks; higher effort provides deeper reasoning for complex problems.
Three levels are available: low, medium, and high (default).
Setting effort:
- In `/model`: use the left/right arrow keys to adjust the effort slider when Opus is selected
- Environment variable: `CLAUDE_CODE_EFFORT_LEVEL=low|medium|high`
- Settings file: set `effortLevel` in your settings
| Effort | When to Use |
|---|---|
| Low | Simple questions, boilerplate generation, quick lookups |
| Medium | Standard development tasks with moderate complexity |
| High (default) | Complex debugging, architecture decisions, subtle bugs |
Think of effort levels as a cost dial within Opus itself -- you get the same model, but you control how much reasoning it applies.
Extended Thinking
Extended thinking lets Claude reason step-by-step in a hidden scratchpad before producing its final answer. It is available on both Opus and Sonnet.
In Claude Code, toggle extended thinking with:
- `Alt+T` (Windows/Linux) or `Option+T` (macOS)
- The `/config` command to enable it by default
When enabled, thinking tokens are billed at output token rates. You can control the budget with the MAX_THINKING_TOKENS environment variable:
```
# Set a custom thinking token budget
export MAX_THINKING_TOKENS=10000

# Disable thinking entirely
export MAX_THINKING_TOKENS=0
```

Note: For Opus, thinking depth is controlled by effort levels instead, and `MAX_THINKING_TOKENS` is ignored unless set to `0` to disable thinking entirely.
Extended thinking is most valuable for complex multi-step reasoning, ambiguous problems, and debugging subtle issues. For routine tasks, it adds cost without meaningful quality improvement. See the Extended Thinking lesson for a deeper dive.
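In the API (as opposed to Claude Code), extended thinking is enabled per request. A minimal sketch using the Anthropic Python SDK's Messages API -- the model ID is a placeholder, and the helper itself is just an illustration:

```python
# Sketch: building a Messages API request with extended thinking enabled.
# Thinking tokens count toward max_tokens, so max_tokens must exceed the
# thinking budget to leave room for the final answer.

def build_thinking_request(prompt: str, budget_tokens: int = 10_000) -> dict:
    """Keyword arguments for anthropic.Anthropic().messages.create()."""
    return {
        "model": "claude-sonnet-4-6",         # placeholder -- check current IDs
        "max_tokens": budget_tokens + 4_000,  # budget plus room for the answer
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }

params = build_thinking_request("Why does this queue drop items under load?")
```

Pass the result to `client.messages.create(**params)`; the model's reasoning comes back as separate thinking content blocks ahead of the final text.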
Model Characteristics
Haiku: Speed and Efficiency
Haiku is the fastest and cheapest model. Use it when speed or volume matters more than peak intelligence.
Use Haiku for:
- Classifying or routing input ("is this a question or a command?")
- Summarizing short documents or log snippets
- Extracting structured data from text
- Generating boilerplate or repetitive code variations
- Quick explanations of simple concepts
- Any task you need to run thousands of times
Sonnet: The Everyday Workhorse
Sonnet handles the vast majority of real software development work well. It is the right default for most developers.
Use Sonnet for:
- Writing and reviewing code
- Implementing features with moderate complexity
- Multi-step reasoning that is not mathematically hard
- Interactive development workflows
- Drafting documentation
- Test generation and debugging
When in doubt, start with Sonnet.
Opus: Maximum Capability
Opus is the most capable model, earning its price premium on genuinely hard problems.
Use Opus for:
- Architectural decisions with many competing tradeoffs
- Debugging subtle, multi-system race conditions
- Complex algorithmic problems
- Synthesizing large, contradictory codebases
- Generating plans for major engineering efforts
- Tasks where a single wrong answer costs hours of human time
The key question: Is the task hard enough that the quality difference justifies the cost? For a one-off architectural review, Opus is often worth it. For daily commit messages, it never is.
Decision Framework
Ask these questions in order:
- Is this a simple extraction, classification, or formatting task? --> Haiku
- Am I running this at high volume or in a latency-sensitive pipeline? --> Haiku
- Is this standard software development work? --> Sonnet
- Is the problem genuinely hard, high-stakes, or in need of deep reasoning? --> Opus
- Do I need great planning but fast execution? --> `opusplan`
- Would a wrong answer cost more than the price difference? --> Opus
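The questions above can be encoded as an ordered routing function. This is only a sketch: the boolean task attributes are hypothetical labels, not part of any API.

```python
# Sketch: the decision framework as an ordered series of checks.
# The dict keys are illustrative; in practice you would derive these
# flags from your own task metadata.

def choose_model(task: dict) -> str:
    if task.get("simple_extraction"):        # extraction, classification, formatting
        return "haiku"
    if task.get("high_volume"):              # high volume or latency-sensitive
        return "haiku"
    if task.get("needs_planning_and_speed"): # great planning, fast execution
        return "opusplan"
    if task.get("genuinely_hard") or task.get("wrong_answer_costly"):
        return "opus"
    return "sonnet"                          # standard development work

choose_model({"simple_extraction": True})    # -> "haiku"
choose_model({})                             # -> "sonnet"
```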
Tip: Start every new use case on Sonnet. Move to Haiku once you have validated quality is acceptable. Move to Opus only when Sonnet demonstrably falls short.
Relative Cost Comparisons
The relative cost ratios between tiers are what matter most for planning:
- Opus costs ~5x Haiku on both input and output tokens
- Sonnet costs ~3x Haiku on both input and output tokens
- Opus costs ~1.7x Sonnet -- a modest premium for the most capable model
These ratios mean that running 1,000 requests through Opus costs about 5x what the same workload costs on Haiku. At scale, this adds up fast -- but for one-off complex tasks, the premium is often negligible.
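A back-of-envelope check of that 5x claim, using only the tier ratios above (absolute prices omitted; everything is measured in "Haiku units"):

```python
# Relative tier costs from the table above; input and output both scale
# by the same ratio, so a workload's cost scales linearly with the tier.

RATIOS = {"haiku": 1.0, "sonnet": 3.0, "opus": 5.0}

def workload_cost(model: str, requests: int, input_tokens: int, output_tokens: int) -> float:
    """Cost of a workload in Haiku units."""
    return requests * (input_tokens + output_tokens) * RATIOS[model]

# 1,000 requests, each with 2K input and 500 output tokens:
haiku = workload_cost("haiku", 1_000, 2_000, 500)
opus = workload_cost("opus", 1_000, 2_000, 500)
print(opus / haiku)                        # 5.0 -- same workload, five times the cost
print(RATIOS["opus"] / RATIOS["sonnet"])   # ~1.67 -- the Opus-over-Sonnet premium
```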
Other Cost Levers
Beyond model selection, several features affect your total cost:
| Feature | Effect on Cost |
|---|---|
| Extended thinking | Thinking tokens billed at output rates; can significantly increase per-request cost |
| Effort levels (Opus) | Lower effort = fewer thinking tokens = lower cost per request |
| Prompt caching | Cache hits cost ~10x less than base input tokens; huge savings for repeated context |
| Batch API | 50% discount on both input and output tokens for async workloads |
| 1M context window | Requests exceeding 200K tokens are billed at premium long-context rates |
Note: Check anthropic.com/pricing for current per-token rates. Absolute prices shift as new models are released, but the tier ratios remain a useful planning tool.
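To see how these levers compound, here is a rough per-request estimator. The per-token rates are placeholders, not real prices; the cache and batch multipliers (0.1x on cached input, 0.5x overall) mirror the table above.

```python
# Sketch: estimated cost of one request, combining the cost levers.
# Rates are illustrative units, not actual per-token prices.

def request_cost(input_tokens: int, output_tokens: int, *,
                 input_rate: float = 3.0, output_rate: float = 15.0,
                 cached_fraction: float = 0.0, batch: bool = False) -> float:
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    # Cache hits at ~10% of the base input rate; output is never cached
    cost = fresh * input_rate + cached * input_rate * 0.1 + output_tokens * output_rate
    # Batch API: 50% off the whole request
    return cost * 0.5 if batch else cost

base = request_cost(10_000, 1_000)                                # 45,000 units
mostly_cached = request_cost(10_000, 1_000, cached_fraction=0.9)  # 20,700 units
```

Caching 90% of a large prompt roughly halves this request's cost, and batching halves it again, without touching model choice at all.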
Task Routing in the API
When building applications with the API, route tasks to the cheapest model that can handle them rather than hardcoding a single model.
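The routing examples in this section assume a `call_claude` helper. One possible sketch, using the Anthropic Python SDK (`pip install anthropic`); splitting out request construction keeps the helper testable without a network call:

```python
# Sketch: a minimal call_claude helper. The task/content prompt shape and
# the max_tokens default are assumptions for illustration.

def build_request(model: str, task: str = "", content: str = "",
                  max_tokens: int = 1024) -> dict:
    """Assemble keyword arguments for messages.create()."""
    prompt = f"{task}\n\n{content}".strip()
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

def call_claude(model: str, **kwargs):
    import anthropic  # deferred so build_request stays dependency-free
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    return client.messages.create(**build_request(model, **kwargs))
```

Note that the escalation example below also reads `result.confidence`, which is not a raw SDK field; it assumes a wrapper that post-processes the response.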
Classify-Then-Route

```python
# Check docs.anthropic.com for latest model IDs
TASK_ROUTES = {
    "summarize": "claude-haiku-4-5",
    "classify": "claude-haiku-4-5",
    "implement_feature": "claude-sonnet-4-6",
    "code_review": "claude-sonnet-4-6",
    "architecture_review": "claude-opus-4-6",
    "debug_complex": "claude-opus-4-6",
}

def smart_route(task_type: str, **kwargs):
    # Fall back to Sonnet for unrecognized task types
    model = TASK_ROUTES.get(task_type, "claude-sonnet-4-6")
    return call_claude(model=model, **kwargs)
```

Try-Cheap-Then-Escalate
```python
def route_task(task: str, content: str):
    # Check docs.anthropic.com for latest model IDs
    result = call_claude(model="claude-haiku-4-5", task=task, content=content)
    if result.confidence < 0.8 or "I'm not sure" in result.text:
        # Escalate to Sonnet
        result = call_claude(model="claude-sonnet-4-6", task=task, content=content)
    return result
```

Practical Recommendations
For daily Claude Code usage:
| Situation | Model | Why |
|---|---|---|
| Exploring a new codebase | Sonnet | Fast answers, good comprehension |
| Writing a new feature | Sonnet | Balanced speed and quality |
| Complex multi-file refactor | Opus | Needs to reason across many files |
| Architectural planning | Opus or opusplan | Deep reasoning for design decisions |
| Quick "what does this do?" | Haiku or Sonnet | Speed matters more than depth |
| Debugging a subtle race condition | Opus (high effort) | Needs maximum reasoning capability |
| Generating test boilerplate | Sonnet or Haiku | Repetitive, pattern-based work |
| Long planning session with execution | opusplan | Best reasoning for plans, efficient execution |
For API development:
- Default to Sonnet for most endpoints
- Use Haiku for classification, routing, and extraction tasks
- Reserve Opus for your hardest problems or where errors are costly
- Implement routing logic rather than picking one model for everything
- Always validate quality before downgrading to a cheaper model
Key Takeaways
- Opus costs ~5x Haiku and ~1.7x Sonnet -- meaningful at scale, negligible for one-off tasks
- Sonnet is the right default for most developer workflows in Claude Code
- Opus shines on genuinely difficult, high-stakes, or complex reasoning problems
- Use `opusplan` to get Opus-quality planning with Sonnet-speed execution
- Effort levels let you tune Opus cost and speed without switching models
- Build routing logic into applications rather than hardcoding one model
- Check current pricing at anthropic.com/pricing -- absolute prices shift but tier ratios remain a useful guide