Lesson 19: Choosing Your Model
The Three Tiers
Claude comes in three model tiers, each tuned for a different point on the cost, speed, and capability curve:
| Model | Best For | Relative Input Cost | Relative Output Cost | Speed |
|---|---|---|---|---|
| Haiku | Fast tasks, high volume, simple work | 1x (cheapest) | 1x (cheapest) | Fastest |
| Sonnet | Everyday coding, balanced performance | ~3x Haiku | ~3x Haiku | Fast |
| Opus | Complex reasoning, multi-file refactors | ~5x Haiku | ~5x Haiku | Moderate |
The default in Claude Code is Sonnet -- capable enough for most software development, efficient enough to use all day. Knowing when to reach for Haiku or Opus is what separates beginners from power users.
Switching Models in Claude Code
The `/model` Command
Type `/model` during any session to open the model picker. You can select from available aliases:

```
/model opus      # Switch to Opus
/model sonnet    # Switch to Sonnet
/model haiku     # Switch to Haiku
/model opusplan  # Opus for planning, Sonnet for execution
```

The change takes effect immediately -- you do not need to wait for the current response to finish.
The `--model` Flag
Set your model when launching Claude Code:
```
claude --model opus
claude --model sonnet
claude --model haiku
```

Keyboard Shortcut
Press Alt+P (Windows/Linux) or Option+P (macOS) to switch models without clearing your prompt.
Environment Variable
Set a persistent default:
```
export ANTHROPIC_MODEL=opus
```

The `opusplan` Mode
The `opusplan` alias is a hybrid approach that automatically switches models based on what you are doing:
- In plan mode -- Uses Opus for complex reasoning and architecture decisions
- In execution mode -- Switches to Sonnet for code generation and implementation
This gives you Opus-level reasoning where it matters most (planning, design) and Sonnet-level efficiency where speed matters more (writing code, running tests). It is a practical way to balance quality and cost without manual switching.
```
/model opusplan
```

Effort Levels (Opus)
Opus supports adaptive reasoning through effort levels, which control how deeply the model thinks before responding. Lower effort is faster and cheaper for straightforward tasks; higher effort provides deeper reasoning for complex problems.
Three levels are available: low, medium, and high (default).
Setting effort:
- In `/model`: use the left/right arrow keys to adjust the effort slider when Opus is selected
- Environment variable: `CLAUDE_CODE_EFFORT_LEVEL=low|medium|high`
- Settings file: set `effortLevel` in your settings
| Effort | When to Use |
|---|---|
| Low | Simple questions, boilerplate generation, quick lookups |
| Medium | Standard development tasks with moderate complexity |
| High (default) | Complex debugging, architecture decisions, subtle bugs |
Think of effort levels as a cost dial within Opus itself -- you get the same model, but you control how much reasoning it applies.
Extended Thinking
Extended thinking lets Claude reason step-by-step in a hidden scratchpad before producing its final answer. It is available on both Opus and Sonnet.
In Claude Code, toggle extended thinking with:
- `Alt+T` (Windows/Linux) or `Option+T` (macOS)
- The `/config` command to enable it by default
When enabled, thinking tokens are billed at output token rates. You can control the budget with the MAX_THINKING_TOKENS environment variable:
```
# Set a custom thinking token budget
export MAX_THINKING_TOKENS=10000

# Disable thinking entirely
export MAX_THINKING_TOKENS=0
```

Note: For Opus, thinking depth is controlled by effort levels instead, and `MAX_THINKING_TOKENS` is ignored unless set to `0` to disable thinking entirely.
Extended thinking is most valuable for complex multi-step reasoning, ambiguous problems, and debugging subtle issues. For routine tasks, it adds cost without meaningful quality improvement. See the Extended Thinking lesson for a deeper dive.
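In the API (as opposed to Claude Code), extended thinking is enabled per request. A minimal sketch using the Anthropic Python SDK's Messages API -- the model ID is a placeholder, and the helper itself is just an illustration:

```python
# Sketch: building a Messages API request with extended thinking enabled.
# Thinking tokens count toward max_tokens, so max_tokens must exceed the
# thinking budget to leave room for the final answer.

def build_thinking_request(prompt: str, budget_tokens: int = 10_000) -> dict:
    """Keyword arguments for anthropic.Anthropic().messages.create()."""
    return {
        "model": "claude-sonnet-4-6",         # placeholder -- check current IDs
        "max_tokens": budget_tokens + 4_000,  # budget plus room for the answer
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }

params = build_thinking_request("Why does this queue drop items under load?")
```

Pass the result to `client.messages.create(**params)`; the model's reasoning comes back as separate thinking content blocks ahead of the final text.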
Model Characteristics
Haiku: Speed and Efficiency
Haiku is the fastest and cheapest model. Use it when speed or volume matters more than peak intelligence.
Use Haiku for:
- Classifying or routing input ("is this a question or a command?")
- Summarizing short documents or log snippets
- Extracting structured data from text
- Generating boilerplate or repetitive code variations
- Quick explanations of simple concepts
- Any task you need to run thousands of times
Sonnet: The Everyday Workhorse
Sonnet handles the vast majority of real software development work well. It is the right default for most developers.
Use Sonnet for:
- Writing and reviewing code
- Implementing features with moderate complexity
- Multi-step reasoning that is not mathematically hard
- Interactive development workflows
- Drafting documentation
- Test generation and debugging
When in doubt, start with Sonnet.
Opus: Maximum Capability
Opus is the most capable model, earning its price premium on genuinely hard problems.
Use Opus for:
- Architectural decisions with many competing tradeoffs
- Debugging subtle, multi-system race conditions
- Complex algorithmic problems
- Synthesizing large, contradictory codebases
- Generating plans for major engineering efforts
- Tasks where a single wrong answer costs hours of human time
The key question: Is the task hard enough that the quality difference justifies the cost? For a one-off architectural review, Opus is often worth it. For daily commit messages, it never is.
Decision Framework
Ask these questions in order:
- Is this a simple extraction, classification, or formatting task? --> Haiku
- Am I running this at high volume or in a latency-sensitive pipeline? --> Haiku
- Is this standard software development work? --> Sonnet
- Is the problem genuinely hard, high-stakes, or in need of deep reasoning? --> Opus
- Do I need great planning but fast execution? --> `opusplan`
- Would a wrong answer cost more than the price difference? --> Opus
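The questions above can be encoded as an ordered routing function. This is only a sketch: the boolean task attributes are hypothetical labels, not part of any API.

```python
# Sketch: the decision framework as an ordered series of checks.
# The dict keys are illustrative; in practice you would derive these
# flags from your own task metadata.

def choose_model(task: dict) -> str:
    if task.get("simple_extraction"):        # extraction, classification, formatting
        return "haiku"
    if task.get("high_volume"):              # high volume or latency-sensitive
        return "haiku"
    if task.get("needs_planning_and_speed"): # great planning, fast execution
        return "opusplan"
    if task.get("genuinely_hard") or task.get("wrong_answer_costly"):
        return "opus"
    return "sonnet"                          # standard development work

choose_model({"simple_extraction": True})    # -> "haiku"
choose_model({})                             # -> "sonnet"
```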
Tip: Start every new use case on Sonnet. Move to Haiku once you have validated quality is acceptable. Move to Opus only when Sonnet demonstrably falls short.
Relative Cost Comparisons
The relative cost ratios between tiers are what matter most for planning:
- Opus costs ~5x Haiku on both input and output tokens
- Sonnet costs ~3x Haiku on both input and output tokens
- Opus costs ~1.7x Sonnet -- a modest premium for the most capable model
These ratios mean that running 1,000 requests through Opus costs about 5x what the same workload costs on Haiku. At scale, this adds up fast -- but for one-off complex tasks, the premium is often negligible.
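A back-of-envelope check of that 5x claim, using only the tier ratios above (absolute prices omitted; everything is measured in "Haiku units"):

```python
# Relative tier costs from the table above; input and output both scale
# by the same ratio, so a workload's cost scales linearly with the tier.

RATIOS = {"haiku": 1.0, "sonnet": 3.0, "opus": 5.0}

def workload_cost(model: str, requests: int, input_tokens: int, output_tokens: int) -> float:
    """Cost of a workload in Haiku units."""
    return requests * (input_tokens + output_tokens) * RATIOS[model]

# 1,000 requests, each with 2K input and 500 output tokens:
haiku = workload_cost("haiku", 1_000, 2_000, 500)
opus = workload_cost("opus", 1_000, 2_000, 500)
print(opus / haiku)                        # 5.0 -- same workload, five times the cost
print(RATIOS["opus"] / RATIOS["sonnet"])   # ~1.67 -- the Opus-over-Sonnet premium
```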
Other Cost Levers
Beyond model selection, several features affect your total cost:
| Feature | Effect on Cost |
|---|---|
| Extended thinking | Thinking tokens billed at output rates; can significantly increase per-request cost |
| Effort levels (Opus) | Lower effort = fewer thinking tokens = lower cost per request |
| Prompt caching | Cache hits cost ~10x less than base input tokens; huge savings for repeated context |
| Batch API | 50% discount on both input and output tokens for async workloads |
| 1M context window | Requests exceeding 200K tokens are billed at premium long-context rates |
Note: Check anthropic.com/pricing for current per-token rates. Absolute prices shift as new models are released, but the tier ratios remain a useful planning tool.
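To see how these levers compound, here is a rough per-request estimator. The per-token rates are placeholders, not real prices; the cache and batch multipliers (0.1x on cached input, 0.5x overall) mirror the table above.

```python
# Sketch: estimated cost of one request, combining the cost levers.
# Rates are illustrative units, not actual per-token prices.

def request_cost(input_tokens: int, output_tokens: int, *,
                 input_rate: float = 3.0, output_rate: float = 15.0,
                 cached_fraction: float = 0.0, batch: bool = False) -> float:
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    # Cache hits at ~10% of the base input rate; output is never cached
    cost = fresh * input_rate + cached * input_rate * 0.1 + output_tokens * output_rate
    # Batch API: 50% off the whole request
    return cost * 0.5 if batch else cost

base = request_cost(10_000, 1_000)                                # 45,000 units
mostly_cached = request_cost(10_000, 1_000, cached_fraction=0.9)  # 20,700 units
```

Caching 90% of a large prompt roughly halves this request's cost, and batching halves it again, without touching model choice at all.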
Task Routing in the API
When building applications with the API, route tasks to the cheapest model that can handle them rather than hardcoding a single model.
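The routing examples in this section assume a `call_claude` helper. One possible sketch, using the Anthropic Python SDK (`pip install anthropic`); splitting out request construction keeps the helper testable without a network call:

```python
# Sketch: a minimal call_claude helper. The task/content prompt shape and
# the max_tokens default are assumptions for illustration.

def build_request(model: str, task: str = "", content: str = "",
                  max_tokens: int = 1024) -> dict:
    """Assemble keyword arguments for messages.create()."""
    prompt = f"{task}\n\n{content}".strip()
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

def call_claude(model: str, **kwargs):
    import anthropic  # deferred so build_request stays dependency-free
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    return client.messages.create(**build_request(model, **kwargs))
```

Note that the escalation example below also reads `result.confidence`, which is not a raw SDK field; it assumes a wrapper that post-processes the response.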
Classify-Then-Route

```python
# Check docs.anthropic.com for latest model IDs
TASK_ROUTES = {
    "summarize": "claude-haiku-4-5",
    "classify": "claude-haiku-4-5",
    "implement_feature": "claude-sonnet-4-6",
    "code_review": "claude-sonnet-4-6",
    "architecture_review": "claude-opus-4-6",
    "debug_complex": "claude-opus-4-6",
}

def smart_route(task_type: str, **kwargs):
    # Fall back to Sonnet for unrecognized task types
    model = TASK_ROUTES.get(task_type, "claude-sonnet-4-6")
    return call_claude(model=model, **kwargs)
```

Try-Cheap-Then-Escalate
```python
def route_task(task: str, content: str):
    # Check docs.anthropic.com for latest model IDs
    result = call_claude(model="claude-haiku-4-5", task=task, content=content)
    if result.confidence < 0.8 or "I'm not sure" in result.text:
        # Escalate to Sonnet
        result = call_claude(model="claude-sonnet-4-6", task=task, content=content)
    return result
```

Practical Recommendations
For daily Claude Code usage:
| Situation | Model | Why |
|---|---|---|
| Exploring a new codebase | Sonnet | Fast answers, good comprehension |
| Writing a new feature | Sonnet | Balanced speed and quality |
| Complex multi-file refactor | Opus | Needs to reason across many files |
| Architectural planning | Opus or opusplan | Deep reasoning for design decisions |
| Quick "what does this do?" | Haiku or Sonnet | Speed matters more than depth |
| Debugging a subtle race condition | Opus (high effort) | Needs maximum reasoning capability |
| Generating test boilerplate | Sonnet or Haiku | Repetitive, pattern-based work |
| Long planning session with execution | opusplan | Best reasoning for plans, efficient execution |
For API development:
- Default to Sonnet for most endpoints
- Use Haiku for classification, routing, and extraction tasks
- Reserve Opus for your hardest problems or where errors are costly
- Implement routing logic rather than picking one model for everything
- Always validate quality before downgrading to a cheaper model
Key Takeaways
- Opus costs ~5x Haiku and ~1.7x Sonnet -- meaningful at scale, negligible for one-off tasks
- Sonnet is the right default for most developer workflows in Claude Code
- Opus shines on genuinely difficult, high-stakes, or complex reasoning problems
- Use `opusplan` to get Opus-quality planning with Sonnet-speed execution
- Effort levels let you tune Opus cost and speed without switching models
- Build routing logic into applications rather than hardcoding one model
- Check current pricing at anthropic.com/pricing -- absolute prices shift but tier ratios remain a useful guide