# Exercise 04: Model Selection & Routing

## Goal
Develop intuition for choosing the right Claude model (Haiku, Sonnet, or Opus) and build a simple routing system that automatically assigns tasks to the appropriate tier.

## Background
Model selection is an economic and quality decision. Haiku costs ~25x less than Opus but handles simple tasks just as well. The skill is knowing which tasks are "simple" and which genuinely require Opus's additional capability.

---

## Part A: Model Selection Scenarios

For each scenario below, choose the best model (Haiku, Sonnet, or Opus) and write a 2-3 sentence justification. Include an estimated cost per call using these rough figures:

- **Haiku:** $0.001 per typical call (1K input + 200 output tokens)
- **Sonnet:** $0.010 per typical call
- **Opus:** $0.050 per typical call

---

**Scenario 1: Spam Filter**
A SaaS platform runs each incoming customer support email through Claude to decide if it's spam (yes/no). Volume: 50,000 emails/day.

- Your model choice: ___________
- Justification:
- Estimated daily cost at this volume:

---

**Scenario 2: Database Schema Design**
A startup CTO asks Claude to design the initial database schema for a multi-tenant SaaS product with complex billing, RBAC, and audit trail requirements. This is a one-off task.

- Your model choice: ___________
- Justification:
- Estimated cost for this task (assume 2K input, 2K output):

---

**Scenario 3: Commit Message Generator**
A git hook calls Claude after every commit to generate a conventional commit message from the diff. Volume: ~30 commits/day across a team of 5.

- Your model choice: ___________
- Justification:
- Estimated monthly cost:

---

**Scenario 4: Security Audit**
A fintech company needs a full security audit of a 3,000-line authentication module. A bug here means compromised user accounts.

- Your model choice: ___________
- Justification:
- Why is cost less important here than in other scenarios?

---

**Scenario 5: Real-Time Autocomplete**
An IDE plugin suggests the next line of code as the developer types. Response must appear within 300ms. Volume: ~500 suggestions/hour per active user.

- Your model choice: ___________
- Justification:
- What constraint eliminates other model options?

---

## Part B: Cost Comparison Exercise

You're building an internal tool that classifies code review comments into categories (bug, style, performance, security, praise). You'll run 200 classifications per day.

Fill in this table:

| Model | Cost per call | Daily cost | Monthly cost | Annual cost |
|-------|--------------|------------|--------------|-------------|
| Haiku | $0.001 | | | |
| Sonnet | $0.010 | | | |
| Opus | $0.050 | | | |

Now answer: What's the annual cost difference between always using Opus vs. always using Haiku for this task?

If Opus has 15% better classification accuracy, is it worth the premium? What does "better accuracy" actually mean for your users in this context?

---

## Part C: Build a Simple Router

Implement a Python function `route_model(task_description: str) -> str` that returns the appropriate model string based on the task.

Your router should:
- Return `"claude-haiku-4-5"` for simple, high-volume, or latency-sensitive tasks
- Return `"claude-sonnet-4-5"` for standard engineering tasks
- Return `"claude-opus-4-5"` for complex, high-stakes, or hard reasoning tasks

Use a keyword/heuristic approach:

```python
def route_model(task_description: str) -> str:
    """
    Route a task to the appropriate Claude model based on task characteristics.
    
    Returns: model string for use in API calls
    """
    task_lower = task_description.lower()
    
    # Haiku signals: high volume, simple, fast, classify, extract, format
    haiku_signals = [
        "classify", "categorize", "extract", "format", "convert",
        "summarize short", "real-time", "autocomplete", "spam",
        "thousands", "per second", "high volume"
    ]
    
    # Opus signals: complex, architecture, security audit, proof, algorithm design
    opus_signals = [
        # TODO: fill in your opus signals
    ]
    
    # TODO: implement the routing logic
    # Hint: check opus_signals first (conservative — only escalate when needed)
    # then check haiku_signals
    # default to sonnet
    
    pass


# Test your router with these inputs:
test_cases = [
    ("classify this support ticket as urgent or not", "haiku"),
    ("design the authentication architecture for our new microservices platform", "opus"),
    ("write a function to parse ISO 8601 dates in TypeScript", "sonnet"),
    ("real-time spell check for a text editor", "haiku"),
    ("prove why this distributed transaction protocol is safe under network partition", "opus"),
    ("generate a git commit message for these changes", "haiku"),
]

for task, expected in test_cases:
    result = route_model(task)
    status = "PASS" if expected in result else "FAIL"
    print(f"{status}: '{task[:50]}...' -> {result} (expected {expected})")
```

---

## Part D: Try-Cheap-Then-Escalate

Implement a `smart_call` function that:
1. Tries Haiku first
2. If the response contains low-confidence signals, retries with Sonnet
3. If Sonnet also seems uncertain, retries with Opus
4. Returns the final response and which model produced it

Low-confidence signals to detect: "I'm not sure", "I don't know", "I cannot", "unclear", "might be", "possibly"

```python
def smart_call(prompt: str, client) -> tuple[str, str]:
    """
    Try cheap models first, escalate only if needed.
    Returns: (response_text, model_used)
    """
    models = ["claude-haiku-4-5", "claude-sonnet-4-5", "claude-opus-4-5"]
    low_confidence_phrases = ["i'm not sure", "i don't know", "i cannot", "unclear", "might be", "possibly"]
    
    # TODO: implement the escalation loop
    pass
```

---

## Success Criteria

- [ ] All 5 scenarios answered with model choice, justification, and cost estimate
- [ ] Cost comparison table completed with annual figures
- [ ] `route_model` function implemented and passes at least 5 of 6 test cases
- [ ] `smart_call` function implemented with working escalation logic
- [ ] You've run your router against 3 real tasks from your own work

## Reflection Questions

- Were any scenarios where you genuinely couldn't decide between two models? What would help you decide?
- What's the biggest risk of being too aggressive about routing to Haiku?
- If you had to justify model spend to a manager, how would you frame the Haiku vs. Opus decision?
