Claude Code supports multiple models with different strengths, speeds, and costs. Picking the right model for the task avoids overpaying for simple work and underperforming on complex work.
Available models
| Model | Alias | Best for | Speed | Cost |
|---|---|---|---|---|
| Opus 4.6 | opus | Complex architecture, subtle bugs, multi-file reasoning | Slower | Highest |
| Sonnet 4.6 | sonnet | Daily coding, feature work, code review, most tasks | Fast | Moderate |
| Haiku 4.5 | haiku | Quick questions, simple generation, boilerplate | Fastest | Lowest |
Sonnet is the default and the right choice for most work. Use Opus when you need
Claude to hold more complexity in its head simultaneously. Use Haiku for tasks
where speed matters more than depth. You can also use full model IDs like claude-opus-4-6 or claude-sonnet-4-6 if you need a specific version.
Switching models
# During a session
/model sonnet
/model opus
# At launch
claude --model opus
claude --model sonnet
You can also press Option+P (macOS) or Alt+P (Linux/Windows) to switch models
without clearing your prompt.
Effort levels
Effort controls how much reasoning Claude does before responding. Lower effort means faster, cheaper responses. Higher effort means deeper analysis.
/effort low # Quick, surface-level responses
/effort medium # Balanced (default for most modes)
/effort high # Thorough analysis
/effort max # Maximum reasoning depth (Opus only)
/effort auto # Reset to the model's default effort level
When to change effort:
| Task | Recommended effort |
|---|---|
| Quick lookups and explanations | low |
| Writing tests, feature implementation | medium |
| Security review, complex debugging | high |
| Architecture design, multi-file reasoning | high or max |
Fast mode
/fast on # Toggle fast mode on
/fast off # Toggle fast mode off
Or press Option+O (macOS) / Alt+O (Linux/Windows) to toggle mid-session.
Fast mode is specific to Opus 4.6 — it makes Opus roughly 2.5x faster at a higher per-token cost. If you are on a different model when you enable fast mode, Claude Code switches you to Opus 4.6 automatically. Use it for iterative work where you are going back and forth rapidly — writing code, running tests, fixing errors. Turn it off when you want to save cost or need Claude to reason more carefully.
Tracking costs
Token usage in this session
/cost
Shows how many tokens you have used, the cost so far, and a breakdown by input vs. output tokens.
Plan limits and rate status
/usage
Shows your current plan’s limits and how much you have used. Useful for knowing if you are approaching rate limits.
Context window usage
/context
Visualizes how full your context window is. A full context window costs more per message because Claude processes the entire context with every response. See Sessions & Context for strategies to keep this manageable.
Budget guardrails
For automated or unattended runs, set hard limits:
# Stop after spending $5
claude -p "refactor the billing module" --max-budget-usd 5
# Stop after 10 agentic turns (prevents runaway loops)
claude -p "fix all test failures" --max-turns 10
Both flags only work in non-interactive (print/batch) mode (claude -p).
They are not available in interactive sessions. They are especially important
for CI/CD pipelines where a bug in the prompt could cause Claude to loop
indefinitely.
Cost-saving patterns
Right-size the model
Do not use Opus for writing boilerplate. Do not use Haiku for architecture review. Match the model to the task.
# Quick: generate a test file (Sonnet is fine)
claude -p "write tests for src/utils.ts" --model sonnet
# Deep: find a subtle concurrency bug (worth Opus)
claude -p "find race conditions in src/services/" --model opus
Compact regularly
A conversation at 80% context costs roughly 4x per message compared to 20%.
Use /compact to keep the window lean.
Clear between tasks
/clear is free and prevents paying to process irrelevant context from a
previous task.
Use effort levels
/effort low for quick lookups saves significant tokens compared to the
default. The response will be shorter and less thorough, but that is often
exactly what you want.
Batch mode for CI
In CI/CD, always set --max-turns and --max-budget-usd to prevent surprise costs:
claude -p "review staged changes" \
--max-turns 5 \
--max-budget-usd 2 \
--output-format json
Tips
- Start most sessions with Sonnet. Upgrade to Opus only when you notice Claude struggling with complexity.
/costis instant and non-disruptive — check it periodically during long sessions.- Effort levels have a bigger impact on cost than most people expect.
/effort lowfor a quick question can be 5-10x cheaper than/effort high. - Fast mode makes Opus 4.6 faster at higher cost. Use it liberally during implementation loops when speed matters more than cost.
--max-budget-usdis your safety net for any automated run. Set it even if you think the task is cheap.- When demonstrating Claude Code to leadership,
/costat the end of a session gives concrete numbers for ROI discussions.