Claude 3.7 Sonnet
Claude 3.7 Sonnet is the first hybrid reasoning model, a single model that delivers near-instant responses or extended step-by-step thinking with API-controllable thinking token budgets up to 128K tokens, scoring 63.7% on SWE-bench Verified and showing strong gains in coding and front-end web development.
import { streamText } from 'ai'
const result = streamText({ model: 'anthropic/claude-3.7-sonnet', prompt: 'Why is the sky blue?'})What To Consider When Choosing a Provider
Zero Data Retention
AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.Authentication
AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
Extended thinking mode is available on all Claude plans except the free tier. Confirm your plan supports it if you intend to use thinking token budgets in production.
When to Use Claude 3.7 Sonnet
Best For
Hard reasoning problems where you can trade latency for quality:
Extended thinking with a defined token budget lets you control the depth-speed tradeoff at the API level
Complex full-stack and multi-file coding tasks:
Cited by Cursor, Cognition, Replit, and Canva as strong
Front-end web development:
Called out as a strength area, with Canva reporting consistently production-ready code output and superior design taste
Agentic workflows requiring precise tool use:
Anthropic noted strong precision for complex agent execution
Tasks where you want a single model for both fast and deep responses:
Rather than maintaining two separate integrations
Consider Alternatives When
Fastest response without thinking overhead:
Claude 3.5 Haiku or a later Haiku variant suits high-throughput simple tasks better
Computer use and agent orchestration:
Later models improved substantially on capabilities for real software interfaces
1M token context window:
Claude Sonnet 4 and later Opus models introduced the 1M-token window for workloads that exceed this model's context
Cost-dominated workloads:
Tasks that don't benefit from extended thinking belong on cheaper models
Conclusion
Claude 3.7 Sonnet defined the hybrid reasoning category and remains a strong choice for workflows where switching between fast responses and deep multi-step reasoning within the same model is operationally valuable. Its coding and front-end web development capabilities set standards that later models built upon.
FAQ
Extended thinking triggers the model to reason step-by-step before producing its final response, with the reasoning process visible to the caller. You control the thinking token budget at the API level. Set it to any value from a small number up to 128K tokens (Anthropic's documented cap for that budget at launch), trading cost and latency for answer quality.
This page lists the current rates. Multiple providers can serve Claude 3.7 Sonnet, so AI Gateway surfaces live pricing rather than a single fixed figure.
Anthropic's philosophy is that reasoning should work like human reflection: the same system for both quick answers and deep analysis. Prompting patterns transfer between modes, so you don't need to maintain two separate model integrations.
Extended thinking provides a notable boost on math and physics problems specifically, on top of baseline improvements in those areas from the standard model.
It scored 63.7% on SWE-bench Verified at the time of the January 25, 2024 launch (70.3% with enhanced compute), using minimal scaffolding with just a bash tool and file editing tool.
The thinking mode improves performance across coding, math, physics, instruction following, and other task types. Anthropic deliberately focused on real-world tasks rather than competition-style math problems.
Claude Code launched as a limited research preview alongside Claude 3.7 Sonnet. It's a command-line agentic coding tool that searches and reads code, edits files, writes and runs tests, and commits to GitHub. It became generally available with Claude Opus 4.