Claude 3.5 Sonnet (2024-06-20)
Claude 3.5 Sonnet (2024-06-20), released in June 2024, was the first release in Anthropic's Claude 3.5 model family. It ran at twice the speed of Claude 3 Opus and set new industry benchmarks on graduate-level reasoning and coding proficiency, matching Opus-tier capability at Sonnet pricing.
import { streamText } from 'ai'

const result = streamText({
  model: 'anthropic/claude-3.5-sonnet-20240620',
  prompt: 'Why is the sky blue?',
})

Frequently Asked Questions
Why would I use this -20240620 checkpoint instead of anthropic/claude-3.5-sonnet (October 2024)?
Production systems validated against the June 2024 behavior sometimes need a pinned checkpoint to avoid unexpected behavioral drift. Regression test suites, evaluation harnesses, or contracts specifying a particular model version are common reasons.
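One way to enforce pinning in a test suite or config check is to reject model IDs that lack a date suffix. The sketch below is illustrative only: the helper name `isPinnedCheckpoint` is hypothetical, and the assumption that a pinned ID ends in an eight-digit `-YYYYMMDD` suffix is inferred from the two IDs mentioned on this page, not from a documented naming rule.

```typescript
// Model IDs as they appear on this page.
const PINNED_MODEL = 'anthropic/claude-3.5-sonnet-20240620';
const FLOATING_MODEL = 'anthropic/claude-3.5-sonnet';

// Hypothetical helper: treats an ID as pinned when it ends in "-YYYYMMDD".
// This suffix convention is an assumption, not a documented guarantee.
function isPinnedCheckpoint(modelId: string): boolean {
  return /-\d{8}$/.test(modelId);
}

// Throws early so that a regression suite never runs against a floating alias.
function requirePinned(modelId: string): string {
  if (!isPinnedCheckpoint(modelId)) {
    throw new Error(`Model ID "${modelId}" is not pinned to a dated checkpoint`);
  }
  return modelId;
}

const modelForRegressionTests = requirePinned(PINNED_MODEL);
```

Passing `modelForRegressionTests` as the `model` value keeps an evaluation harness tied to the June 2024 behavior, while a call such as `requirePinned(FLOATING_MODEL)` would fail before any request is sent.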
How did Claude 3.5 Sonnet (June 2024) compare to Claude 3 Opus on benchmarks?
Anthropic reported it outperformed Claude 3 Opus on GPQA (graduate-level reasoning), MMLU (undergraduate knowledge), and HumanEval (coding proficiency), operating at twice Opus's speed and at Sonnet-tier cost.
What was the internal agentic coding evaluation result?
Claude 3.5 Sonnet solved 64% of the problems in Anthropic's internal agentic coding evaluation, which tested fixing bugs and adding features to open-source codebases from natural-language descriptions. Claude 3 Opus solved 38%.
Does this version have computer use?
No. Computer use was introduced in the October 2024 upgrade (the non-dated claude-3.5-sonnet). This June 2024 checkpoint predates that capability.
What vision improvements did this model introduce over Claude 3?
Anthropic described it as outperforming Claude 3 Opus on standard vision benchmarks at the time, with particularly strong gains in chart and graph interpretation and in transcribing text from imperfect images.
What does the context window of 200K tokens enable for this model?
You can process entire large documents, codebases, or conversation histories in a single pass. Anthropic highlighted context-sensitive customer support across long histories and multi-file code analysis as key use cases.
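Before sending a large document in a single pass, it can help to estimate whether it fits in the 200K-token window. The sketch below is a rough pre-check, not a tokenizer: the ratio of about 4 characters per token is an assumed heuristic for English text, the 4,096-token output reservation is an arbitrary illustrative default, and the function names are hypothetical.

```typescript
// Context window size stated on this page.
const CONTEXT_WINDOW_TOKENS = 200_000;

// Assumption: roughly 4 characters per token for English prose.
// Real token counts vary by language and content; use a tokenizer for accuracy.
const ASSUMED_CHARS_PER_TOKEN = 4;

// Hypothetical helper: coarse token estimate from character length.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / ASSUMED_CHARS_PER_TOKEN);
}

// Hypothetical helper: checks whether the input plus a reserved output budget
// (an illustrative default of 4,096 tokens) stays within the window.
function fitsInContext(text: string, reservedOutputTokens = 4_096): boolean {
  return estimateTokens(text) + reservedOutputTokens <= CONTEXT_WINDOW_TOKENS;
}
```

Because the estimate is coarse, inputs that land near the limit are better split or measured with an actual token count than trusted to this check.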