Skip to content
Dashboard

Claude 3.5 Haiku

Claude 3.5 Haiku was the model that first proved a Haiku-tier release could match Opus-class performance, scoring strong results on SWE-bench Verified and redefining expectations for what fast, affordable models could accomplish on real engineering tasks.

File InputTool UseVision (Image)Explicit Caching
index.ts
import { streamText } from 'ai'
const result = streamText({
model: 'anthropic/claude-3.5-haiku',
prompt: 'Why is the sky blue?'
})

Playground

Try out Claude 3.5 Haiku by Anthropic. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

anthropic logo
anthropic logo

Ask Claude 3.5 Haiku anything to try it out.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
ZDR
No Training
Release Date
Amazon Bedrock
200K
$0.80/M$4.00/M
Read:$0.08/M
Write:
$1/M
——
+2
11/04/2024
Google Vertex AI
200K
0.5s
54tps
$0.80/M$4.00/M
Read:$0.08/M
Write:
$1/M
$10.00/K
+ input costs
—
+2
11/04/2024
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by Anthropic

Model
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
Providers
ZDR
No Training
Release Date
1M
0.8s
110tps
$3.00/M$2.00/M
$15.00/M$10.00/M
Read:$0.3/M$0.2/M
Write:
$3.75/M$2.5/M
$10/K
+ input costs
—
+4
anthropic logo
bedrock logo
vertexAnthropic logo
06/29/2026
1M
2.4s
113tps
$5.00/MFast $10.00/M
$25.00/MFast $50.00/M
Read:$0.5/M
Write:
$6.25/M
$10/K
+ input costs
—
+4
anthropic logo
bedrock logo
vertexAnthropic logo
05/28/2026
1M
0.7s
92tps
$5.00/MFast $30.00/M
$25.00/MFast $150.00/M
Read:$0.5/M
Write:
$6.25/M
$10/K
+ input costs
—
+4
anthropic logo
bedrock logo
vertexAnthropic logo
04/16/2026
1M
0.9s
75tps
$3.00/M$15.00/M
Read:$0.3/M
Write:
$3.75/M
$10/K
+ input costs
—
+4
anthropic logo
bedrock logo
vertexAnthropic logo
02/17/2026
1M
1.3s
55tps
$5.00/MFast $30.00/M
$25.00/MFast $150.00/M
Read:$0.5/M
Write:
$6.25/M
$10/K
+ input costs
—
+4
anthropic logo
bedrock logo
vertexAnthropic logo
02/05/2026
200K
0.9s
137tps
$1.00/M$5.00/M
Read:$0.1/M
Write:
$1.25/M
$10.00/K
+ input costs
—
+4
anthropic logo
bedrock logo
vertexAnthropic logo
10/15/2025

About Claude 3.5 Haiku

Before Claude 3.5 Haiku arrived on November 4, 2024, the Haiku tier was the budget option: fast and cheap, but a step down in capability. Claude 3.5 Haiku broke that assumption. Anthropic announced it matched Claude 3 Opus on many intelligence evaluations while running at speeds comparable to the original Claude 3 Haiku. The result: Opus-class reasoning at Haiku-class latency.

The SWE-bench Verified result crystallized this. At 40.6%, Claude 3.5 Haiku outperformed agents built on the original Claude 3.5 Sonnet and other comparable models at the time on real-world software engineering tasks: fixing bugs, implementing features, and navigating codebases. For a model designed for speed and low cost, posting that score on a benchmark requiring sustained multi-file reasoning was a landmark.

Anthropic designed Claude 3.5 Haiku for three deployment patterns: latency-sensitive user-facing products, specialized sub-agent roles inside agentic pipelines, and personalization engines processing large volumes of per-user data such as purchase histories, pricing records, and inventory feeds. Improved instruction following and more accurate tool use make it practical for the structured, schema-bound work that sub-agents handle in production systems.

Live rates are shown in the pricing panel on this page.

What To Consider When Choosing a Provider

  • Configuration: For sub-agent architectures where Claude 3.5 Haiku handles a high volume of narrow tasks, AI Gateway per-request cost tracking helps attribute spend to individual workflow branches.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

When to Use Claude 3.5 Haiku

Best For

  • Coding assistance at scale: The 40.6% SWE-bench score brings real engineering capability without Sonnet-tier cost
  • Sub-agent pipelines: Fast enough for high-frequency subtask execution with strong instruction compliance
  • Latency-critical interfaces: Autocomplete, inline suggestions, and live enrichment where response time affects user experience
  • Data-heavy personalization: Processes per-user records and generates tailored outputs in real time
  • High-volume tool calling: Accurate function invocation at speed drives automated workflows

Consider Alternatives When

  • Computer use capability: That feature shipped with Claude 3.5 Sonnet, not Haiku
  • Extended or adaptive thinking: Modes for harder reasoning tasks are unavailable on this model
  • Vision-heavy workloads: Sonnet or Opus variants score higher on image-centric benchmarks

Conclusion

Claude 3.5 Haiku raised the capability ceiling for the fast tier of Anthropic's model family. Opus-level intelligence running at Haiku speeds opens coding, agentic, and personalization workloads that previously required a larger, slower, more expensive model. The release stays relevant for any team optimizing the cost-capability tradeoff.