
Claude 3 Haiku

Claude 3 Haiku handles enterprise document workloads at a fraction of Opus-tier cost, serving as the speed-and-affordability anchor of the Claude 3 family.

Tool Use, Vision (Image), Explicit Caching
index.ts
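import { streamText } from 'ai'

// Stream a response from Claude 3 Haiku through AI Gateway.
const result = streamText({
  model: 'anthropic/claude-3-haiku',
  prompt: 'Why is the sky blue?',
})

// Print each text chunk as it arrives.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}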

Playground

Try out Claude 3 Haiku by Anthropic. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.


Ask Claude 3 Haiku anything to try it out.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider | Context | Latency | Throughput | Input | Output | Cache read | Cache write | Web search (per query) | Capabilities | Release date
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Amazon Bedrock | 200K | 0.3s | 88tps | $0.25/M | $1.25/M | $0.03/M | $0.3/M | - | +1 | 03/13/2024
Google Vertex AI | 200K | 0.4s | 157tps | $0.25/M | $1.25/M | $0.03/M | $0.3/M | - | +1 | 03/13/2024
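
As a rough illustration of setting a provider preference, the sketch below builds an ordered preference list. The slugs `bedrock` and `vertex`, the helper name `preferProviders`, and the `gateway.order` option shape are assumptions made for this example; copy the actual slugs from the table and confirm the option name in the AI Gateway docs.

```typescript
// Hypothetical helper: builds a provider-preference object.
// The { gateway: { order } } shape is an assumption, not confirmed by this page.
type GatewayRouting = { gateway: { order: string[] } };

function preferProviders(...slugs: string[]): GatewayRouting {
  if (slugs.length === 0) {
    throw new Error('at least one provider slug is required');
  }
  // Drop repeated slugs while keeping the first occurrence of each.
  const order = slugs.filter((slug, i) => slugs.indexOf(slug) === i);
  return { gateway: { order } };
}

// Placeholder slugs: try Bedrock first, then Vertex.
const providerOptions = preferProviders('bedrock', 'vertex');
```

Under these assumptions, the resulting object would be passed as the `providerOptions` field of the `streamText` call shown in index.ts.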
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.
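
To make the three metrics concrete, here is a minimal sketch of how a P50 and a success rate could be computed from raw samples. The function names and the median definition (averaging the two middle values for even-sized samples) are assumptions; the page does not say which percentile method AI Gateway uses.

```typescript
// P50 (median) of a list of samples, e.g. TTFT in ms or throughput in tokens per second.
function p50(samples: number[]): number {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = samples.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  // Odd count: the middle value. Even count: mean of the two middle values.
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Fraction of requests that succeeded, in the range 0..1.
function successRate(succeeded: number, total: number): number {
  if (total <= 0) throw new Error('total must be positive');
  return succeeded / total;
}
```

For example, `p50([300, 400, 350])` returns 350, so a few slow outliers do not move the reported figure the way an average would.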

More models by Anthropic

Context | Latency | Throughput | Input | Output | Cache read | Cache write | Web search (per query) | Capabilities | Providers | Release date
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
1M | 0.7s | 112tps | $3.00/M, $2.00/M | $15.00/M, $10.00/M | $0.3/M, $0.2/M | $3.75/M, $2.5/M | $10/K + input costs | +4 | Anthropic, Bedrock, Vertex | 06/29/2026
1M | 2.4s | 113tps | $5.00/M (Fast: $10.00/M) | $25.00/M (Fast: $50.00/M) | $0.5/M | $6.25/M | $10/K + input costs | +4 | Anthropic, Bedrock, Vertex | 05/28/2026
1M | 0.8s | 107tps | $5.00/M (Fast: $30.00/M) | $25.00/M (Fast: $150.00/M) | $0.5/M | $6.25/M | $10/K + input costs | +4 | Anthropic, Bedrock, Vertex | 04/16/2026
1M | 0.9s | 75tps | $3.00/M | $15.00/M | $0.3/M | $3.75/M | $10/K + input costs | +4 | Anthropic, Bedrock, Vertex | 02/17/2026
1M | 1.3s | 55tps | $5.00/M (Fast: $30.00/M) | $25.00/M (Fast: $150.00/M) | $0.5/M | $6.25/M | $10/K + input costs | +4 | Anthropic, Bedrock, Vertex | 02/05/2026
200K | 0.5s | 140tps | $1.00/M | $5.00/M | $0.1/M | $1.25/M | $10.00/K + input costs | +4 | Anthropic, Bedrock, Vertex | 10/15/2025

About Claude 3 Haiku

Claude 3 Haiku launched on March 13, 2024 as the fastest, lowest-cost model in the Claude 3 family. Anthropic positioned it alongside Sonnet (mid-tier) and Opus (highest capability). Claude 3 Haiku targets workloads where throughput and cost per token outweigh peak reasoning depth.

Speed defined the release. Anthropic described Claude 3 Haiku as substantially faster than peer models in its performance tier for the majority of workloads, with throughput dropping on prompts that exceed 32K tokens. See live throughput on this page for current numbers.

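Anthropic priced Claude 3 Haiku with a 1:5 input-to-output token ratio at launch: output tokens cost five times as much as input tokens, which matches the $0.25/M input and $1.25/M output rates listed on this page.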
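
The pricing arithmetic can be sketched as follows, using the $0.25/M input and $1.25/M output rates listed in the provider table on this page. The function name and the sample token counts are illustrative, and the sketch ignores cache read and write pricing.

```typescript
// Rates from the provider table on this page, in USD per million tokens.
const INPUT_USD_PER_MTOK = 0.25;
const OUTPUT_USD_PER_MTOK = 1.25;

// Estimated cost of one request, excluding cache reads and writes.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens * INPUT_USD_PER_MTOK + outputTokens * OUTPUT_USD_PER_MTOK) /
    1e6
  );
}

// Hypothetical document-analysis request: a long prompt and a short answer.
const cost = estimateCostUsd(10000, 500); // 0.003125 USD
```

Because output tokens cost five times as much as input tokens, prompt-heavy workloads such as document analysis are dominated by the cheaper input rate.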

Claude 3 Haiku shares its vision capabilities with Sonnet and Opus. It accepts photos, charts, graphs, and technical diagrams as input. Anthropic highlighted document analysis (quarterly filings, contracts, legal cases), responsive customer support chat, and large-dataset annotation as primary use cases. Claude 3 Haiku launched on the Claude API and Amazon Bedrock, with Google Cloud Vertex AI following shortly after.
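
As a sketch of what image input might look like, the helper below builds a user message that pairs a question with image bytes. The local types mirror the text-and-image content-part shape used by the AI SDK, but they are declared here for illustration, and `chartQuestion` is a hypothetical name.

```typescript
// Local stand-ins for the message shape; declared here so the sketch is self-contained.
type TextPart = { type: 'text'; text: string };
type ImagePart = { type: 'image'; image: Uint8Array };
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> };

// Pair a question with the raw bytes of a chart, photo, or diagram.
function chartQuestion(question: string, imageBytes: Uint8Array): UserMessage {
  if (imageBytes.length === 0) throw new Error('image is empty');
  return {
    role: 'user',
    content: [
      { type: 'text', text: question },
      { type: 'image', image: imageBytes },
    ],
  };
}
```

Under that assumption, the message would go in the `messages` array of a `streamText` call in place of the `prompt` string used in index.ts.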

What To Consider When Choosing a Provider

  • Configuration: Claude 3 Haiku keeps high-volume annotation and chat pipelines economical. Use AI Gateway per-request cost tracking to compare actual spend against Sonnet or newer Haiku generations. See live rates in the pricing panel on this page.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
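
The authentication choice above can be sketched as a small credential lookup. The environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN`, the helper name, and the precedence (API key first) are assumptions for illustration; check the AI Gateway documentation for the names your deployment uses.

```typescript
type Env = Record<string, string | undefined>;
type Credential = { kind: 'api-key' | 'oidc'; token: string };

// Pick whichever gateway credential is present. Variable names are assumed.
function resolveGatewayCredential(env: Env): Credential {
  const apiKey = env['AI_GATEWAY_API_KEY'];
  if (apiKey) return { kind: 'api-key', token: apiKey };

  const oidcToken = env['VERCEL_OIDC_TOKEN'];
  if (oidcToken) return { kind: 'oidc', token: oidcToken };

  throw new Error('no AI Gateway credential found');
}
```

In a Node.js process this would typically be called as `resolveGatewayCredential(process.env)`; either way, no provider-specific credentials are involved.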

When to Use Claude 3 Haiku

Best For

  • High-volume document analysis: Processing contracts, filings, and legal cases where throughput drives pipeline feasibility
  • Customer support chat: Requiring fast response times and consistent instruction following across long conversation histories
  • Image and chart annotation at scale: Using shared Claude 3 vision capabilities at the lowest per-image cost in the family
  • Data extraction and labeling pipelines: Running workloads where cost per token is the binding constraint and fast-tier reasoning suffices
  • Enterprise content moderation: Screening large queues of text and images with fast turnaround

Consider Alternatives When

  • Deep multi-step reasoning: Sonnet and Opus handle complex code generation and multi-step reasoning better
  • Extended thinking or computer use: Those capabilities arrived in later model generations
  • Long-context prompts: Haiku's throughput drops on prompts exceeding 32K tokens
  • Claude 3.5-era improvements: Claude 3.5 Haiku matches Claude 3 Opus on many benchmarks at a speed similar to Claude 3 Haiku
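
One way to act on the long-context caveat is a pre-flight check on prompt size. This is a rough sketch: the four-characters-per-token estimate is a common heuristic rather than the model's tokenizer, and reading "32K" as 32,000 tokens is an assumption.

```typescript
// Assumed reading of the "32K tokens" threshold mentioned above.
const LONG_PROMPT_TOKENS = 32000;

// Crude token estimate: about 4 characters per token for English text (heuristic only).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// True when the prompt likely falls in the range where throughput drops.
function isLongPrompt(text: string): boolean {
  return estimateTokens(text) > LONG_PROMPT_TOKENS;
}
```

A pipeline could use `isLongPrompt` to split large documents into chunks or to route them to a different model before sending the request.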

Conclusion

Claude 3 Haiku occupies the speed-and-cost floor of the Claude 3 generation. It remains a practical choice for teams running high-volume, latency-sensitive pipelines where the task complexity fits within a fast-tier model's capability range. Later Haiku generations raised the capability ceiling significantly, but the original Claude 3 Haiku still serves workloads optimized for raw throughput.