What made Claude 3.5 Haiku historically significant within Anthropic's lineup?

It was the first Haiku-class model to reach Opus-level benchmark performance. Before 3.5 Haiku, the fast tier meant a clear capability step-down. This model proved that post-training techniques could close the gap between the smallest and largest models in a generation.

How does the 40.6% SWE-bench Verified score translate to real-world coding work?

SWE-bench Verified tests a model's ability to read natural language bug reports and feature requests, navigate real open-source codebases, and produce working patches. A 40.6% score means the model successfully resolved that proportion of real software engineering tasks drawn from production repositories.

Was the December 2024 pricing change an increase or decrease from launch?

Rates are listed on this page. They reflect the providers routing through AI Gateway and shift when providers update their pricing.

What kinds of sub-agent tasks is 3.5 Haiku well suited for?

Narrow, well-defined tasks within a larger pipeline: data extraction, format conversion, classification, schema validation, and structured tool invocation. Low latency and strong instruction following make it effective for the high-frequency, low-ambiguity work that sub-agents typically handle.

Does Claude 3.5 Haiku handle multilingual inputs?

The model processes multilingual text, though its primary optimization targets are English-language tasks. For workloads where multilingual instruction following is the central requirement, evaluate against benchmark results for your specific languages.

What is the practical difference between 3.5 Haiku and later Haiku generations for new projects?

Claude 3.5 Haiku is a Claude 3.5 generation model. Later Haiku releases build on newer base architectures with different capability profiles. For new projects, compare the specific benchmarks and feature sets relevant to your workload rather than assuming the newer generation is universally better.

Claude 3.5 Haiku

Claude 3.5 Haiku was the model that first proved a Haiku-tier release could match Opus-class performance, scoring strong results on SWE-bench Verified and redefining expectations for what fast, affordable models could accomplish on real engineering tasks.

File InputTool UseVision (Image)Explicit Caching

index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'anthropic/claude-3.5-haiku',
  prompt: 'Why is the sky blue?'
})

Overview Playground About Providers Throughput Latency Uptime Status Similar FAQ

Playground

Try out Claude 3.5 Haiku by Anthropic. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider

Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	ZDR	No Training	Release Date

Legal:Terms

•

Privacy

200K

0.5s

51tps

$0.80/M

$4.00/M

Read:$0.08/M

Write:

$1/M

—

11/06/2023

Legal:Terms

•

Privacy

200K

0.5s

54tps

$0.80/M

$4.00/M

Read:$0.08/M

Write:

$1/M

$10.00/K

+ input costs

—

11/06/2023

More models by Anthropic

Model

Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	Providers	ZDR	No Training	Release Date

0.6s

95tps

$5.00/M

$25.00/M

Read:$0.5/M

Write:

$6.25/M

$10/K

+ input costs

—

04/16/2026

0.6s

54tps

$3.00/M

$15.00/M

Read:$0.3/M

Write:

$3.75/M

$10/K

+ input costs

—

02/17/2026

0.9s

54tps

$5.00/M

$25.00/M

Read:$0.5/M

Write:

$6.25/M

$10/K

+ input costs

—

02/05/2026

200K

0.4s

111tps

$1.00/M

$5.00/M

Read:$0.1/M

Write:

$1.25/M

$10.00/K

+ input costs

—

10/15/2025

0.7s

57tps

$3.00/M

$15.00/M

Read:

$0.3/M

Write:

$3.75/M

$10.00/K

+ input costs

—

09/29/2025

200K

0.6s

48tps

$5.00/M

$25.00/M

Read:$0.5/M

Write:

$6.25/M

$10.00/K

+ input costs

—

11/24/2024

About Claude 3.5 Haiku

Before Claude 3.5 Haiku arrived on November 6, 2023, the Haiku tier was the budget option: fast and cheap, but clearly a step down in capability. Claude 3.5 Haiku broke that assumption. Anthropic announced it matched Claude 3 Opus on many intelligence evaluations while running at speeds comparable to the original Claude 3 Haiku. The result: Opus-class reasoning at Haiku-class latency.

The SWE-bench Verified result crystallized this. At 40.6%, Claude 3.5 Haiku outperformed agents built on the original Claude 3.5 Sonnet and other comparable models at the time on real-world software engineering tasks: fixing bugs, implementing features, and navigating codebases. For a model designed for speed and low cost, posting that score on a benchmark requiring sustained multi-file reasoning was a landmark.

Anthropic designed the model for three deployment patterns: latency-sensitive user-facing products, specialized sub-agent roles inside agentic pipelines, and personalization engines processing large volumes of per-user data (purchase histories, pricing records, inventory feeds). Improved instruction following and more accurate tool use make it practical for the structured, schema-bound work that sub-agents handle in production systems.

The December 2024 pricing revision adjusted rates to $0.8 per million input tokens and $4.0 per million output tokens. If you modeled costs around launch pricing, verify against current rates.

What To Consider When Choosing a Provider

Configuration: For sub-agent architectures where Haiku handles a high volume of narrow tasks, AI Gateway's per-request cost tracking helps attribute spend to individual workflow branches.
Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

When to Use Claude 3.5 Haiku

Best For

Coding assistance at scale: The 40.6% SWE-bench score means real engineering capability without Sonnet-tier cost
Sub-agent pipelines: Need a model fast enough for high-frequency subtask execution with strong instruction compliance
Latency-critical interfaces: Autocomplete, inline suggestions, live enrichment, where every millisecond of response time affects user experience
Data-heavy personalization: Processes per-user records and generates tailored outputs in real time
High-volume tool calling: Accurate function invocation at speed drives automated workflows

Consider Alternatives When

Computer use capability: Feature shipped with Claude 3.5 Sonnet, not Haiku
Extended or adaptive thinking: Modes that benefit harder reasoning tasks are unavailable on this model
Vision-heavy workloads: Sonnet or Opus variants score higher on image-centric benchmarks

Conclusion

Claude 3.5 Haiku raised the capability ceiling for the fast tier of Anthropic's model family. Opus-level intelligence running at Haiku speeds opens coding, agentic, and personalization workloads that previously demanded a larger, slower, more expensive model. The release remains relevant for any team optimizing the cost-capability tradeoff.

Frequently Asked Questions

What made Claude 3.5 Haiku historically significant within Anthropic's lineup?
It was the first Haiku-class model to reach Opus-level benchmark performance. Before 3.5 Haiku, the fast tier meant a clear capability step-down. This model proved that post-training techniques could close the gap between the smallest and largest models in a generation.
How does the 40.6% SWE-bench Verified score translate to real-world coding work?
SWE-bench Verified tests a model's ability to read natural language bug reports and feature requests, navigate real open-source codebases, and produce working patches. A 40.6% score means the model successfully resolved that proportion of real software engineering tasks drawn from production repositories.
Was the December 2024 pricing change an increase or decrease from launch?
Rates are listed on this page. They reflect the providers routing through AI Gateway and shift when providers update their pricing.
What kinds of sub-agent tasks is 3.5 Haiku well suited for?
Narrow, well-defined tasks within a larger pipeline: data extraction, format conversion, classification, schema validation, and structured tool invocation. Low latency and strong instruction following make it effective for the high-frequency, low-ambiguity work that sub-agents typically handle.
Does Claude 3.5 Haiku handle multilingual inputs?
The model processes multilingual text, though its primary optimization targets are English-language tasks. For workloads where multilingual instruction following is the central requirement, evaluate against benchmark results for your specific languages.
What is the practical difference between 3.5 Haiku and later Haiku generations for new projects?
Claude 3.5 Haiku is a Claude 3.5 generation model. Later Haiku releases build on newer base architectures with different capability profiles. For new projects, compare the specific benchmarks and feature sets relevant to your workload rather than assuming the newer generation is universally better.

AI Cloud

Core Platform

Security

Company

Learn

Open Source

Use Cases

Tools

Users

Claude 3.5 Haiku

Playground

Providers

More models by Anthropic

About Claude 3.5 Haiku

What To Consider When Choosing a Provider

When to Use Claude 3.5 Haiku

Best For

Consider Alternatives When

Conclusion

Frequently Asked Questions