
Claude 3.5 Haiku

anthropic/claude-3.5-haiku

Claude 3.5 Haiku was the first Haiku-tier release to match Opus-class performance, posting strong results on SWE-bench Verified and redefining expectations for what fast, affordable models can accomplish on real engineering tasks.

Supported features: File Input, Tool Use, Vision (Image), Explicit Caching
index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'anthropic/claude-3.5-haiku',
  prompt: 'Why is the sky blue?',
})

// Print tokens to stdout as they stream in
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

For sub-agent architectures where Haiku handles a high volume of narrow tasks, AI Gateway's per-request cost tracking helps attribute spend to individual workflow branches.

When to Use Claude 3.5 Haiku

Best For

  • Coding assistance at scale:

    Its 40.6% SWE-bench Verified score means real engineering capability without Sonnet-tier cost

  • Sub-agent pipelines:

    Fast enough for high-frequency subtask execution, with strong instruction compliance

  • Latency-critical interfaces:

    Autocomplete, inline suggestions, and live enrichment, where every millisecond of response time affects user experience

  • Data-heavy personalization:

    Processes per-user records and generates tailored outputs in real time

  • High-volume tool calling:

    Fast, accurate function invocation keeps automated workflows moving

Consider Alternatives When

  • Computer use capability:

    Feature shipped with Claude 3.5 Sonnet, not Haiku

  • Extended or adaptive thinking:

    Modes that benefit harder reasoning tasks are unavailable on this model

  • Vision-heavy workloads:

    Sonnet or Opus variants score higher on image-centric benchmarks

Conclusion

Claude 3.5 Haiku raised the capability ceiling for the fast tier of Anthropic's model family. Opus-level intelligence running at Haiku speeds opens coding, agentic, and personalization workloads that previously demanded a larger, slower, more expensive model. The release remains relevant for any team optimizing the cost-capability tradeoff.

FAQ

Why was Claude 3.5 Haiku significant?

It was the first Haiku-class model to reach Opus-level benchmark performance. Before 3.5 Haiku, the fast tier meant a clear capability step-down. This model proved that post-training techniques could close the gap between the smallest and largest models in a generation.

What does the SWE-bench Verified score measure?

SWE-bench Verified tests a model's ability to read natural-language bug reports and feature requests, navigate real open-source codebases, and produce working patches. A 40.6% score means the model successfully resolved that proportion of real software engineering tasks drawn from production repositories.

How is pricing determined?

Rates are listed on this page. They reflect the providers routing this model through AI Gateway and shift whenever those providers update their pricing.

What tasks suit Claude 3.5 Haiku in a sub-agent role?

Narrow, well-defined tasks within a larger pipeline: data extraction, format conversion, classification, schema validation, and structured tool invocation. Low latency and strong instruction following make it effective for the high-frequency, low-ambiguity work that sub-agents typically handle.

How well does it handle multilingual workloads?

The model processes multilingual text, though its primary optimization targets are English-language tasks. For workloads where multilingual instruction following is the central requirement, evaluate against benchmark results for your specific languages.

Is Claude 3.5 Haiku the most recent Haiku model?

Claude 3.5 Haiku is a Claude 3.5 generation model. Later Haiku releases build on newer base architectures with different capability profiles. For new projects, compare the specific benchmarks and feature sets relevant to your workload rather than assuming the newer generation is universally better.