
voyage-3-large

voyage-3-large is Voyage AI's general-purpose embedding model with a 32K-token context window, Matryoshka output dimensions (2048/1024/512/256), and quantization-aware training. It outperforms OpenAI text-embedding-3-large by 9.74% across 100 retrieval datasets.

index.ts
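import { embed } from 'ai';

// Embed a single string; the model slug routes the request to voyage-3-large.
const result = await embed({
  model: 'voyage/voyage-3-large',
  value: 'Sunny day at the beach',
});
// result.embedding holds the vector as an array of numbers.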
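Once you have embedding vectors, retrieval usually comes down to comparing them. The following is a minimal sketch of ranking documents by cosine similarity. The function name, the document ids, and the 3-dimensional toy vectors are illustrative assumptions standing in for real embeddings, not output from the model.

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors (hypothetical values) standing in for real embeddings.
const queryVector = [1, 0, 0];
const documents = [
  { id: 'b', embedding: [0, 1, 0] },
  { id: 'a', embedding: [0.9, 0.1, 0] },
];

// Score every document against the query and sort best-first.
const ranked = documents
  .map((doc) => ({ id: doc.id, score: cosineSimilarity(queryVector, doc.embedding) }))
  .sort((x, y) => y.score - x.score);
```

In a real pipeline the vectors would come from the embedding call, and a vector database would normally perform this ranking for you.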

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

| Provider | Input | Output | Cache | Release Date |
| Voyage AI | $0.18/M | - | - | 01/07/2025 |
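To put the listed input price in context, here is a rough sketch of estimating what it costs to embed a corpus at $0.18 per million tokens. The function name and the corpus size (50,000 documents averaging 800 tokens) are hypothetical values chosen for illustration. Actual token counts depend on the tokenizer.

```typescript
// Input price listed for voyage-3-large on this page, in USD per million tokens.
const PRICE_PER_MILLION_TOKENS_USD = 0.18;

// Estimate the one-time cost of embedding a corpus of the given token count.
function estimateEmbeddingCostUsd(
  totalTokens: number,
  pricePerMillion: number = PRICE_PER_MILLION_TOKENS_USD,
): number {
  if (totalTokens < 0) {
    throw new Error('totalTokens must be non-negative');
  }
  return (totalTokens / 1_000_000) * pricePerMillion;
}

// Hypothetical corpus: 50,000 documents averaging 800 tokens each.
const corpusTokens = 50_000 * 800; // 40,000,000 tokens
const corpusCostUsd = estimateEmbeddingCostUsd(corpusTokens);
console.log(`Estimated cost: $${corpusCostUsd.toFixed(2)}`);
```

Re-embedding after a dimension or model change incurs the same cost again, which is worth budgeting for before indexing.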

More models by Voyage AI

| Model | Context | Input | Output | Cache | Providers | Release Date |
|  | 32K | $0.06/M | - | - | Voyage AI | 01/15/2026 |
|  | 32K | $0.02/M | - | - | Voyage AI | 01/15/2026 |
|  | 32K | $0.12/M | - | - | Voyage AI | 01/15/2026 |
|  | 32K | $0.05/M | - | - | Voyage AI | 08/11/2025 |
|  | 32K | $0.02/M | - | - | Voyage AI | 08/11/2025 |
|  |  | $0.06/M | - | - | Voyage AI | 05/20/2025 |

About voyage-3-large

voyage-3-large is Voyage AI's general-purpose embedding model, released January 7, 2025. It supports a 32K-token context window and, through Matryoshka learning, produces embeddings at four sizes: 2048, 1024, 512, and 256 dimensions. You can tune the tradeoff between retrieval accuracy and vector storage cost without retraining or running multiple models. Across 100 datasets spanning eight domains, voyage-3-large outperforms OpenAI text-embedding-3-large by 9.74% and Cohere Embed v3 English by 20.71%.
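With Matryoshka-trained embeddings, the leading components of a vector are intended to be usable as a smaller embedding on their own. The sketch below shows one common way to apply that on the client: keep the first `dim` components and re-normalize to unit length. It assumes you already hold a full-size vector. The function name is illustrative, and this page does not say whether AI Gateway exposes an option to request a smaller size directly.

```typescript
// Keep the first `dim` components of an embedding and rescale to unit length,
// so cosine and dot-product comparisons stay well-behaved after truncation.
function truncateEmbedding(full: number[], dim: number): number[] {
  if (!Number.isInteger(dim) || dim <= 0 || dim > full.length) {
    throw new Error(`dim must be an integer between 1 and ${full.length}`);
  }
  const head = full.slice(0, dim);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // Leave an all-zero prefix untouched rather than dividing by zero.
  return norm === 0 ? head : head.map((x) => x / norm);
}

// Toy 4-dimensional vector (hypothetical values) truncated to 2 dimensions.
const toyFull = [3, 4, 12, 0];
const toyShort = truncateEmbedding(toyFull, 2); // direction of [3, 4], unit length
```

Every vector in an index, and every query against it, has to use the same truncated size for the comparison to make sense.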

Quantization-aware training enables multiple precision formats: 32-bit float, signed and unsigned 8-bit integer, and binary. Binary 512-dimensional embeddings outperform OpenAI text-embedding-3-large at full 3072-dimensional float precision while requiring 200x less storage. Int8 precision at 1024 dimensions loses only 0.31% quality versus full-precision 2048-dimensional output, cutting storage by 8x. These options make voyage-3-large practical for large-scale production indices.
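The storage ratios quoted above follow from simple per-vector arithmetic. This sketch reproduces them, assuming raw vector bytes only and ignoring any index or metadata overhead a vector database adds. The type and function names are illustrative.

```typescript
type Precision = 'float32' | 'int8' | 'uint8' | 'binary';

// Bits used to store one dimension at each precision.
const BITS_PER_DIMENSION: Record<Precision, number> = {
  float32: 32,
  int8: 8,
  uint8: 8,
  binary: 1,
};

// Raw bytes needed to store a single embedding vector.
function bytesPerVector(dims: number, precision: Precision): number {
  return (dims * BITS_PER_DIMENSION[precision]) / 8;
}

// Binary 512-dim versus 3072-dim float32 (the OpenAI comparison in the text).
const openAiFloatBytes = bytesPerVector(3072, 'float32'); // 12288 bytes
const binary512Bytes = bytesPerVector(512, 'binary'); // 64 bytes
const binaryRatio = openAiFloatBytes / binary512Bytes; // 192

// Int8 1024-dim versus float32 2048-dim.
const float2048Bytes = bytesPerVector(2048, 'float32'); // 8192 bytes
const int8At1024Bytes = bytesPerVector(1024, 'int8'); // 1024 bytes
const int8Ratio = float2048Bytes / int8At1024Bytes; // 8
```

The first ratio works out to 192x, which the text rounds to 200x. The second matches the stated 8x exactly.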

Voyage AI evaluates voyage-3-large across technical documentation, code, legal, financial, web, multilingual, long-document, and conversational domains. It outperforms Voyage AI's own domain-specific models on legal and financial retrieval tasks. That makes it a practical single-model choice when you need broad domain coverage without managing multiple specialized embedding endpoints.

What To Consider When Choosing a Provider

  • Configuration: voyage-3-large offers four embedding dimensions and multiple quantization levels. Start with 1024-dimensional int8 embeddings for most production workloads. This configuration loses only 0.31% quality versus full precision at 2048 dimensions while using 8x less storage. Drop to 512 or 256 dimensions only when storage constraints are severe.
  • Configuration: Confirm your vector database supports the embedding dimension and precision format you pick before indexing. Switching dimensions after indexing requires re-embedding your entire corpus.
  • Configuration: voyage-3-large outperforms Voyage AI's own domain-specific models on legal and financial tasks. If your retrieval spans multiple domains, use voyage-3-large instead of running separate specialized models.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
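Because switching dimensions after indexing means re-embedding the corpus, it can help to fail fast when a vector does not match the index it is headed for. The sketch below is one way to do that. `IndexConfig` and `findConfigProblems` are hypothetical names that do not come from any particular vector database, and only float32 and int8 are covered.

```typescript
// Hypothetical description of a vector index; real databases expose this differently.
interface IndexConfig {
  dimension: number;
  precision: 'float32' | 'int8';
}

// Output sizes listed for voyage-3-large on this page.
const SUPPORTED_DIMENSIONS = [2048, 1024, 512, 256];

// Return a list of human-readable problems; an empty list means the vector fits.
function findConfigProblems(embedding: number[], index: IndexConfig): string[] {
  const problems: string[] = [];
  if (!SUPPORTED_DIMENSIONS.includes(index.dimension)) {
    problems.push(`Index dimension ${index.dimension} is not one of ${SUPPORTED_DIMENSIONS.join(', ')}`);
  }
  if (embedding.length !== index.dimension) {
    problems.push(`Embedding has ${embedding.length} dimensions, index expects ${index.dimension}`);
  }
  if (
    index.precision === 'int8' &&
    !embedding.every((x) => Number.isInteger(x) && x >= -128 && x <= 127)
  ) {
    problems.push('Embedding contains values outside the signed 8-bit integer range');
  }
  return problems;
}

// Example: a 512-dimensional vector sent to a 1024-dimensional index.
const mismatch = findConfigProblems(
  Array.from({ length: 512 }, () => 0),
  { dimension: 1024, precision: 'float32' },
);
```

Running a check like this on the first batch before a bulk indexing job costs little compared with discovering the mismatch afterwards.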

When to Use voyage-3-large

Best For

  • General-purpose semantic search: One embedding model covers technical, legal, financial, and conversational content
  • RAG pipelines: Large corpora where the 32K-token context window fits longer documents and retrieval chunks without truncation
  • Storage-sensitive deployments: Matryoshka dimensionality and quantization shrink vector storage, by up to 200x when binary 512-dimensional embeddings replace 3072-dimensional float embeddings
  • Multilingual retrieval: Applications that need a single embedding index across languages
  • High-accuracy retrieval: The 9.74% improvement over OpenAI text-embedding-3-large is measured across 100 retrieval datasets, so confirm the gain on your own corpus before relying on it
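Binary embeddings save space because each dimension takes one bit, and they are typically compared with Hamming distance instead of cosine similarity. The sketch below illustrates the idea by packing sign bits on the client. That is an assumption made for demonstration only: the model's own binary output comes from quantization-aware training, and this page does not describe how to request it.

```typescript
// Pack one bit per dimension (1 if the value is positive) into bytes, most significant bit first.
function packSignBits(vector: number[]): Uint8Array {
  const packed = new Uint8Array(Math.ceil(vector.length / 8));
  vector.forEach((value, i) => {
    if (value > 0) {
      packed[i >> 3] |= 1 << (7 - (i & 7));
    }
  });
  return packed;
}

// Count the differing bits between two packed vectors of equal length.
function hammingDistance(a: Uint8Array, b: Uint8Array): number {
  if (a.length !== b.length) {
    throw new Error(`Length mismatch: ${a.length} vs ${b.length}`);
  }
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    let diff = a[i] ^ b[i];
    // Clear the lowest set bit until none remain.
    while (diff !== 0) {
      diff &= diff - 1;
      distance++;
    }
  }
  return distance;
}

// Toy 8-dimensional vectors (hypothetical values).
const allPositive = packSignBits([1, 1, 1, 1, 1, 1, 1, 1]);
const alternating = packSignBits([1, -1, 1, -1, 1, -1, 1, -1]);
const toyDistance = hammingDistance(allPositive, alternating);
```

A 512-dimensional vector packed this way occupies 64 bytes, and a smaller Hamming distance indicates more similar vectors.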

Consider Alternatives When

  • Cost is the primary constraint: voyage-3.5-lite offers high retrieval quality at a lower price point when accuracy requirements are moderate
  • Your corpus is exclusively source code: voyage-code-3 is purpose-built for that domain
  • You need multimodal embeddings: Cohere Embed v4 supports images, screenshots, and interleaved content natively
  • Your workload is latency-sensitive: A lighter model reduces per-query response time at very high query volumes

Conclusion

voyage-3-large works well as a default for teams that need one high-accuracy embedding model across diverse retrieval domains. Its Matryoshka dimensionality and quantization options give you practical cost controls at scale. The 32K-token context window lets you embed longer documents in larger chunks. Route it through AI Gateway for unified API access, usage tracking, and provider flexibility.