
voyage-3-large

voyage/voyage-3-large

voyage-3-large is Voyage AI's general-purpose embedding model with a context window of 32,000 tokens, Matryoshka dimensionality (2048/1024/512/256), and quantization-aware training. It outperforms OpenAI text-embedding-3-large by 9.74% on average across 100 retrieval datasets.

index.ts
import { embed } from 'ai';

const result = await embed({
  model: 'voyage/voyage-3-large',
  value: 'Sunny day at the beach',
});
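The `embed` call resolves with the vector in `result.embedding`. To rank documents against a query, embeddings are typically compared with cosine similarity. A minimal standalone sketch (the `ai` package also exports a `cosineSimilarity` helper you can import instead):

```typescript
// Cosine similarity between two embedding vectors: dot product
// divided by the product of the vector norms. Returns a value in
// [-1, 1]; higher means more semantically similar.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

At query time you would embed the query once, then score it against your stored document vectors and sort by similarity.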

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

voyage-3-large offers four embedding dimensions and multiple quantization levels. Start with 1024-dimensional int8 embeddings for most production workloads. This configuration loses only 0.31% quality versus full precision at 2048 dimensions while using 8x less storage. Drop to 512 or 256 dimensions only when storage constraints are severe.
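The storage arithmetic behind that recommendation is simple: bytes per vector scale linearly with dimension and bits per value. A sketch (the function name is illustrative, and this counts raw vector bytes only; real vector databases add per-vector index overhead):

```typescript
// Raw storage per vector for a given dimension and precision format.
type PrecisionFormat = "float32" | "int8" | "binary";

function bytesPerVector(dimensions: number, format: PrecisionFormat): number {
  const bitsPerValue: Record<PrecisionFormat, number> = {
    float32: 32,
    int8: 8,
    binary: 1,
  };
  return (dimensions * bitsPerValue[format]) / 8;
}

const full = bytesPerVector(2048, "float32");     // 8192 bytes
const recommended = bytesPerVector(1024, "int8"); // 1024 bytes
// full / recommended === 8, the 8x savings cited above
```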

Confirm your vector database supports the embedding dimension and precision format you pick before indexing. Switching dimensions after indexing requires re-embedding your entire corpus.

voyage-3-large outperforms Voyage AI's own domain-specific models on legal and financial tasks. If your retrieval spans multiple domains, use voyage-3-large instead of running separate specialized models.

When to Use voyage-3-large

Best For

  • General-purpose semantic search:

    One embedding model covers technical, legal, financial, and conversational content

  • RAG pipelines:

    Large corpora where the 32,000-token context window fits longer documents and retrieval chunks without truncation

  • Storage-sensitive deployments:

    Matryoshka dimensionality and quantization reduce vector database costs by up to 200x with binary embeddings

  • Multilingual retrieval:

    Applications that need a single embedding index across languages

  • High-accuracy retrieval:

    The 9.74% improvement over OpenAI text-embedding-3-large translates to measurably better recall in production
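For the binary case, similarity search reduces to Hamming distance over bit-packed vectors, which is why the large storage savings come with little query-time cost. A minimal sketch, assuming each embedding arrives as one bit per dimension packed into bytes:

```typescript
// Hamming distance between two bit-packed binary embeddings:
// the number of dimensions where the sign bits differ.
// Lower distance means more similar.
function hammingDistance(a: Uint8Array, b: Uint8Array): number {
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    let xor = a[i] ^ b[i];
    while (xor !== 0) {
      xor &= xor - 1; // Kernighan's trick: clear the lowest set bit
      distance++;
    }
  }
  return distance;
}
```

A 512-dimensional binary embedding packs into 64 bytes, so a full scan over millions of vectors stays cheap; binary search results are often re-ranked with higher-precision embeddings for the top candidates.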

Consider Alternatives When

  • Cost is the primary constraint:

    Voyage-3.5-lite offers high retrieval quality at a lower price point when accuracy requirements are moderate

  • Your corpus is exclusively source code:

    Voyage-code-3 is purpose-built for that domain

  • You need multimodal embeddings:

    Cohere Embed v4 supports images, screenshots, and interleaved content natively

  • Your workload is latency-sensitive:

    A lighter model reduces per-query response time at very high query volumes

Conclusion

voyage-3-large works well as a default for teams that need one high-accuracy embedding model across diverse retrieval domains. Its Matryoshka dimensionality and quantization options give you practical cost controls at scale. The 32,000-token context window handles longer documents without chunking compromises. Route it through AI Gateway for unified API access, usage tracking, and provider flexibility.

FAQ

What embedding dimensions does voyage-3-large support?

2048, 1024, 512, and 256. Matryoshka learning ensures lower-dimensional embeddings retain most of the retrieval quality of full 2048-dimensional output. Start with 1024 for most production use cases.

How does voyage-3-large compare to OpenAI text-embedding-3-large?

voyage-3-large outperforms OpenAI text-embedding-3-large by 9.74% on average across 100 retrieval datasets spanning eight domains. Binary 512-dimensional voyage-3-large embeddings outperform OpenAI's full 3072-dimensional float embeddings while using 200x less storage.

What precision formats does voyage-3-large support?

32-bit float, signed 8-bit integer, unsigned 8-bit integer, and binary precision. Quantization-aware training means these reduced-precision formats lose minimal quality compared to full float precision.
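To make the int8 format concrete, here is a sketch of symmetric int8 quantization, which maps each float value into the signed 8-bit range. This is illustrative only: Voyage's quantization-aware training happens server-side, so in practice you request int8 output rather than quantizing client-side.

```typescript
// Symmetric int8 quantization: scale the vector so its largest
// absolute value maps to 127, then round each value to an integer.
function quantizeInt8(vector: number[]): { values: Int8Array; scale: number } {
  const scale = Math.max(...vector.map(Math.abs)) / 127 || 1;
  const values = Int8Array.from(vector, (v) => Math.round(v / scale));
  return { values, scale };
}

// Recover approximate floats from the int8 values and stored scale.
function dequantize(values: Int8Array, scale: number): number[] {
  return Array.from(values, (v) => v * scale);
}
```

The round trip introduces a small per-value error (bounded by half the scale), which is why well-trained int8 embeddings lose so little retrieval quality relative to float32.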

Do I still need domain-specific embedding models?

voyage-3-large outperforms Voyage AI's domain-specific models on legal and financial retrieval benchmarks. For most teams, it eliminates the need to manage separate domain-specific embedding endpoints.

What is voyage-3-large's context window?

32,000 tokens. This lets you embed longer documents and retrieval chunks without truncation, compared to the shorter context windows of many competing embedding models.

How do I use voyage-3-large through AI Gateway?

Add your Voyage AI API key in AI Gateway settings, then send embedding requests through AI Gateway using the standard embedding API format. AI Gateway authenticates requests, routes them, and records usage.

Can I change embedding dimensions after building an index?

No. Switching dimensions requires re-embedding and re-indexing your entire corpus. Choose your target dimension before building your production index. Start with 1024-dimensional int8 embeddings for a good accuracy-to-storage ratio.