voyage-3-large
voyage-3-large is Voyage AI's general-purpose embedding model with a context window of 32,000 tokens, Matryoshka embeddings at 2048, 1024, 512, or 256 dimensions, and quantization-aware training. Voyage AI reports that it outperforms OpenAI text-embedding-3-large by an average of 9.74% across 100 retrieval datasets.
import { embed } from 'ai';

const result = await embed({
  model: 'voyage/voyage-3-large',
  value: 'Sunny day at the beach',
});

Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
About voyage-3-large
voyage-3-large is Voyage AI's general-purpose embedding model, released January 7, 2025. It supports a context window of 32,000 tokens and, through Matryoshka learning, produces embeddings at four sizes: 2048, 1024, 512, and 256 dimensions. You can tune the tradeoff between retrieval accuracy and vector storage cost without retraining or running multiple models. Across 100 datasets spanning eight domains, voyage-3-large outperforms OpenAI text-embedding-3-large by an average of 9.74% and Cohere Embed v3 English by 20.71%.
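To make the Matryoshka tradeoff concrete, here is a minimal sketch of shortening an embedding on the client. It assumes the smaller sizes are prefixes of the full vector, which is how Matryoshka-style embeddings are generally used, and that vectors should be rescaled to unit length after truncation. The helper name `truncateEmbedding` is hypothetical and not part of any SDK, and the toy vector stands in for a real 2048-dimensional output.

```typescript
// Hypothetical helper: keep the first `dims` values of a Matryoshka-style
// embedding and rescale the result to unit length.
function truncateEmbedding(embedding: number[], dims: number): number[] {
  if (!Number.isInteger(dims) || dims <= 0 || dims > embedding.length) {
    throw new RangeError(`dims must be an integer in 1..${embedding.length}`);
  }
  const head = embedding.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // Leave an all-zero prefix untouched rather than dividing by zero.
  return norm === 0 ? head : head.map((x) => x / norm);
}

// Toy 4-dimensional vector standing in for a real 2048-dimensional embedding.
const fullEmbedding = [3, 4, 12, 0];
const shortEmbedding = truncateEmbedding(fullEmbedding, 2);
console.log(shortEmbedding); // [ 0.6, 0.8 ]
```

If the embedding endpoint lets you request a smaller size directly, that is likely the simpler route. Either way, every vector in one index must use the same size.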
Quantization-aware training enables multiple precision formats: 32-bit float, signed and unsigned 8-bit integer, and binary. Binary 512-dimensional embeddings outperform OpenAI text-embedding-3-large at full 3072-dimensional float precision while requiring 200x less storage. Int8 precision at 1024 dimensions loses only 0.31% quality versus full-precision 2048-dimensional output, cutting storage by 8x. These options make voyage-3-large practical for large-scale production indices.
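The storage ratios above follow from simple arithmetic on bits per value. The sketch below reproduces them; the `Precision` type and `bytesPerVector` helper are illustrative names, and the calculation covers raw vector bytes only, ignoring any index or metadata overhead in a real vector database.

```typescript
type Precision = 'float32' | 'int8' | 'uint8' | 'binary';

const BITS_PER_VALUE: Record<Precision, number> = {
  float32: 32,
  int8: 8,
  uint8: 8,
  binary: 1,
};

// Raw bytes needed to store one vector, ignoring index overhead.
function bytesPerVector(dims: number, precision: Precision): number {
  return Math.ceil((dims * BITS_PER_VALUE[precision]) / 8);
}

const fullPrecision = bytesPerVector(2048, 'float32'); // 8192 bytes
const int8At1024 = bytesPerVector(1024, 'int8'); // 1024 bytes
const binaryAt512 = bytesPerVector(512, 'binary'); // 64 bytes
const float3072 = bytesPerVector(3072, 'float32'); // 12288 bytes

console.log(fullPrecision / int8At1024); // 8
console.log(float3072 / binaryAt512); // 192
```

The second ratio comes out to 192, which matches the roughly 200x figure quoted for binary 512-dimensional embeddings against 3072-dimensional float vectors.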
Voyage AI evaluates voyage-3-large across technical documentation, code, legal, financial, web, multilingual, long-document, and conversational domains. It outperforms Voyage AI's own domain-specific models on legal and financial retrieval tasks. That makes it a practical single-model choice when you need broad domain coverage without managing multiple specialized embedding endpoints.
What To Consider When Choosing a Provider
- Configuration: voyage-3-large offers four embedding dimensions and multiple quantization levels. Start with 1024-dimensional int8 embeddings for most production workloads. This configuration loses only 0.31% quality versus full precision at 2048 dimensions while using 8x less storage. Drop to 512 or 256 dimensions only when storage constraints are severe.
- Configuration: Confirm your vector database supports the embedding dimension and precision format you pick before indexing. Switching dimensions after indexing requires re-embedding your entire corpus.
- Configuration: voyage-3-large outperforms Voyage AI's own domain-specific models on legal and financial tasks. If your retrieval spans multiple domains, use voyage-3-large instead of running separate specialized models.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
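Because changing dimensions after indexing means re-embedding the corpus, it can help to check every vector against the index settings before writing it. The following is a rough sketch of such a guard. `IndexConfig`, `SUPPORTED_DIMENSIONS`, and `checkVector` are hypothetical names that are not tied to any particular vector database, and only float32 and int8 are covered for brevity.

```typescript
// Hypothetical description of an existing vector index; field names are
// illustrative and not tied to any specific vector database.
interface IndexConfig {
  dimensions: number;
  precision: 'float32' | 'int8';
}

// Output sizes listed for voyage-3-large on this page.
const SUPPORTED_DIMENSIONS = [2048, 1024, 512, 256];

// Returns a list of problems; an empty list means the vector can be indexed.
function checkVector(vector: number[], config: IndexConfig): string[] {
  const problems: string[] = [];
  if (!SUPPORTED_DIMENSIONS.includes(config.dimensions)) {
    problems.push(`index dimension ${config.dimensions} is not a listed output size`);
  }
  if (vector.length !== config.dimensions) {
    problems.push(`vector has ${vector.length} values, index expects ${config.dimensions}`);
  }
  const outOfRange = (x: number) => !Number.isInteger(x) || x < -128 || x > 127;
  if (config.precision === 'int8' && vector.some(outOfRange)) {
    problems.push('vector contains values outside the signed 8-bit range');
  }
  return problems;
}

const indexConfig: IndexConfig = { dimensions: 1024, precision: 'int8' };
const staleVector = new Array<number>(2048).fill(0); // embedded at the old size
const staleProblems = checkVector(staleVector, indexConfig);
console.log(staleProblems); // one entry: the length mismatch
```

Running a check like this in the ingestion path surfaces a dimension mismatch early instead of at query time.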
When to Use voyage-3-large
Best For
- General-purpose semantic search: One embedding model covers technical, legal, financial, and conversational content
- RAG pipelines: Large corpora where the 32,000-token context window fits longer documents and retrieval chunks without truncation
- Storage-sensitive deployments: Matryoshka dimensionality and quantization cut vector storage, by up to about 200x when binary 512-dimensional embeddings replace 3072-dimensional float vectors
- Multilingual retrieval: Applications that need a single embedding index across languages
- High-accuracy retrieval: The reported 9.74% average improvement over OpenAI text-embedding-3-large can translate to better recall in production
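For the semantic search and RAG cases above, retrieval usually comes down to ranking stored vectors by similarity to a query vector. The sketch below uses cosine similarity with a brute-force scan over toy 2-dimensional vectors. The `Doc` shape and `rankBySimilarity` helper are illustrative assumptions; a production system would typically delegate this step to a vector database.

```typescript
interface Doc {
  id: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as having no similarity to anything.
  return denom === 0 ? 0 : dot / denom;
}

// Brute-force ranking: score every document, sort descending, keep topK.
function rankBySimilarity(query: number[], docs: Doc[], topK: number): Doc[] {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((entry) => entry.doc);
}

// Toy vectors standing in for real embeddings.
const queryVector = [1, 0];
const corpus: Doc[] = [
  { id: 'a', embedding: [0.9, 0.1] },
  { id: 'b', embedding: [0, 1] },
  { id: 'c', embedding: [0.5, 0.5] },
];

const topDocs = rankBySimilarity(queryVector, corpus, 2);
console.log(topDocs.map((doc) => doc.id)); // [ 'a', 'c' ]
```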
Consider Alternatives When
- Cost is the primary constraint: voyage-3.5-lite offers high retrieval quality at a lower price point when accuracy requirements are moderate
- Your corpus is exclusively source code: voyage-code-3 is purpose-built for that domain
- You need multimodal embeddings: Cohere Embed v4 supports images, screenshots, and interleaved content natively
- Your workload is latency-sensitive: A lighter model reduces per-query response time at very high query volumes
Conclusion
voyage-3-large works well as a default for teams that need one high-accuracy embedding model across diverse retrieval domains. Its Matryoshka dimensionality and quantization options give you practical cost controls at scale. The 32,000-token context window handles longer documents without chunking compromises. Route it through AI Gateway for unified API access, usage tracking, and provider flexibility.