Voyage 4 Large
Voyage 4 Large is Voyage AI's Voyage 4 flagship embedding model. It uses a mixture-of-experts (MoE) architecture. Voyage AI reports state-of-the-art general retrieval in their published benchmarks, with serving costs about 40% lower than comparable dense models, and average gains over OpenAI text-embedding-3-large, Cohere Embed v4, and Gemini Embedding 001 in the same comparison. It shares one embedding space with voyage-4 and voyage-4-lite.
```typescript
import { embed } from 'ai';

const result = await embed({
  model: 'voyage/voyage-4-large',
  value: 'Sunny day at the beach',
});
```

What To Consider When Choosing a Provider
- Configuration: Voyage 4 Large targets teams that need top published scores and can pay for the flagship on the paths that matter (often document embedding).
- Configuration: Pair Voyage 4 Large document vectors with smaller Voyage 4 query models when query volume is high.
- Configuration: Treat a move to Voyage 4 as a new index. Test on a sample corpus before you re-embed everything.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
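Before committing to a full re-embed, the sample-corpus test above can be made concrete: run a set of representative queries against both the old and new indexes and measure how much the top-k results agree. A minimal sketch of the overlap metric, assuming you have already retrieved ranked document IDs from each index (the retrieval calls themselves are not shown):

```typescript
// Fraction of the old index's top-k results that the new index also returns.
// A hypothetical helper for spot-checking a migration on a sample corpus.
function overlapAtK(oldTopK: string[], newTopK: string[], k: number): number {
  const newSet = new Set(newTopK.slice(0, k));
  const hits = oldTopK.slice(0, k).filter((id) => newSet.has(id)).length;
  return hits / k;
}

// Example: 2 of the old top-3 IDs survive in the new top-3.
const score = overlapAtK(['doc-a', 'doc-b', 'doc-c'], ['doc-a', 'doc-c', 'doc-d'], 3);
console.log(score); // ~0.67
```

Low overlap is not necessarily bad (the new model may rank better), but large swings tell you to review relevance by hand before re-embedding everything.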
When to Use Voyage 4 Large
Best For
- Corpus embedding once: You want maximum document-side quality in Voyage AI's published Voyage 4 results
- Asymmetric RAG: Documents use Voyage 4 Large and queries use `voyage-4-lite`
- Enterprise search: Long documents within the 32K-token context window
- Upgrades from voyage-3-large: You accept a full re-embed for Voyage 4's shared space and MoE gains
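The asymmetric pattern above works because the Voyage 4 models share one embedding space: documents are embedded once with the flagship, queries with a lighter model. The sketch below uses a deterministic `fakeEmbed` placeholder so it runs offline; in production you would call `embed`/`embedMany` from the AI SDK with `voyage/voyage-4-large` for documents and `voyage/voyage-4-lite` for queries:

```typescript
// Deterministic placeholder standing in for real embedding calls
// ('voyage/voyage-4-large' for documents, 'voyage/voyage-4-lite' for queries).
function fakeEmbed(text: string, dim = 8): number[] {
  const v = new Array(dim).fill(0);
  for (let i = 0; i < text.length; i++) v[i % dim] += text.charCodeAt(i);
  return v;
}

// Cosine similarity between two vectors in the shared embedding space.
function cosineSim(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Index documents once (in production: voyage-4-large vectors).
const docs = ['surfing at sunrise', 'tax filing deadlines'];
const index = docs.map((text) => ({ text, vector: fakeEmbed(text) }));

// Embed each incoming query with the lighter model, then rank by similarity.
const queryVector = fakeEmbed('beach sports');
const ranked = [...index].sort(
  (a, b) => cosineSim(b.vector, queryVector) - cosineSim(a.vector, queryVector),
);
console.log(ranked[0].text);
```

Because the heavy model runs only at index time, per-query cost is set by the lighter model while document-side quality stays at flagship level.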
Consider Alternatives When
- Lower per-query cost: Use `voyage-4` or `voyage-4-lite` for queries, or symmetric indexing with a smaller Voyage 4 model
- Mid-tier symmetric use: `voyage-4` when you want one model for both sides
- Code-only corpora: Use `voyage-code-3` for repositories where source code is the primary content type
- Multimodal embeddings: Pick a model with native image inputs
Conclusion
Its MoE architecture gives Voyage 4 Large flagship retrieval accuracy at lower serving costs than dense alternatives. Use it for document embeddings and pair it with lighter Voyage 4 models on queries to optimize per-request spend through AI Gateway.
Frequently Asked Questions
What is the difference between Voyage 4 Large and voyage-4?
Voyage 4 Large is the MoE flagship with the highest average scores in Voyage AI's published Voyage 4 comparison. `voyage-4` is the mid-sized model. Both share the same embedding space as `voyage-4-lite`.
How does Voyage 4 Large compare to voyage-3-large?
Voyage AI reports better retrieval accuracy than voyage-3-large at a lower price, using MoE and the Voyage 4 training stack. Moving from Voyage 3 to Voyage 4 requires re-embedding because the embedding space changes.
What is the context window for Voyage 4 Large?
32K tokens. For long documents, size chunks so each request stays under this limit.
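One way to respect the window is to pre-chunk text by an approximate token budget before embedding. A rough sketch using the common ~4-characters-per-token heuristic (an approximation; for exact counts, use Voyage AI's own tokenizer):

```typescript
// Split text into chunks that stay under a token budget, using the rough
// heuristic of ~4 characters per token (an approximation, not exact counts).
function chunkByTokenBudget(
  text: string,
  maxTokens = 32_000,
  charsPerToken = 4,
): string[] {
  const maxChars = maxTokens * charsPerToken;
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}

// A 300,000-character document splits into 3 chunks at the default budget.
const chunks = chunkByTokenBudget('x'.repeat(300_000));
console.log(chunks.length); // 3
```

Real pipelines usually split on paragraph or sentence boundaries rather than raw character offsets, but the budget arithmetic is the same.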
When should I use Voyage 4 Large over voyage-4-lite?
Use Voyage 4 Large when you need the strongest published Voyage 4 vectors, especially for one-time or infrequent document embedding. Use `voyage-4-lite` when you want fewer parameters for queries or symmetric indexing at lower compute.
How do I access Voyage 4 Large through Vercel AI Gateway?
Add your Voyage AI API key in AI Gateway settings, then send embedding requests through AI Gateway. AI Gateway authenticates requests and records usage.
Do I need to re-embed my data to switch from voyage-3-large?
Yes. Moving from Voyage 3 to Voyage 4 requires re-embedding because the embedding space is new. Within Voyage 4, you can often keep `voyage-4-large` document vectors and change query models if you use asymmetric retrieval.
Is Voyage 4 Large suitable for RAG applications?
Yes. Voyage AI positions it for retrieval-augmented generation and high-accuracy document indexing, including asymmetric setups where queries use a smaller Voyage 4 model.
What is mixture-of-experts in Voyage 4 Large?
Voyage 4 Large routes tokens through expert subnetworks so Voyage AI can raise accuracy while reporting serving costs about 40% lower than comparable dense models.