Voyage 4 Lite
Voyage 4 Lite is the lightweight Voyage 4 model. Voyage AI reports it approaches voyage-3.5 retrieval accuracy with fewer parameters, shares one embedding space with voyage-4-large and voyage-4, and supports a context window of 32K tokens with Matryoshka dimensions and quantization like the rest of the family.
import { embed } from 'ai';

const result = await embed({
  model: 'voyage/voyage-4-lite',
  value: 'Sunny day at the beach',
});

Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
About Voyage 4 Lite
Voyage 4 Lite, released January 15, 2026, is a version of the Voyage 4 architecture with fewer parameters. The result is a model that processes tokens faster and more cheaply than its siblings while retaining enough retrieval quality for most production use cases.
Voyage AI benchmarks Voyage 4 Lite near voyage-3.5 retrieval accuracy. For teams running millions of daily requests or indexing large corpora on a budget, the per-token savings add up fast. Development and staging environments also benefit: cheaper iteration cycles let you experiment with chunking strategies and retrieval pipelines without burning through credits.
Because all Voyage 4 models produce compatible vectors, you aren't locked into Voyage 4 Lite for every step of your pipeline. Index your corpus with a stronger variant, then point live traffic at Voyage 4 Lite for lower query costs. No re-indexing required.
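As a rough illustration of the shared-space workflow described above, the sketch below ranks stored document vectors against a query vector using cosine similarity. It assumes the document vectors were produced at index time by a larger Voyage 4 model and the query vector by Voyage 4 Lite (for example via `embed` from the `ai` package). The three-dimensional toy vectors, the `StoredDoc` type, and the `cosine` and `rankByCosine` helpers are hypothetical names invented for this example, not part of any SDK.

```typescript
// Hypothetical shape for an indexed document; not an SDK type.
interface StoredDoc {
  id: string;
  vector: number[]; // assumed to come from a larger Voyage 4 model at index time
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored documents against a query vector, best match first.
function rankByCosine(
  queryVector: number[],
  docs: StoredDoc[],
): { id: string; score: number }[] {
  return docs
    .map((doc) => ({ id: doc.id, score: cosine(queryVector, doc.vector) }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors standing in for real embeddings.
const storedDocs: StoredDoc[] = [
  { id: 'beach', vector: [0.9, 0.1, 0.0] },
  { id: 'mountain', vector: [0.1, 0.9, 0.0] },
];
const queryVector = [1, 0, 0]; // assumed to come from Voyage 4 Lite at query time
const ranked = rankByCosine(queryVector, storedDocs);
```

The dimension check matters in practice: vectors from two models are only comparable here if both were requested at the same output dimension.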
What To Consider When Choosing a Provider
- Configuration: Use Voyage 4 Lite for queries when voyage-4-large already holds your document vectors, or as a budget option when both sides use the same model and your accuracy targets match Voyage AI's voyage-3.5 positioning. Plan a full re-embed when moving into Voyage 4, and test on a sample before indexing the full corpus.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
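The "test on a sample" advice in the Configuration item can be made concrete with a small recall@k check. The sketch below is one possible approach, not a method prescribed by Voyage AI or AI Gateway. It assumes you already have, for each sample query, the ranked document IDs returned by your retrieval pipeline and a hand-labeled set of relevant IDs. The `SampleResult` type and `recallAtK` function are hypothetical names.

```typescript
// Hypothetical record for one evaluated sample query.
interface SampleResult {
  retrievedIds: string[]; // ranked IDs from the pipeline, best first
  relevantIds: string[]; // hand-labeled relevant IDs for the query
}

// Mean recall@k: for each query, the fraction of its relevant documents
// that appear in the top k results, averaged over queries with labels.
function recallAtK(samples: SampleResult[], k: number): number {
  const labeled = samples.filter((s) => s.relevantIds.length > 0);
  if (labeled.length === 0) return 0;
  let total = 0;
  for (const s of labeled) {
    const topK = new Set(s.retrievedIds.slice(0, k));
    const hits = s.relevantIds.filter((id) => topK.has(id)).length;
    total += hits / s.relevantIds.length;
  }
  return total / labeled.length;
}

// Toy evaluation data standing in for a real labeled sample.
const evalSample: SampleResult[] = [
  { retrievedIds: ['a', 'b', 'c'], relevantIds: ['a'] },
  { retrievedIds: ['x', 'y', 'z'], relevantIds: ['z', 'q'] },
];
const recallAt2 = recallAtK(evalSample, 2);
const recallAt3 = recallAtK(evalSample, 3);
```

Running the same sample against two configurations (for example, Voyage 4 Lite on both sides versus Voyage 4 Lite queries against voyage-4-large documents) gives a like-for-like number before you commit to a full index.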
When to Use Voyage 4 Lite
Best For
- High query traffic: Pair Voyage 4 Lite queries with voyage-4-large document embeddings to keep per-query cost low without re-indexing
- Cost-sensitive symmetric indexing: Voyage 4 Lite on both sides when voyage-3.5-level retrieval accuracy is sufficient and per-token cost drives the decision
- Early production and prototypes: Iterate cheaply before upgrading query-side models once traffic patterns stabilize
- Batch jobs: Large-corpus indexing runs where per-token cost compounds across millions of requests
Consider Alternatives When
- Higher published average scores: Use voyage-4-large or voyage-4 when retrieval accuracy matters more than per-token cost
- Code-only corpora: Use voyage-code-3 for repositories where source code is the primary content type
- Multimodal embeddings: Use a model with native image inputs when you need to embed diagrams, screenshots, or mixed-format documents
Conclusion
Pick Voyage 4 Lite when your embedding bill scales with request volume and you need Voyage 4 generation quality at the tightest possible price point. Route through AI Gateway to swap between Voyage 4 tiers without changing your integration.
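One way to keep tier swaps to a configuration change is to resolve the model ID from a single setting. The sketch below is illustrative only: `voyage/voyage-4-lite` is the slug used in the example on this page, while `voyage/voyage-4` and `voyage/voyage-4-large` are assumed by analogy and should be checked against the AI Gateway model list. The `Voyage4Tier` type and `resolveModelId` helper are hypothetical names.

```typescript
// Hypothetical tier names for this example; not an SDK type.
type Voyage4Tier = 'lite' | 'standard' | 'large';

const MODEL_IDS: Record<Voyage4Tier, string> = {
  lite: 'voyage/voyage-4-lite', // slug shown in the example on this page
  standard: 'voyage/voyage-4', // assumed slug, by analogy
  large: 'voyage/voyage-4-large', // assumed slug, by analogy
};

// Map a configured tier name to a model ID, falling back to the lite
// tier when the setting is missing or unrecognized.
function resolveModelId(tier: string | undefined): string {
  if (tier === 'lite' || tier === 'standard' || tier === 'large') {
    return MODEL_IDS[tier];
  }
  return MODEL_IDS.lite;
}
```

The resolved string would then be passed as the `model` argument to `embed`, with the tier read from whatever configuration source your application already uses.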