Voyage 4 Lite

Voyage 4 Lite is the lightweight model in the Voyage 4 family. Voyage AI reports that it approaches voyage-3.5 retrieval accuracy with fewer parameters. It shares one embedding space with voyage-4-large and voyage-4, has a 32K-token context window, and supports Matryoshka dimensions and quantization like the rest of the family.

index.ts
import { embed } from 'ai';

// Embed a single string with Voyage 4 Lite through AI Gateway.
const result = await embed({
  model: 'voyage/voyage-4-lite',
  value: 'Sunny day at the beach',
});
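
The introduction mentions Matryoshka dimensions and quantization. As a rough client-side illustration of what those terms mean, the sketch below shortens a vector to its leading dimensions, rescales it to unit length, and maps it to 8-bit integers. The helper names `truncateEmbedding` and `quantizeInt8` are hypothetical, the quantization is a generic symmetric scheme rather than Voyage AI's own, and the provider may offer server-side options for both that are not shown here.

```typescript
// Sketch only: generic Matryoshka-style truncation and int8 quantization.
// Assumes the leading dimensions of the vector remain meaningful on their
// own, which is the property Matryoshka training is meant to provide.

// Keep the first `dims` components and rescale to unit length.
function truncateEmbedding(vector: number[], dims: number): number[] {
  if (!Number.isInteger(dims) || dims < 1 || dims > vector.length) {
    throw new RangeError(`dims must be an integer between 1 and ${vector.length}`);
  }
  const head = vector.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? head : head.map((x) => x / norm);
}

// Map components in [-1, 1] to signed 8-bit integers in [-127, 127].
// Unit-length vectors always satisfy that range; other inputs are clamped.
function quantizeInt8(unitVector: number[]): Int8Array {
  return Int8Array.from(unitVector, (x) =>
    Math.max(-127, Math.min(127, Math.round(x * 127))),
  );
}

// Toy three-dimensional vector standing in for a real embedding.
const shortened = truncateEmbedding([3, 4, 12], 2);
const compact = quantizeInt8(shortened);
console.log(shortened, compact);
```

Shorter or quantized vectors reduce storage and speed up similarity search at some cost in accuracy, so it is worth measuring retrieval quality at each setting before committing to one.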

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

| Provider | Context | Input | Release Date |
| --- | --- | --- | --- |
| Voyage AI | 32K | $0.02/M | 01/15/2026 |

More models by Voyage AI

| Model | Context | Input | Providers | Release Date |
| --- | --- | --- | --- | --- |
|  | 32K | $0.06/M | Voyage AI | 01/15/2026 |
|  | 32K | $0.12/M | Voyage AI | 01/15/2026 |
|  | 32K | $0.05/M | Voyage AI | 08/11/2025 |
|  | 32K | $0.02/M | Voyage AI | 08/11/2025 |
|  |  | $0.02/M | Voyage AI | 05/20/2025 |
|  |  | $0.06/M | Voyage AI | 05/20/2025 |

About Voyage 4 Lite

Released January 15, 2026, Voyage 4 Lite pares the Voyage 4 architecture down to fewer parameters. The result is a model that processes tokens faster and at lower cost than its siblings while retaining enough retrieval quality for most production use cases.

Voyage AI benchmarks Voyage 4 Lite near voyage-3.5 retrieval accuracy. For teams running millions of daily requests or indexing large corpora on a budget, the per-token savings add up fast. Development and staging environments also benefit: cheaper iteration cycles let you experiment with chunking strategies and retrieval pipelines without burning through credits.
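
To make the per-token arithmetic concrete, here is a small sketch that converts token volume into spend. The function name `embeddingCostUSD` and the workload figures are hypothetical; the $0.02/M input price is the one listed for Voyage 4 Lite on this page, and the $0.12/M comparison price is only a placeholder for a more expensive tier.

```typescript
// Sketch only: estimate embedding spend from token volume and a
// per-million-token price. Workload numbers below are made up.

function embeddingCostUSD(tokens: number, pricePerMillionUSD: number): number {
  if (tokens < 0 || pricePerMillionUSD < 0) {
    throw new RangeError('tokens and price must be non-negative');
  }
  return (tokens / 1e6) * pricePerMillionUSD;
}

// Hypothetical workload: 2 million queries per day at about 50 tokens each.
const dailyTokens = 2_000_000 * 50;

// $0.02/M is the listed Voyage 4 Lite input price.
const liteDaily = embeddingCostUSD(dailyTokens, 0.02);

// Placeholder price for a more expensive tier, used only for comparison.
const pricierDaily = embeddingCostUSD(dailyTokens, 0.12);

console.log(
  `lite: $${liteDaily.toFixed(2)}/day, pricier tier: $${pricierDaily.toFixed(2)}/day`,
);
```

Under these assumed numbers the daily bill is about $2 versus about $12, a gap that widens linearly with traffic.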

Because all Voyage 4 models produce compatible vectors, you aren't locked into Voyage 4 Lite for every step of your pipeline. Index your corpus with a stronger variant, then point live traffic at Voyage 4 Lite for lower query costs. No re-indexing required.
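
The mixed-model setup described above can be sketched as a search function that accepts a query vector from one model and document vectors from another, provided both belong to the same embedding space. Everything here is illustrative: `VectorIndex`, `search`, and `SHARED_SPACE` are hypothetical names, the toy two-dimensional vectors stand in for real embeddings, and the slugs other than `voyage/voyage-4-lite` are guessed by analogy and should be checked against the model listing.

```typescript
// Sketch only: query an index built with one Voyage 4 model using a
// query vector from another, guarded by a shared-space check.

// Assumed slugs. Only 'voyage/voyage-4-lite' appears on this page.
const SHARED_SPACE = new Set([
  'voyage/voyage-4-large',
  'voyage/voyage-4',
  'voyage/voyage-4-lite',
]);

interface VectorIndex {
  model: string; // model that produced the document vectors
  docs: { id: string; vector: number[] }[];
}

function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function search(
  index: VectorIndex,
  queryModel: string,
  queryVector: number[],
  topK: number,
): { id: string; score: number }[] {
  const compatible =
    index.model === queryModel ||
    (SHARED_SPACE.has(index.model) && SHARED_SPACE.has(queryModel));
  if (!compatible) {
    throw new Error(`${queryModel} vectors cannot query a ${index.model} index`);
  }
  return index.docs
    .map((doc) => ({ id: doc.id, score: cosine(queryVector, doc.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Documents embedded once with the larger model, query embedded with Lite.
const index: VectorIndex = {
  model: 'voyage/voyage-4-large',
  docs: [
    { id: 'beach', vector: [1, 0] },
    { id: 'mountain', vector: [0, 1] },
  ],
};
console.log(search(index, 'voyage/voyage-4-lite', [0.9, 0.1], 1));
```

Recording the producing model alongside the index makes it harder to mix in vectors from an incompatible model by accident.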

What To Consider When Choosing a Provider

  • Configuration: Use Voyage 4 Lite for queries when voyage-4-large already holds your document vectors, or as a budget option on both sides when voyage-3.5-level accuracy meets your targets. Plan a full re-embed when migrating an existing index to Voyage 4, and test on a sample before indexing the full corpus.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
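
The Configuration advice to test on a sample before indexing the full corpus could be implemented with a simple recall check. The sketch below is one possible metric, not a method prescribed by Voyage AI or AI Gateway; the function name `recallAtK` and the sample data are hypothetical, and the ranked lists are assumed to come from whatever retrieval pipeline is under test.

```typescript
// Sketch only: mean recall@k over a labelled sample of queries.
// rankedIds[q]   = document ids returned for query q, best first
// relevantIds[q] = document ids a human marked as relevant for query q

function recallAtK(rankedIds: string[][], relevantIds: string[][], k: number): number {
  if (rankedIds.length !== relevantIds.length) {
    throw new Error('rankedIds and relevantIds must have one entry per query');
  }
  let total = 0;
  let counted = 0;
  for (let q = 0; q < rankedIds.length; q++) {
    const relevant = new Set(relevantIds[q]);
    if (relevant.size === 0) continue; // skip queries with no labels
    const topK = new Set(rankedIds[q].slice(0, k));
    let hits = 0;
    for (const id of relevant) {
      if (topK.has(id)) hits++;
    }
    total += hits / relevant.size;
    counted++;
  }
  return counted === 0 ? 0 : total / counted;
}

// Made-up sample: two queries, top-2 results each.
const ranked = [
  ['a', 'b', 'c'],
  ['x', 'y', 'z'],
];
const relevant = [['a'], ['z', 'q']];
console.log(recallAtK(ranked, relevant, 2));
```

Running the same labelled sample against each candidate configuration gives a like-for-like comparison before paying to embed the whole corpus.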

When to Use Voyage 4 Lite

Best For

  • High query traffic: Pair Voyage 4 Lite queries with voyage-4-large document embeddings to keep per-query cost low without re-indexing
  • Cost-sensitive symmetric indexing: Use Voyage 4 Lite on both the document and query sides when voyage-3.5-level retrieval accuracy is sufficient and per-token cost drives the decision
  • Early production and prototypes: Iterate cheaply before upgrading query-side models once traffic patterns stabilize
  • Batch jobs: Large-corpus indexing runs where per-token cost compounds across millions of requests

Consider Alternatives When

  • Higher published average scores: Use voyage-4-large or voyage-4 when retrieval accuracy matters more than per-token cost
  • Code-only corpora: Use voyage-code-3 for repositories where source code is the primary content type
  • Multimodal embeddings: Use a model with native image inputs when you need to embed diagrams, screenshots, or mixed-format documents

Conclusion

Pick Voyage 4 Lite when your embedding bill scales with request volume and you need Voyage 4-generation quality at the tightest possible price point. Route through AI Gateway to swap between Voyage 4 tiers without changing your integration.