Voyage 3.5 Lite

Voyage 3.5 Lite is Voyage AI's cost-efficient embedding model with a context window of 32,000 tokens. It outperforms OpenAI text-embedding-3-large by 6.34% and achieves retrieval quality within 0.3% of Cohere Embed v4 at one-sixth the cost.

index.ts
import { embed } from 'ai';

// Embed a single string with the voyage/voyage-3.5-lite model slug.
const result = await embed({
  model: 'voyage/voyage-3.5-lite',
  value: 'Sunny day at the beach',
});

About Voyage 3.5 Lite

Voyage 3.5 Lite is Voyage AI's cost-efficient embedding model, released May 20, 2025. It supports a context window of 32,000 tokens and produces embeddings at four output dimensions: 2048, 1024, 512, and 256. Voyage 3.5 Lite outperforms OpenAI text-embedding-3-large by 6.34% and its predecessor voyage-3-lite by 4.28% on average across eight retrieval domains.
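To illustrate what the smaller output dimensions mean in practice, here is a minimal sketch of shortening an embedding on the client side. It assumes the usual Matryoshka property that the leading components of a vector form a usable lower-dimensional embedding once renormalized. The helper name `truncateEmbedding` and the synthetic input vector are hypothetical and not part of any SDK.

```typescript
// Output dimensions listed for Voyage 3.5 Lite on this page.
const SUPPORTED_DIMS = [2048, 1024, 512, 256] as const;
type SupportedDim = (typeof SUPPORTED_DIMS)[number];

// Hypothetical helper: keep the first `dims` components and rescale to unit length.
// Assumption: the embedding was trained Matryoshka-style, so its prefix is meaningful.
function truncateEmbedding(embedding: number[], dims: SupportedDim): number[] {
  if (embedding.length < dims) {
    throw new Error(`embedding has ${embedding.length} components, need at least ${dims}`);
  }
  const head = embedding.slice(0, dims);
  // Renormalize so that dot products between shortened vectors stay cosine similarities.
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) return head;
  return head.map((x) => x / norm);
}

// Synthetic 1024-component vector standing in for a real embedding.
const full = Array.from({ length: 1024 }, (_, i) => Math.sin(i + 1));
const short = truncateEmbedding(full, 256);
console.log(short.length);
```

Shorter vectors reduce storage and speed up similarity search, at some cost in retrieval quality that is worth measuring on your own data.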

Voyage 3.5 Lite achieves retrieval quality within 0.3% of Cohere Embed v4 at one-sixth the cost. That makes it a practical choice for high-volume embedding workloads where per-token pricing matters. It supports the same Matryoshka dimensionality and quantization-aware training as the premium voyage-3.5, including 32-bit float, 8-bit integer, and binary precision formats. Binary rescoring yields up to 6.89% quality improvement.
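One common way to use binary embeddings is a two-pass search: shortlist candidates cheaply with Hamming distance on the binary codes, then reorder the shortlist with full-precision scores. The sketch below illustrates that general idea only. The page does not describe how the 6.89% figure was measured, and every function name here is hypothetical rather than Voyage AI's implementation.

```typescript
// Pack the sign of each component into bits: 1 if the value is positive, else 0.
function toBinary(vec: number[]): Uint8Array {
  const bits = new Uint8Array(Math.ceil(vec.length / 8));
  vec.forEach((x, i) => {
    if (x > 0) bits[i >> 3] |= 1 << (7 - (i & 7));
  });
  return bits;
}

// Hamming distance: number of differing bits between two packed codes.
function hamming(a: Uint8Array, b: Uint8Array): number {
  let dist = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i];
    while (x) {
      dist += x & 1;
      x >>= 1;
    }
  }
  return dist;
}

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Pass 1: shortlist by Hamming distance. Pass 2: rescore the shortlist with float dot products.
// A real index would precompute the binary codes instead of calling toBinary per query.
function searchWithRescoring(query: number[], docs: number[][], shortlist: number, k: number): number[] {
  const q = toBinary(query);
  const candidates = docs
    .map((d, index) => ({ index, dist: hamming(q, toBinary(d)) }))
    .sort((a, b) => a.dist - b.dist)
    .slice(0, shortlist);
  return candidates
    .map(({ index }) => ({ index, score: dot(query, docs[index]) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((c) => c.index);
}

// Toy data: docs 0 and 1 share the query's sign pattern, so only rescoring separates them.
const query = [0.9, 0.1, -0.2, 0.4];
const docs = [
  [0.1, 0.9, -0.1, 0.1],
  [0.8, 0.2, -0.3, 0.5],
  [-0.9, -0.1, 0.2, -0.4],
];
console.log(searchWithRescoring(query, docs, 2, 1)); // [ 1 ]
```

In the toy data, the binary pass cannot tell the first two documents apart, and the float rescoring step picks the closer one.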

If you run large-scale RAG pipelines or semantic search over millions of documents, Voyage 3.5 Lite keeps embedding infrastructure affordable without dropping to a lower quality tier. Voyage AI recommends it for cost-sensitive production deployments.
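As a rough worked example of how per-token pricing adds up, the sketch below estimates the one-time cost of embedding a corpus at the $0.02 per million input tokens listed for Voyage AI on this page. The corpus size and average document length are hypothetical; substitute your own figures.

```typescript
// Input price listed for Voyage AI on this page, in USD per million tokens.
const PRICE_PER_MILLION_TOKENS_USD = 0.02;

// Estimated cost of embedding a corpus once.
function embeddingCostUsd(documentCount: number, avgTokensPerDocument: number): number {
  const totalTokens = documentCount * avgTokensPerDocument;
  return (totalTokens / 1_000_000) * PRICE_PER_MILLION_TOKENS_USD;
}

// Hypothetical corpus: 5 million documents averaging 400 tokens each (2 billion tokens).
const cost = embeddingCostUsd(5_000_000, 400);
console.log(cost.toFixed(2)); // "40.00"
```

Query traffic and any re-embedding after a model change add to this figure, so treat it as a lower bound.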

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider    Input     Release Date   Legal
Voyage AI   $0.02/M   05/20/2025     Terms, Privacy

Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

More models by Voyage AI

Context   Input     Release Date
32K       $0.05/M   08/11/2025
32K       $0.02/M   08/11/2025
-         $0.18/M   09/01/2024
32K       $0.02/M   -
32K       $0.12/M   -
32K       $0.06/M   -

What To Consider When Choosing a Provider

  • Pricing: Voyage 3.5 Lite costs one-third as much as voyage-3.5 while maintaining high retrieval quality. The accuracy gap is small enough that most production workloads will not see a meaningful difference in end-user outcomes.
  • Scale: If you embed millions of documents or process high query volumes, the per-token savings compound significantly. Voyage 3.5 Lite is designed for this scale.
  • Configuration: Voyage 3.5 Lite supports the same dimensionality and precision options as voyage-3.5. Combine reduced dimensions with int8 or binary precision for maximum cost savings on very large indices.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
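
To make the effect of combining reduced dimensions with lower precision concrete, the following sketch compares raw vector storage across configurations. It counts only the vector payload and ignores index overhead and metadata. The vector count is hypothetical, and the helper `indexSizeBytes` is not part of any SDK.

```typescript
type Precision = "float32" | "int8" | "binary";

// Bits stored per embedding component at each precision.
const BITS_PER_COMPONENT: Record<Precision, number> = { float32: 32, int8: 8, binary: 1 };

// Raw storage for the vectors alone, in bytes.
function indexSizeBytes(vectorCount: number, dims: number, precision: Precision): number {
  return (vectorCount * dims * BITS_PER_COMPONENT[precision]) / 8;
}

const vectors = 10_000_000; // hypothetical index size
const largest = indexSizeBytes(vectors, 2048, "float32"); // 81,920,000,000 bytes
const smallest = indexSizeBytes(vectors, 256, "binary"); // 320,000,000 bytes
console.log(largest / smallest); // 256
```

Going from 2048-dimensional float vectors to 256-dimensional binary vectors shrinks the payload by a factor of 256, which is why retrieval quality at the compressed setting is worth validating before committing to it.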

When to Use Voyage 3.5 Lite

Best For

  • High-volume RAG pipelines: Per-token embedding cost scales linearly with corpus size and query volume
  • Semantic search: Retrieval within 0.3% of Cohere Embed v4 at one-sixth the cost across large document collections
  • Startup and growth-stage teams: Production-quality embeddings without enterprise-tier pricing
  • Prototyping and experimentation: Lower cost enables faster iteration on retrieval strategies
  • Multi-domain retrieval: The same eight domains as voyage-3.5 at a lower price point

Consider Alternatives When

  • Retrieval accuracy is the main priority: voyage-3.5 delivers the highest retrieval accuracy among the general-purpose Voyage 3.5 models
  • Your corpus is exclusively source code: voyage-code-3 is purpose-built for that domain
  • You need multimodal embeddings: Voyage 3.5 Lite is text-only; pick a model with native image inputs when screenshots or diagrams go into the same index
  • You require the absolute lowest latency: A smaller, faster model better fits tight SLAs

Conclusion

Voyage 3.5 Lite balances per-token cost and retrieval quality in Voyage AI's embedding lineup. It outperforms OpenAI text-embedding-3-large by 6.34% and comes within 0.3% of Cohere Embed v4's retrieval quality at one-sixth the cost. If you embed at scale, it gives you production-grade retrieval without overcommitting on infrastructure costs. Access it through AI Gateway for unified provider management.

Frequently Asked Questions

  • How does Voyage 3.5 Lite compare to voyage-3.5?

    Voyage 3.5 Lite costs one-third the price of voyage-3.5. It achieves retrieval quality within 0.3% of Cohere Embed v4, while voyage-3.5 surpasses Cohere Embed v4 by 1.63%. For most production workloads, the accuracy difference is small relative to the cost savings.

  • What embedding dimensions does Voyage 3.5 Lite support?

    Four output dimensions: 2048, 1024, 512, and 256. These are the same Matryoshka dimensions as voyage-3.5, with the same quantization options for further storage savings.

  • Is Voyage 3.5 Lite suitable for production use?

    Yes. Voyage AI recommends it for cost-sensitive production deployments. It outperforms OpenAI text-embedding-3-large by 6.34% across eight retrieval domains.

  • What quantization formats does Voyage 3.5 Lite support?

    32-bit float, 8-bit integer, and binary precision. Binary rescoring yields up to 6.89% quality improvement, making aggressive compression practical at scale.

  • How much cheaper is Voyage 3.5 Lite than competitors?

    Voyage 3.5 Lite achieves retrieval quality within 0.3% of Cohere Embed v4 at one-sixth the cost. It also outperforms OpenAI text-embedding-3-large while maintaining a lower per-token price.

  • How do I route Voyage 3.5 Lite through Vercel AI Gateway?

    Add your Voyage AI API key in AI Gateway settings, then send embedding requests through AI Gateway. AI Gateway authenticates requests and records usage across embedding providers.

  • Can I migrate from voyage-3-lite to Voyage 3.5 Lite?

    Yes. Voyage 3.5 Lite outperforms voyage-3-lite by 4.28% on average. Migration requires re-embedding your corpus since the model weights are different.