
Voyage Rerank 2.5 Lite

Voyage Rerank 2.5 Lite is Voyage AI's generalist reranker balanced for latency and quality. It supports a context window of 32K tokens, instruction-following, and multilingual reranking. It improves accuracy by 7.16% over Cohere Rerank v3.5 across 93 retrieval datasets.

Rerank
index.ts
import { rerank } from 'ai';

const result = await rerank({
  model: 'voyage/rerank-2.5-lite',
  query: 'What is the capital of France?',
  documents: [
    'Paris is the capital of France.',
    'Berlin is the capital of Germany.',
    'Madrid is the capital of Spain.',
  ],
});
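
The call above resolves to a result object. As a rough sketch of what to do with it, the helper below keeps only the highest-scoring entries. The `RankedEntry` shape (`originalIndex`, `score`, `document`) is an assumption modeled on the AI SDK's `ranking` array, and `topMatches` and the sample scores are hypothetical; check the AI SDK reference for the exact fields before relying on them.

```typescript
// Assumed shape of one entry in the rerank result's `ranking` array.
interface RankedEntry<T> {
  originalIndex: number; // position of the document in the input array
  score: number; // relevance score; higher means more relevant
  document: T;
}

// Hypothetical helper: keep at most `limit` entries scoring at least `minScore`,
// ordered from most to least relevant.
function topMatches<T>(
  ranking: RankedEntry<T>[],
  limit: number,
  minScore = 0,
): RankedEntry<T>[] {
  return [...ranking]
    .filter((entry) => entry.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Illustrative scores only, not real model output.
const sampleRanking: RankedEntry<string>[] = [
  { originalIndex: 1, score: 0.21, document: 'Berlin is the capital of Germany.' },
  { originalIndex: 0, score: 0.93, document: 'Paris is the capital of France.' },
  { originalIndex: 2, score: 0.18, document: 'Madrid is the capital of Spain.' },
];

const best = topMatches(sampleRanking, 2, 0.2);
console.log(best.map((entry) => entry.document));
```

A score threshold like `minScore` is a design choice, not a requirement: it trades recall for precision, and a sensible cutoff depends on your own data.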

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider: Voyage AI (Legal: Terms, Privacy)
Context: 32K
Input: $0.02/M
Release Date: 08/11/2025

More models by Voyage AI

Context | Input | Providers | Release Date
32K | $0.05/M | Voyage AI | 08/11/2025
- | $0.18/M | Voyage AI | 09/01/2024
- | $0.12/M | Voyage AI | 03/01/2024
32K | $0.02/M | Voyage AI | -
32K | $0.12/M | Voyage AI | -
32K | $0.06/M | Voyage AI | -

About Voyage Rerank 2.5 Lite

Voyage Rerank 2.5 Lite is Voyage AI's generalist reranker released August 11, 2025, optimized for both latency and quality. It reorders candidate documents returned by a first-stage retriever, with a context window of 32K tokens, multilingual support, and the same instruction-following capability as rerank-2.5.

Across 93 retrieval datasets, Voyage Rerank 2.5 Lite improves accuracy by 7.16% over Cohere Rerank v3.5 when paired with four first-stage retrieval methods: BM25 lexical search, OpenAI text-embedding-3-large, voyage-3-large, and voyage-3.5. Averaged across those first-stage methods, Voyage Rerank 2.5 Lite outperforms Cohere Rerank v3.5, Qwen3-Reranker-8B, and rerank-2-lite by 1.93%, 1.01%, and 2.70% respectively on NDCG@10. It performs better than Qwen3-Reranker-8B, the strongest open-source reranker in the comparison, despite being over an order of magnitude smaller.
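
For readers unfamiliar with the metric quoted above, here is a minimal sketch of NDCG@10 using the common linear-gain formulation, DCG@k = Σ rel_i / log2(i + 1). Which gain variant Voyage AI used is not stated, so treat this as an illustration of the metric, not a reproduction of their evaluation. The function names and the relevance labels are hypothetical.

```typescript
// DCG@k with linear gain: sum of rel_i / log2(i + 1) over the top k
// positions, where i is the 1-based rank.
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

// NDCG@k: DCG of the actual ordering divided by DCG of the ideal ordering.
// Simplification: the ideal ordering is built from the judged list passed in,
// not from every relevant document in the corpus.
function ndcgAtK(relevances: number[], k = 10): number {
  const ideal = [...relevances].sort((a, b) => b - a);
  const idealDcg = dcgAtK(ideal, k);
  return idealDcg === 0 ? 0 : dcgAtK(relevances, k) / idealDcg;
}

// Relevance labels (1 = relevant, 0 = not) in ranked order.
const firstStageOrder = [0, 1, 0, 1]; // relevant documents at ranks 2 and 4
const rerankedOrder = [1, 1, 0, 0]; // reranker moved them to ranks 1 and 2

console.log(ndcgAtK(firstStageOrder).toFixed(3));
console.log(ndcgAtK(rerankedOrder).toFixed(3));
```

A reranker raises NDCG@10 only by moving relevant documents toward the top of the list; it cannot recover documents the first stage never returned.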

On the Massive Instructed Retrieval (MAIR) benchmark, Voyage Rerank 2.5 Lite outperforms Cohere Rerank v3.5 by 10.36%. Instruction-following lets you steer relevance scores using natural language without changing your retrieval index. The context window of 32K tokens matches rerank-2.5, so long query-document pairs fit a single rerank call.
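
A minimal sketch of steering relevance with an instruction, assuming the instruction is simply joined to the query text before the rerank call. The `Instruction:` and `Query:` labels and the `withInstruction` helper are hypothetical, not a documented format; consult Voyage AI's documentation for the recommended phrasing.

```typescript
// Hypothetical helper: prepend a natural-language instruction to the query.
// Returns the query unchanged when no instruction is given.
function withInstruction(query: string, instruction?: string): string {
  const trimmed = instruction?.trim();
  return trimmed ? `Instruction: ${trimmed}\nQuery: ${query}` : query;
}

const steeredQuery = withInstruction(
  'quarterly revenue guidance',
  'Prefer the most recent filings from official company sources.',
);

// Pass `steeredQuery` as the `query` of the rerank call. The candidate
// documents and the retrieval index stay unchanged.
console.log(steeredQuery);
```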

What To Consider When Choosing a Provider

  • Configuration: Voyage Rerank 2.5 Lite sits in the latency-and-quality balanced tier of the Voyage AI 2.5 reranker series. Pick it when per-query cost and response time matter and you can accept a small accuracy tradeoff versus rerank-2.5. Most production RAG pipelines fall in this category.
  • Configuration: If retrieval quality is the dominant constraint, rerank-2.5 is the quality-optimized tier of the same series. The two models share the same context window, instruction-following capability, and multilingual coverage.
  • Configuration: Voyage Rerank 2.5 Lite pairs with any first-stage retriever. Keep your existing embedding stack and add Voyage Rerank 2.5 Lite as a second-stage reranker to lift top-k quality without changing how documents are indexed.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
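
The second-stage pattern described in the list above can be sketched as a small function that takes any retriever and any reranker. Everything here is illustrative: the `Retriever` and `Reranker` types, `retrieveThenRerank`, and the stub implementations are hypothetical stand-ins. In a real pipeline the reranker would call Voyage Rerank 2.5 Lite through AI Gateway instead of counting overlapping words.

```typescript
type Retriever = (query: string, k: number) => Promise<string[]>;
type Reranker = (
  query: string,
  documents: string[],
) => Promise<{ index: number; score: number }[]>;

// Fetch a wide candidate set from the first stage, then let the reranker
// decide which `topK` documents to keep.
async function retrieveThenRerank(
  query: string,
  retrieve: Retriever,
  rerank: Reranker,
  candidateCount = 50,
  topK = 5,
): Promise<string[]> {
  const candidates = await retrieve(query, candidateCount);
  if (candidates.length === 0) return [];
  const scored = await rerank(query, candidates);
  return [...scored]
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((entry) => candidates[entry.index]);
}

// Stub first stage: returns the first k documents of a tiny corpus.
const corpus = [
  'Berlin is the capital of Germany.',
  'Paris is the capital of France.',
  'Madrid is the capital of Spain.',
];
const stubRetrieve: Retriever = async (_query, k) => corpus.slice(0, k);

// Stub reranker: scores each document by how many query words it contains.
const stubRerank: Reranker = async (query, documents) => {
  const terms = query.toLowerCase().match(/[a-z]+/g) ?? [];
  return documents.map((doc, index) => ({
    index,
    score: terms.filter((term) => doc.toLowerCase().includes(term)).length,
  }));
};

retrieveThenRerank('What is the capital of France?', stubRetrieve, stubRerank, 3, 1)
  .then((top) => console.log(top));
```

Because the retriever and reranker are passed in as functions, swapping the first-stage method or the reranking model does not touch the pipeline logic or the document index.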

When to Use Voyage Rerank 2.5 Lite

Best For

  • High-volume RAG pipelines: Per-query reranking cost scales with traffic, and Voyage Rerank 2.5 Lite balances accuracy with throughput
  • Latency-sensitive search: Customer-facing search where end-to-end response time matters and a reranker stage must stay fast
  • Instruction-driven relevance: Encode preferences such as recency or source authority in natural language without retraining
  • Multilingual reranking: One reranker covers retrieval across many languages
  • Cost-sensitive production: Strong reranking quality without the full price tier of rerank-2.5
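
To make the cost scaling concrete, here is a back-of-the-envelope estimate using the $0.02/M price listed on this page. The token-counting rule (query tokens counted once per document, plus the sum of document tokens) is an assumption about how rerank usage is metered; confirm it against current billing documentation. The function names and traffic figures are hypothetical.

```typescript
const PRICE_PER_MILLION_TOKENS_USD = 0.02; // price listed on this page

// Assumed metering rule: the query is counted once per candidate document.
function tokensPerRerankCall(queryTokens: number, documentTokens: number[]): number {
  const documentTotal = documentTokens.reduce((sum, tokens) => sum + tokens, 0);
  return queryTokens * documentTokens.length + documentTotal;
}

function monthlyCostUsd(tokensPerCall: number, callsPerMonth: number): number {
  return (tokensPerCall * callsPerMonth * PRICE_PER_MILLION_TOKENS_USD) / 1e6;
}

// Hypothetical workload: 20-token query, 50 candidates of 300 tokens each,
// one million rerank calls per month.
const perCall = tokensPerRerankCall(20, new Array<number>(50).fill(300));
const monthly = monthlyCostUsd(perCall, 1_000_000);
console.log(perCall, monthly.toFixed(2));
```

Under these assumptions each call consumes 16,000 tokens, which works out to about $320 per month. Halving the candidate count roughly halves the bill, so the size of the first-stage candidate set is the main cost lever.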

Consider Alternatives When

  • Top-tier accuracy is the priority: rerank-2.5 is the quality-optimized tier and improves on Cohere Rerank v3.5 by 7.94% versus 7.16% for Voyage Rerank 2.5 Lite
  • Open-source-only constraint: Qwen3-Reranker-8B is available for self-hosting, though Voyage Rerank 2.5 Lite outperforms it on Voyage's published benchmarks
  • Single-language English-only retrieval: A monolingual reranker may suffice when other languages are out of scope
  • No reranker stage needed: Strong first-stage retrievers like voyage-3.5 may meet your accuracy targets without a second pass

Conclusion

Voyage Rerank 2.5 Lite balances reranking quality and per-query cost in Voyage AI's 2.5 reranker series. Instruction-following, multilingual coverage, and a context window of 32K tokens let you upgrade RAG accuracy without retraining your retrieval stack. Route requests through AI Gateway to swap between Voyage Rerank 2.5 Lite and rerank-2.5 as your accuracy and cost targets evolve.

Frequently Asked Questions

  • What is the difference between Voyage Rerank 2.5 Lite and rerank-2.5?

    rerank-2.5 is the quality-optimized tier of the series and improves on Cohere Rerank v3.5 by 7.94%. Voyage Rerank 2.5 Lite is the latency-and-quality balanced tier and improves on Cohere Rerank v3.5 by 7.16%. Both support instruction-following, the same context window of 32K tokens, and multilingual retrieval.

  • How does Voyage Rerank 2.5 Lite compare to Qwen3-Reranker-8B?

    Voyage AI reports that Voyage Rerank 2.5 Lite performs better than Qwen3-Reranker-8B on the published benchmark suite despite being over an order of magnitude smaller. Averaged across four first-stage retrieval methods, Voyage Rerank 2.5 Lite outperforms Qwen3-Reranker-8B by 1.01% on NDCG@10.

  • What is instruction-following in Voyage Rerank 2.5 Lite?

    Instruction-following lets you steer relevance scores using natural language. You pass an instruction with the query and candidates, and Voyage Rerank 2.5 Lite adjusts scoring to reflect it. On the MAIR benchmark, this lifts Voyage Rerank 2.5 Lite 10.36% above Cohere Rerank v3.5.

  • What is the context window for Voyage Rerank 2.5 Lite?

    32K tokens. That is double the window of rerank-2 and eight times that of Cohere Rerank v3.5, so longer query-document pairs fit a single rerank call without truncation.

  • Which first-stage retrievers does Voyage Rerank 2.5 Lite work with?

    Voyage Rerank 2.5 Lite reorders candidates from any first-stage method. Voyage AI benchmarks it on BM25 lexical search, OpenAI text-embedding-3-large, voyage-3-large, and voyage-3.5. You can keep your existing embedding model and add Voyage Rerank 2.5 Lite as a second-stage reranker.

  • Does Voyage Rerank 2.5 Lite support multilingual retrieval?

    Yes. Voyage Rerank 2.5 Lite reranks across many languages without separate per-language models. Voyage AI reports consistent improvement across all evaluated languages and first-stage retrieval methods.

  • How do I access Voyage Rerank 2.5 Lite through Vercel AI Gateway?

    Add your Voyage AI API key in AI Gateway settings, then send rerank requests through AI Gateway. AI Gateway authenticates requests and records usage. You can call Voyage Rerank 2.5 Lite through the AI SDK alongside Chat Completions, Responses, and Messages API formats.

  • Is Zero Data Retention available for Voyage Rerank 2.5 Lite?

    Zero Data Retention is not currently available for this model. Zero Data Retention is offered on a per-provider basis. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.
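
Relating to the context-window answer above, the following sketch flags query-document pairs that are likely to exceed the window before you send a rerank call. It rests on two loud assumptions: that "32K" means 32,000 tokens, and that roughly four characters make one token, which is a crude heuristic that varies by language and tokenizer. `estimateTokens` and `splitByFit` are hypothetical helpers.

```typescript
const CONTEXT_WINDOW_TOKENS = 32_000; // assumption: "32K" read as 32,000 tokens

// Crude heuristic, not the model's tokenizer: about 4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Separate documents whose estimated query + document length fits the
// window from those that would probably be truncated or rejected.
function splitByFit(
  query: string,
  documents: string[],
  windowTokens = CONTEXT_WINDOW_TOKENS,
): { fits: string[]; tooLong: string[] } {
  const queryTokens = estimateTokens(query);
  const fits: string[] = [];
  const tooLong: string[] = [];
  for (const doc of documents) {
    const pairTokens = queryTokens + estimateTokens(doc);
    (pairTokens <= windowTokens ? fits : tooLong).push(doc);
  }
  return { fits, tooLong };
}

const fitCheck = splitByFit('What is the capital of France?', [
  'Paris is the capital of France.',
  'x'.repeat(200_000), // about 50,000 estimated tokens
]);
console.log(fitCheck.fits.length, fitCheck.tooLong.length);
```

Documents that land in `tooLong` are candidates for chunking before reranking; for exact counts, use the provider's tokenizer instead of the character heuristic.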