Cohere Rerank 3.5

Cohere Rerank 3.5 is a reranking model from Cohere that reorders retrieved English documents and semi-structured JSON by semantic relevance to a query, sharpening the top-k results in a RAG pipeline.

Rerank
index.ts
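import { rerank } from 'ai';

// Score each candidate document against the query and reorder by relevance.
const result = await rerank({
  model: 'cohere/rerank-v3.5',
  query: 'What is the capital of France?',
  // Candidates would normally come from a first-pass retriever.
  documents: [
    'Paris is the capital of France.',
    'Berlin is the capital of Germany.',
    'Madrid is the capital of Spain.',
  ],
});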

Frequently Asked Questions

  • What does Cohere Rerank 3.5 actually do?

    It takes a query and a list of candidate documents, scores each document for semantic relevance to the query, and returns the list reordered. The model itself doesn't retrieve; it refines results from a first-pass retriever like a vector index or BM25 search.
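As a rough illustration of that contract (query and candidates in, scored and reordered candidates out), here is a minimal sketch. `toyScore` and `toyRerank` are hypothetical names, and the term-overlap scorer is only a stand-in for the model, which scores semantic relevance rather than shared words.

```typescript
// Shape of one reranked entry in this sketch (an assumption, not the model's API).
type RankedDocument = { originalIndex: number; score: number; document: string };

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Toy stand-in for the model: fraction of distinct query terms found in the document.
function toyScore(query: string, document: string): number {
  const queryTerms = Array.from(new Set(tokenize(query)));
  if (queryTerms.length === 0) return 0;
  const docTerms = new Set(tokenize(document));
  const hits = queryTerms.filter((term) => docTerms.has(term)).length;
  return hits / queryTerms.length;
}

// Score every candidate, then return them sorted from most to least relevant.
function toyRerank(query: string, documents: string[]): RankedDocument[] {
  return documents
    .map((document, originalIndex) => ({
      originalIndex,
      score: toyScore(query, document),
      document,
    }))
    .sort((a, b) => b.score - a.score);
}

const ranked = toyRerank('capital of France', [
  'Berlin is the capital of Germany.',
  'Paris is the capital of France.',
]);
console.log(ranked[0].document); // prints: Paris is the capital of France.
```

Keeping `originalIndex` on each entry lets the caller map a reranked result back to whatever record the first-pass retriever returned.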

  • How does Cohere Rerank 3.5 differ from an embedding model?

    Embedding models produce a vector per document and a vector per query. Document vectors can be computed ahead of time, and relevance is then estimated from vector similarity at query time. Cohere Rerank 3.5 reads the query and a candidate document together through cross-attention, which catches relevance signals that bi-encoder similarity misses on complex queries. The tradeoff is that reranking runs once per candidate at query time, so it's used on a shortlist rather than the full corpus.
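The difference in shape can be sketched as follows. The vectors are made-up three-dimensional values rather than real embeddings, and `PairScorer` is a hypothetical type that only illustrates what a cross-encoder call looks like from the outside.

```typescript
type Vector = number[];

// Bi-encoder side: relevance is a cheap vector operation on independently computed vectors.
function cosineSimilarity(a: Vector, b: Vector): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Document vectors are computed once, before any query arrives (made-up values).
const documentVectors: Vector[] = [
  [1, 0, 0],
  [0.6, 0.8, 0],
  [0, 0, 1],
];

// At query time only the query is embedded; each comparison is plain arithmetic.
const queryVector: Vector = [1, 0, 0];
const similarities = documentVectors.map((v) => cosineSimilarity(queryVector, v));
console.log(similarities); // roughly [1, 0.6, 0]

// Cross-encoder side: the scorer needs the query and the document text together,
// so nothing can be precomputed per document (hypothetical signature).
type PairScorer = (query: string, document: string) => number;
```

The bi-encoder path scales to a full corpus because the per-document work is done up front, while a `PairScorer` has to run once for every query and candidate pair.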

  • What types of documents does Cohere Rerank 3.5 support?

    It supports English-language text, including long-form documents, semi-structured JSON, tables, and email-style records. The per-document context window is 4.1K tokens, shared between the query and the document.
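One way to reason about that shared budget is sketched below. It rests on two assumptions: the "4.1K" figure is read as 4,096 tokens, and token counts are estimated at about four characters per token, which is a crude heuristic and not the model's tokenizer. `splitToFit` is a hypothetical helper.

```typescript
const CONTEXT_TOKENS = 4096; // assumption: "4.1K" read as 4,096
const CHARS_PER_TOKEN = 4; // assumption: rough estimate, not the real tokenizer

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Split a document so that each chunk, together with the query, fits the estimated budget.
// Chunks are cut at fixed character offsets, so a cut can land mid-word.
function splitToFit(query: string, document: string): string[] {
  const budgetTokens = CONTEXT_TOKENS - estimateTokens(query);
  if (budgetTokens <= 0) throw new Error('Query alone exceeds the context window');
  const chunkChars = budgetTokens * CHARS_PER_TOKEN;
  if (document.length <= chunkChars) return [document];
  const chunks: string[] = [];
  for (let start = 0; start < document.length; start += chunkChars) {
    chunks.push(document.slice(start, start + chunkChars));
  }
  return chunks;
}

const longDocument = 'x'.repeat(40000);
console.log(splitToFit('What is the refund policy?', longDocument).length);
```

Because the query counts against the same window, a longer query leaves less room for each document chunk.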

  • How many documents should I send per rerank call?

    Typical pipelines retrieve 50 to 200 candidates from a first-pass index and rerank them down to a top-k of 5 to 20 for the generative model. The exact numbers depend on your latency budget and how noisy the first-pass retriever is.
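A minimal sketch of that two-stage shape follows. The retriever and reranker are passed in as plain functions, and every name here (`retrieveThenRerank`, `PipelineOptions`, and so on) is hypothetical. Real retrieval and rerank calls would be asynchronous; they are kept synchronous to keep the sketch short. The defaults of 100 candidates and a top-k of 10 are simply values inside the ranges above.

```typescript
type Candidate = { id: string; text: string };
type ScoredCandidate = { candidate: Candidate; score: number };
type FirstPassRetriever = (query: string, limit: number) => Candidate[];
type CandidateReranker = (query: string, candidates: Candidate[]) => ScoredCandidate[];

interface PipelineOptions {
  candidateCount: number; // how many candidates the first pass returns
  topK: number; // how many survive reranking
}

function retrieveThenRerank(
  query: string,
  retrieve: FirstPassRetriever,
  rerankCandidates: CandidateReranker,
  options: PipelineOptions = { candidateCount: 100, topK: 10 },
): Candidate[] {
  // Stage 1: cheap, broad retrieval over the full corpus.
  const shortlist = retrieve(query, options.candidateCount);
  // Stage 2: score only the shortlist, then keep the best topK.
  return rerankCandidates(query, shortlist)
    .slice()
    .sort((a, b) => b.score - a.score)
    .slice(0, options.topK)
    .map((scored) => scored.candidate);
}
```

Raising `candidateCount` gives the reranker more chances to recover a document the first pass ranked poorly, at the cost of more scoring work per query.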

  • Can I use Cohere Rerank 3.5 without a separate embedding model?

    Yes. You can pair it with any retriever, including BM25 or hybrid search. An embedding model is the most common first stage, but Cohere Rerank 3.5 only needs a candidate set, not a specific retrieval method.

  • How much does Cohere Rerank 3.5 cost on AI Gateway?

    Reranking is billed per search query. See the pricing section on this page for the current per-query rate on AI Gateway.

  • Does Cohere Rerank 3.5 support Zero Data Retention?

    Yes. Zero Data Retention is available for this model and is offered on a per-provider basis. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.