
Cohere Rerank 4 Pro

Cohere Rerank 4 Pro is a multilingual reranking model from Cohere built for state-of-the-art relevance on complex queries over English and non-English documents and semi-structured JSON.

Rerank
index.ts
import { rerank } from 'ai';

const result = await rerank({
  model: 'cohere/rerank-v4-pro',
  query: 'What is the capital of France?',
  documents: [
    'Paris is the capital of France.',
    'Berlin is the capital of Germany.',
    'Madrid is the capital of Spain.',
  ],
});
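
Once the call returns, the ranked entries usually need filtering before use. The following is a minimal sketch, assuming the result exposes a `ranking` array whose entries carry `originalIndex`, `score`, and `document`; verify the exact shape against the AI SDK reference. The `RankedEntry` type and the `keepRelevant` helper are hypothetical names introduced here for illustration, and the sample scores are made up rather than real model output.

```typescript
// Assumed shape of one ranking entry; verify against the AI SDK reference.
interface RankedEntry<T> {
  originalIndex: number; // position of the document in the input array
  score: number;         // relevance score, higher means more relevant
  document: T;
}

// Hypothetical helper: keep entries at or above a score threshold, best first.
function keepRelevant<T>(ranking: RankedEntry<T>[], minScore: number): RankedEntry<T>[] {
  return ranking
    .filter((entry) => entry.score >= minScore)
    .sort((a, b) => b.score - a.score);
}

// Illustrative scores only, not real model output.
const sampleRanking: RankedEntry<string>[] = [
  { originalIndex: 1, score: 0.02, document: 'Berlin is the capital of Germany.' },
  { originalIndex: 0, score: 0.98, document: 'Paris is the capital of France.' },
  { originalIndex: 2, score: 0.01, document: 'Madrid is the capital of Spain.' },
];

const relevant = keepRelevant(sampleRanking, 0.5);
console.log(relevant.map((entry) => entry.document));
```

The threshold of 0.5 is arbitrary; a sensible cutoff depends on the query mix and should be tuned on real traffic.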

About Cohere Rerank 4 Pro

Cohere Rerank 4 Pro is the quality tier of the Rerank 4 generation, released on December 11, 2025 alongside rerank-v4-fast. Cohere positions it as its strongest reranker yet, aimed at enterprise search and RAG pipelines where ranking accuracy on complex queries directly drives downstream outcomes.

Reranking is a cross-encoder step. Cohere Rerank 4 Pro reads the query and each candidate document together with full attention, scoring relevance in a way that bi-encoder embedding similarity cannot. Multi-part queries, queries with conditions, and queries that hinge on a single phrase inside a long document benefit most from this setup.

Multilingual coverage spans more than 100 languages. A query in one language can match documents in another within the same rerank call, so a single index can serve a global user base without separate per-language pipelines. Document types include long-form text, tables, code, and semi-structured JSON.
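
The example call above passes documents as strings, so semi-structured records need a text form before they are reranked. The following is a minimal sketch, assuming a flat, YAML-like "key: value" layout is a reasonable serialization; that layout is an assumption here, so check Cohere's guidance on structured data. `JsonValue` and `recordToText` are hypothetical names introduced for illustration.

```typescript
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

// Hypothetical helper: flatten a JSON record into indented "key: value" lines.
function recordToText(record: { [key: string]: JsonValue }, indent = 0): string {
  const pad = '  '.repeat(indent);
  const lines: string[] = [];
  for (const [key, value] of Object.entries(record)) {
    if (Array.isArray(value)) {
      // Arrays are joined on one line; objects inside arrays fall back to JSON.
      const items = value.map((item) =>
        item !== null && typeof item === 'object' ? JSON.stringify(item) : String(item),
      );
      lines.push(`${pad}${key}: ${items.join(', ')}`);
    } else if (value !== null && typeof value === 'object') {
      // Nested objects get their own indented block.
      lines.push(`${pad}${key}:`);
      lines.push(recordToText(value, indent + 1));
    } else {
      lines.push(`${pad}${key}: ${String(value)}`);
    }
  }
  return lines.join('\n');
}

// Made-up product record for illustration.
const product: { [key: string]: JsonValue } = {
  title: 'Trail Shoe',
  price: 89.5,
  tags: ['running', 'outdoor'],
  specs: { weight_g: 240, waterproof: true },
};

const productText = recordToText(product);
console.log(productText);
```

Putting the fields most relevant to the query first is a cheap precaution in case long records get truncated.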

In a RAG pipeline, the common pattern is to retrieve 50 to 200 candidates with an embedding model, then rerank them with Cohere Rerank 4 Pro down to a top-k of 5 to 20 documents handed to the generative model. The reranker reduces noise in the LLM context, which often improves answer quality more than swapping in a larger generative model would.
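
The retrieve-then-rerank pattern described above can be sketched as two small functions. This is a sketch under stated assumptions, not a definitive implementation: `Candidate`, `ScoreFn`, `retrieve`, and `retrieveAndRerank` are hypothetical names, embeddings are assumed to be precomputed, and `overlapScore` is a toy word-overlap stand-in for the reranker so the example runs without a network call.

```typescript
interface Candidate {
  id: string;
  text: string;
  embedding: number[]; // assumed to be precomputed by an embedding model
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Stage 1: cheap vector retrieval of the top `candidateCount` documents.
function retrieve(queryEmbedding: number[], corpus: Candidate[], candidateCount: number): Candidate[] {
  return corpus
    .map((doc) => ({ doc, sim: cosine(queryEmbedding, doc.embedding) }))
    .sort((a, b) => b.sim - a.sim)
    .slice(0, candidateCount)
    .map(({ doc }) => doc);
}

// Hypothetical reranker signature: one relevance score per text, in input order.
type ScoreFn = (query: string, texts: string[]) => Promise<number[]>;

// Stage 2: rerank the candidates and keep the top `topK` for the LLM context.
async function retrieveAndRerank(
  query: string,
  queryEmbedding: number[],
  corpus: Candidate[],
  score: ScoreFn,
  candidateCount = 100,
  topK = 10,
): Promise<Candidate[]> {
  const candidates = retrieve(queryEmbedding, corpus, candidateCount);
  const scores = await score(query, candidates.map((c) => c.text));
  return candidates
    .map((doc, i) => ({ doc, score: scores[i] }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(({ doc }) => doc);
}

// Toy stand-in for the reranker: counts query words that appear in each text.
const overlapScore: ScoreFn = async (query, texts) => {
  const terms = new Set(query.toLowerCase().split(/\s+/));
  return texts.map(
    (text) => text.toLowerCase().split(/\s+/).filter((word) => terms.has(word)).length,
  );
};
```

In a real pipeline, the `score` argument would wrap the rerank call and map its ranking back to input order. Injecting it as a function keeps the two stages independently testable and makes it easy to compare rerankers.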

See https://cohere.com/blog/rerank-4 for the API contract. Reranking is billed per search query, so cost scales with traffic rather than document length or token count.