Cohere Rerank 4 Pro
Cohere Rerank 4 Pro is a multilingual reranking model from Cohere built for state-of-the-art relevance on complex queries over English and non-English documents and semi-structured JSON.
import { rerank } from 'ai';
const result = await rerank({ model: 'cohere/rerank-v4-pro', query: 'What is the capital of France?', documents: [ 'Paris is the capital of France.', 'Berlin is the capital of Germany.', 'Madrid is the capital of Spain.', ],})About Cohere Rerank 4 Pro
Cohere Rerank 4 Pro is the quality tier of the Rerank 4 generation, released December 11, 2025 alongside rerank-v4-fast. Cohere positions it as the strongest Cohere reranker yet, aimed at enterprise search and RAG pipelines where ranking accuracy on complex queries directly drives downstream outcomes.
Reranking is a cross-encoder step. Cohere Rerank 4 Pro reads the query and each candidate document together with full attention, scoring relevance in a way that bi-encoder embedding similarity cannot. Multi-part queries, queries with conditions, and queries that hinge on a single phrase inside a long document benefit most from this setup.
Multilingual coverage spans more than 100 languages. A query in one language can match documents in another within the same rerank call, so a single index can serve a global user base without separate per-language pipelines. Document types include long-form text, tables, code, and semi-structured JSON.
In a RAG pipeline, the common pattern is to retrieve 50 to 200 candidates with an embedding model, then rerank them with Cohere Rerank 4 Pro down to a top-k of 5 to 20 documents handed to the generative model. The reranker reduces noise in the LLM context, which often improves answer quality more than swapping in a larger generative model would.
See https://cohere.com/blog/rerank-4 for the API contract. Reranking is billed per search query, so cost scales with traffic rather than document length or token count.