Embed v4.0

Embed v4.0 is a multimodal embedding model from Cohere that converts text, images, or mixed content into vector representations for classification and semantic search.

index.ts
import { embed } from 'ai';

// Embed a single text value through AI Gateway.
const result = await embed({
  model: 'cohere/embed-v4.0',
  value: 'Sunny day at the beach',
});

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider: Cohere
Input: $0.12/M
Release Date: 04/15/2025
Legal: Terms, Privacy

About Embed v4.0

Embed v4.0 is Cohere's fourth-generation embedding model, released April 15, 2025. It reaches a 65.2 MTEB score, ahead of OpenAI's text-embedding-3-large (64.6). Beyond text-only retrieval, it embeds interleaved text and images in the same vector space. You can index screenshots of PDFs, slides, figures, and tables directly alongside text documents without converting visual content to text first. That removes a common preprocessing step in document-heavy RAG pipelines.

Embed v4.0 supports four output dimensions: 256, 512, 1,024, and 1,536 (the default). Its Matryoshka-style nested representations let you truncate a full-resolution embedding to a smaller size with limited quality loss. That lets you tune the cost-versus-accuracy tradeoff at query time or build tiered retrieval systems: use truncated vectors for lightweight candidate retrieval and full-resolution embeddings for re-ranking.
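
The tiered retrieval idea can be sketched in a few lines. This is a minimal illustration, not Cohere's or AI Gateway's implementation: it assumes embeddings are already available as plain number arrays, and every name in it (normalize, truncate, dot, tieredSearch, Doc) is hypothetical. Re-normalizing after truncation is a common convention for cosine similarity rather than something this page specifies.

```typescript
// Illustrative sketch only: vectors are plain number[] values that you have
// already obtained from the model. All names here are hypothetical.

/** Return a unit-length copy of `v` (a zero vector is returned unchanged). */
function normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? [...v] : v.map((x) => x / norm);
}

/** Keep the first `dim` components, then re-normalize the shorter vector. */
function truncate(v: number[], dim: number): number[] {
  if (!Number.isInteger(dim) || dim < 1 || dim > v.length) {
    throw new RangeError(`dim must be an integer in 1..${v.length}, got ${dim}`);
  }
  return normalize(v.slice(0, dim));
}

/** Dot product; equals cosine similarity when both inputs are unit length. */
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

interface Doc {
  id: string;
  embedding: number[]; // full-resolution vector
}

/** Stage 1: shortlist with truncated vectors. Stage 2: re-rank with full vectors. */
function tieredSearch(
  query: number[],
  docs: Doc[],
  coarseDim: number,
  shortlistSize: number,
  topK: number,
): string[] {
  const coarseQuery = truncate(query, coarseDim);
  const shortlist = docs
    .map((doc) => ({ doc, score: dot(coarseQuery, truncate(doc.embedding, coarseDim)) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, shortlistSize);

  const fullQuery = normalize(query);
  return shortlist
    .map(({ doc }) => ({ id: doc.id, score: dot(fullQuery, normalize(doc.embedding)) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((hit) => hit.id);
}
```

In a real system the truncated vectors would typically be precomputed and stored in the vector index instead of being recomputed per query; the brute-force scan here only keeps the example self-contained.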

Cohere describes the model as helping organizations "securely retrieve their multimodal data to build agentic AI applications." Its multimodal input coverage (text, images, and interleaved combinations) fits knowledge bases with mixed-format assets: technical documentation with embedded diagrams, investor presentations, and research reports with figures. You don't need separate embedding models per content type.

Embedding input is billed at $0.12 per million tokens at listed AI Gateway rates. See https://docs.cohere.com/docs/cohere-embed for request formats and limits.
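
As a rough budgeting aid, the listed rate translates into cost as shown below. This is a sketch under stated assumptions: the function and constant names are hypothetical, the token count is an estimate you supply yourself, and the rate is the $0.12 per million tokens listed on this page, which may change.

```typescript
// Listed rate from this page; treat it as a snapshot, not a guarantee.
const USD_PER_MILLION_TOKENS = 0.12;

/**
 * Estimate embedding cost in USD for a given number of input tokens.
 * `totalTokens` is the caller's own estimate (hypothetical helper).
 */
function estimateEmbeddingCostUsd(
  totalTokens: number,
  usdPerMillion: number = USD_PER_MILLION_TOKENS,
): number {
  if (!Number.isFinite(totalTokens) || totalTokens < 0) {
    throw new RangeError(`totalTokens must be a non-negative number, got ${totalTokens}`);
  }
  return (totalTokens / 1_000_000) * usdPerMillion;
}

// Example: 10,000 documents at an assumed average of 500 tokens each.
const exampleCost = estimateEmbeddingCostUsd(10_000 * 500);
console.log(`Estimated cost: $${exampleCost.toFixed(2)}`);
```

The example corpus works out to 5 million tokens, so the estimate is about $0.60 at the listed rate.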

What to Consider When Choosing a Provider

  • Configuration: Before you index large document corpora, confirm your vector database supports the embedding dimension you select. Embed v4.0 offers four output sizes: 256, 512, 1,024, or 1,536.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
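
The configuration check above can be enforced in code before any indexing starts. The sketch below is one possible guard, with hypothetical names (SUPPORTED_DIMENSIONS, isSupportedDimension, assertIndexDimension); it only encodes the four sizes listed on this page and does not talk to any vector database.

```typescript
// The four output sizes listed on this page.
const SUPPORTED_DIMENSIONS = [256, 512, 1024, 1536] as const;
type EmbedDimension = (typeof SUPPORTED_DIMENSIONS)[number];

/** Type guard: true when `dim` is one of the listed output sizes. */
function isSupportedDimension(dim: number): dim is EmbedDimension {
  return (SUPPORTED_DIMENSIONS as readonly number[]).includes(dim);
}

/**
 * Fail fast if the dimension configured on the vector index does not match
 * a size the model can produce. Returns the validated dimension.
 */
function assertIndexDimension(indexDim: number): EmbedDimension {
  if (!isSupportedDimension(indexDim)) {
    throw new RangeError(
      `Index dimension ${indexDim} is not one of ${SUPPORTED_DIMENSIONS.join(", ")}`,
    );
  }
  return indexDim;
}
```

Calling assertIndexDimension with the dimension configured on your index at startup surfaces a mismatch before a large corpus is embedded.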

When to Use Embed v4.0

Best For

  • Mixed-format RAG: PDFs with charts, slide decks, and tables share a unified embedding space without separate OCR or image description pipelines
  • Multimodal semantic search: Image queries match text results and text queries surface image-based content
  • Storage-sensitive indexes: Matryoshka truncation reduces dimensions (for example, 1,536 to 256) with a controlled accuracy tradeoff
  • Multilingual retrieval: Cross-lingual similarity matching works within a single index
  • Agentic search systems: One embedding endpoint covers diverse document types like reports, spreadsheets, and slides
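
To make the storage tradeoff concrete, the arithmetic below compares raw vector storage at two dimensions. It is a back-of-the-envelope sketch: the helper name is hypothetical, it assumes 4-byte (float32) components, and it ignores index overhead and metadata, which vary by vector database.

```typescript
/**
 * Raw storage for `vectorCount` embeddings of size `dimension`.
 * Assumes 4 bytes per component (float32) unless told otherwise.
 */
function rawIndexSizeBytes(
  vectorCount: number,
  dimension: number,
  bytesPerComponent: number = 4,
): number {
  return vectorCount * dimension * bytesPerComponent;
}

// One million vectors at full resolution versus truncated to 256 dimensions.
const fullBytes = rawIndexSizeBytes(1_000_000, 1536); // 6,144,000,000 bytes
const smallBytes = rawIndexSizeBytes(1_000_000, 256); // 1,024,000,000 bytes
console.log(`Reduction factor: ${fullBytes / smallBytes}`); // 6
```

Under these assumptions, truncating from 1,536 to 256 dimensions cuts raw vector storage by a factor of six.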

Consider Alternatives When

  • Text-only corpus: If your corpus has no visual content, a text-only embedding model simplifies integration
  • Maximum throughput priority: A smaller or quantized model may deliver higher throughput and lower cost, at some loss of accuracy
  • Unsupported dimensions: Your vector database requires embedding dimensions outside the four sizes (256, 512, 1,024, 1,536) that Embed v4.0 supports

Conclusion

Embed v4.0 brings text and visual content into a shared embedding space for enterprise retrieval. Its 65.2 MTEB score and configurable Matryoshka dimensions give you one model that balances retrieval accuracy with storage and latency constraints across diverse knowledge bases. Route it through AI Gateway with the model ID cohere/embed-v4.0.

Frequently Asked Questions

  • What makes Embed v4.0 a multimodal embedding model?

    It embeds text, images, and interleaved text-image content in one unified vector space. A screenshot of a PDF slide and a text query about its contents can be compared directly without preprocessing the image into text.

  • What is the MTEB score for Embed v4.0?

    Embed v4.0 scores 65.2 on MTEB (Massive Text Embedding Benchmark), above OpenAI's text-embedding-3-large (64.6).

  • What are Matryoshka embeddings and how do they work in Embed v4.0?

    Matryoshka embeddings let you truncate a full-dimension vector to a smaller size with limited quality loss. For example, you can store embeddings at 256 dimensions instead of 1,536. This enables tiered retrieval systems and storage savings without retraining the model or re-embedding your documents.

  • What embedding dimensions does Embed v4.0 support?

    Four output dimensions: 256, 512, 1,024, and 1,536. The default is 1,536. Choose 512 or 1,024 to balance accuracy with storage cost.

  • Can Embed v4.0 embed visual content from enterprise documents?

    Yes. It captures visual features from screenshots of PDFs, slides, tables, and figures. You can index visual enterprise content alongside text directly.

  • Which document types benefit most from Embed v4.0's multimodal capabilities?

    Technical documentation with embedded diagrams, investor presentations, research reports with charts and figures, and any mixed-format knowledge base where OCR-based text extraction would lose structural or visual context.

  • Does Embed v4.0 support multilingual retrieval?

    Yes. Cohere trained it for English and multilingual settings. You can run cross-lingual similarity search within the same vector index.

  • How much does Embed v4.0 cost on AI Gateway?

    Embedding input is listed at $0.12 per million tokens through Cohere. AI Gateway routes across providers, and rates may vary by provider.