
Mistral Embed

mistral/mistral-embed

Mistral Embed is Mistral AI's general-purpose text embedding model with 1024 dimensions, designed for semantic search and retrieval tasks with a 55.26 score on the MTEB benchmark.

index.ts

import { embed } from 'ai';

const { embedding } = await embed({
  model: 'mistral/mistral-embed',
  value: 'Sunny day at the beach',
});

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

If your corpus is primarily source code rather than natural language, consider Codestral Embed, which was trained specifically on code and outperforms general embedding models on code retrieval benchmarks.

When to Use Mistral Embed

Best For

  • Semantic search:

    Retrieval over natural-language document collections where lexical search falls short

  • RAG pipelines:

    Pair embedding with Mistral AI generation models

  • Document similarity and clustering:

    Grouping and deduplicating content for organization or analytics

  • Recommendation systems:

    Recommender architectures based on textual content similarity

  • Multilingual retrieval:

    Covering European languages supported by the Mistral AI ecosystem
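The retrieval, similarity, and recommendation use cases above all reduce to nearest-neighbor search over embedding vectors. A minimal sketch of the ranking step, with toy 3-dimensional vectors standing in for Mistral Embed's 1024-dimensional output (the function names here are illustrative, not part of the SDK):

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored documents by similarity to a query embedding, highest first.
function rankBySimilarity(
  query: number[],
  docs: { id: string; vector: number[] }[],
): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.vector) }))
    .sort((x, y) => y.score - x.score);
}

const ranked = rankBySimilarity(
  [1, 0, 0],
  [
    { id: 'a', vector: [0, 1, 0] },
    { id: 'b', vector: [1, 0.1, 0] },
  ],
);
console.log(ranked[0].id); // 'b' — closest in direction to the query
```

In production you would obtain the vectors from the embedding API, store them in a vector database, and use an approximate nearest-neighbor index rather than a linear scan.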

Consider Alternatives When

  • Source code corpus:

    Use Codestral Embed, which is specialized for code

  • Variable-dimension embeddings:

    Mistral Embed outputs fixed 1024-dimensional vectors; if you need truncatable (e.g. Matryoshka-style) embeddings to trade retrieval quality for storage cost, choose a model that supports them

  • Domain-specific retrieval:

    Highly specialized text may benefit from fine-tuned embeddings
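To see why fixed dimensionality matters for storage cost, a back-of-the-envelope estimate, assuming float32 storage and ignoring index overhead:

```typescript
// Each Mistral Embed vector has 1024 dimensions; stored as float32
// (4 bytes per dimension), that is 4 KiB per document.
const dims = 1024;
const bytesPerVector = dims * 4; // 4096 bytes
const docs = 1_000_000;
const totalGiB = (docs * bytesPerVector) / 1024 ** 3;
console.log(bytesPerVector, totalGiB.toFixed(2)); // 4096, "3.81" GiB for 1M docs
```

A model with truncatable embeddings could roughly halve this at 512 dimensions, at some cost in retrieval quality.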

Conclusion

Mistral Embed is a general-purpose retrieval foundation for Mistral-based stacks. Its 1024-dimensional representations and MTEB-evaluated quality make it a solid choice for teams building semantic search and RAG systems that want to keep their provider footprint within the Mistral AI ecosystem.

FAQ

How many dimensions do Mistral Embed vectors have?

1024 dimensions per embedding vector.

What is Mistral Embed's MTEB score?

55.26 on retrieval tasks within the Massive Text Embedding Benchmark (MTEB).

What tasks is Mistral Embed optimized for?

Retrieval. The embedding space supports accurate nearest-neighbor search for semantic retrieval tasks.

Does Mistral Embed support multiple languages?

Yes. Mistral's models broadly support European languages; check Mistral AI's documentation for specific language coverage in the embedding model.

How does Mistral Embed compare to Codestral Embed?

Mistral Embed is a general-purpose text embedding model. Codestral Embed was trained specifically for code and outperforms general models on code retrieval benchmarks. Use Codestral Embed when your corpus is source code.

Can I pair Mistral Embed with Mistral's generation models?

Yes. Pairing Mistral Embed for indexing with a Mistral AI instruct or reasoning model for generation is a well-supported pattern.

Is Mistral Embed available through AI Gateway?

Yes. Access Mistral Embed through AI Gateway; see AI Gateway docs for the embedding API surface and request shape.