Mistral Embed
Mistral Embed is Mistral AI's general-purpose text embedding model. It produces 1024-dimensional vectors and is designed for semantic search and retrieval tasks, scoring 55.26 on the MTEB retrieval benchmark.
import { embed } from 'ai';

const result = await embed({
  model: 'mistral/mistral-embed',
  value: 'Sunny day at the beach',
});

What To Consider When Choosing a Provider
Zero Data Retention
AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.

Authentication
AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
If your corpus is primarily source code rather than natural language, consider Codestral Embed, which was trained specifically on code and outperforms general embedding models on code retrieval benchmarks.
When to Use Mistral Embed
Best For
Semantic search:
Retrieval over natural-language document collections where lexical search falls short
RAG pipelines:
Pair Mistral Embed for indexing and retrieval with Mistral AI generation models for answers
Document similarity and clustering:
Grouping and deduplicating content for organization or analytics
Recommendation systems:
Recommender architectures based on textual content similarity
Multilingual retrieval:
Covering European languages supported by the Mistral AI ecosystem
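All of the retrieval use cases above reduce to the same mechanic: nearest-neighbor search over embedding vectors, usually ranked by cosine similarity. Here is a minimal sketch of that ranking step. The vectors are illustrative 3-dimensional stand-ins, not real Mistral Embed output (which is 1024-dimensional); in practice you would populate them with vectors returned by the embedding call shown earlier.

```typescript
// Cosine similarity between two embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank documents by similarity to a query embedding and keep the top k.
function topK(
  query: number[],
  docs: { id: string; embedding: number[] }[],
  k: number,
): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Illustrative toy vectors; real Mistral Embed vectors have 1024 dimensions.
const docs = [
  { id: 'beach', embedding: [0.9, 0.1, 0.0] },
  { id: 'ski', embedding: [0.0, 0.2, 0.9] },
];
const query = [0.8, 0.2, 0.1]; // e.g. the embedding of "sunny day at the beach"
console.log(topK(query, docs, 1)[0].id); // 'beach'
```

For small corpora a linear scan like this is fine; beyond a few hundred thousand vectors you would typically move the same cosine ranking into a vector database or an approximate nearest-neighbor index.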
Consider Alternatives When
Source code corpus:
Use Codestral Embed, which is specialized for code
Variable-dimension embeddings:
Mistral Embed outputs fixed 1024-dimensional vectors; if you need configurable dimensions to trade retrieval quality for storage cost, choose a model that supports them
Domain-specific retrieval:
Highly specialized text may benefit from fine-tuned embeddings
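The fixed 1024-dimensional output also makes storage planning straightforward. A back-of-envelope calculation, assuming uncompressed float32 storage (4 bytes per dimension) and ignoring index overhead:

```typescript
// Bytes needed to store n embeddings of `dim` float32 dimensions each.
function embeddingStorageBytes(n: number, dim: number = 1024): number {
  return n * dim * 4;
}

// One million Mistral Embed vectors:
const bytes = embeddingStorageBytes(1_000_000);
console.log((bytes / 1024 ** 3).toFixed(2), 'GiB'); // ≈ 3.81 GiB
```

If that footprint matters for your workload, quantization or a model with truncatable dimensions can reduce it, at some cost in retrieval quality.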
Conclusion
Mistral Embed is a general-purpose retrieval foundation for Mistral-based stacks. Its 1024-dimensional representations and MTEB-evaluated quality make it a solid choice for teams building semantic search and RAG systems that want to keep their provider footprint within the Mistral AI ecosystem.
FAQ
How many dimensions do Mistral Embed vectors have?
1024 dimensions per embedding vector.
What does Mistral Embed score on MTEB?
55.26 on retrieval tasks within the Massive Text Embedding Benchmark.
What tasks is Mistral Embed optimized for?
Retrieval. The embedding space supports accurate nearest-neighbor search for semantic retrieval tasks.
Does Mistral Embed support languages other than English?
Yes. Mistral's models broadly support European languages; check Mistral AI's documentation for specific language coverage in the embedding model.
How does Mistral Embed differ from Codestral Embed?
Mistral Embed is a general-purpose text embedding model. Codestral Embed was trained specifically for code and outperforms general models on code retrieval benchmarks. Use Codestral Embed when your corpus is source code.
Can I pair Mistral Embed with a Mistral generation model for RAG?
Yes. Pairing Mistral Embed for indexing with a Mistral AI instruct or reasoning model for generation is a well-supported pattern.
Is Mistral Embed available through AI Gateway?
Yes. Access Mistral Embed through AI Gateway; see the AI Gateway docs for the embedding API surface and request shape.