Mistral Embed
Mistral Embed is Mistral AI's general-purpose text embedding model. It produces 1024-dimensional vectors designed for semantic search and retrieval tasks, and scores 55.26 on the MTEB benchmark.
```typescript
import { embed } from 'ai';

const result = await embed({
  model: 'mistral/mistral-embed',
  value: 'Sunny day at the beach',
});
```

About Mistral Embed
Mistral Embed launched alongside La Plateforme as Mistral AI's retrieval-focused embedding endpoint. Mistral Embed produces 1024-dimensional vector representations and scores 55.26 on the Massive Text Embedding Benchmark (MTEB), a standard evaluation suite for embedding model quality.
The embedding space preserves semantic similarity for nearest-neighbor retrieval. Documents with similar meaning cluster closely, while semantically distinct texts land farther apart in the vector space.
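The nearest-neighbor behavior described above is typically measured with cosine similarity. A minimal sketch, using toy 4-dimensional vectors as stand-ins for real 1024-dimensional Mistral Embed output (the vectors and document names here are illustrative assumptions):

```typescript
// Cosine similarity: documents whose embedding vectors point in similar
// directions are semantically close; unrelated texts score near zero.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors standing in for real embeddings (assumption).
const query = [0.9, 0.1, 0.0, 0.1];    // e.g. "day at the seaside"
const beachDoc = [0.8, 0.2, 0.1, 0.0]; // similar meaning → high score
const taxDoc = [0.0, 0.1, 0.9, 0.4];   // unrelated topic → low score

const simBeach = cosineSimilarity(query, beachDoc);
const simTax = cosineSimilarity(query, taxDoc);
```

With real embeddings the same ranking logic applies; only the dimensionality (1024) and the source of the vectors change.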
Mistral Embed integrates into retrieval-augmented generation (RAG) architectures where a Mistral AI generation model handles question answering and Mistral Embed indexes the knowledge base. Using the same provider ecosystem for both embedding and generation simplifies the stack and keeps provider management consolidated through AI Gateway.
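The retrieval half of that RAG pattern can be sketched as an in-memory top-k search over pre-embedded chunks. The chunk texts and toy 3-dimensional vectors below are illustrative assumptions; in practice each vector would come from Mistral Embed and have 1024 dimensions:

```typescript
// Minimal top-k retrieval over an in-memory index of embedded chunks.
// The selected chunk texts would be concatenated into the prompt for a
// Mistral AI generation model.
type Chunk = { text: string; vector: number[] };

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

function retrieveTopK(queryVec: number[], index: Chunk[], k: number): Chunk[] {
  return [...index]
    .sort((a, b) => cosine(queryVec, b.vector) - cosine(queryVec, a.vector))
    .slice(0, k);
}

// Toy knowledge base (vectors are stand-ins for real embeddings).
const index: Chunk[] = [
  { text: 'Refund policy: 30 days', vector: [0.9, 0.1, 0.0] },
  { text: 'Shipping times: 3-5 days', vector: [0.1, 0.9, 0.1] },
  { text: 'Returns require a receipt', vector: [0.8, 0.2, 0.1] },
];

const queryVec = [0.95, 0.05, 0.0]; // embedding of e.g. "how do refunds work?"
const context = retrieveTopK(queryVec, index, 2).map((c) => c.text);
```

Production systems usually swap the linear scan for a vector database, but the ranking principle is identical.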
Providers
Requests for this model route across multiple providers through AI Gateway; copy a provider slug to set your preference (see the docs for details). Using a provider means you agree to its terms, listed under Legal. Provider listings report P50 throughput on live AI Gateway traffic in tokens per second (TPS) and P50 time to first token (TTFT) in milliseconds.
What To Consider When Choosing a Provider
- Corpus type: If your corpus is primarily source code rather than natural language, consider Codestral Embed, which was trained specifically on code and outperforms general embedding models on code retrieval benchmarks.
- Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
When to Use Mistral Embed
Best For
- Semantic search: Retrieval over natural-language document collections where lexical search falls short
- RAG pipelines: Pair embedding with Mistral AI generation models
- Document similarity and clustering: Grouping and deduplicating content for organization or analytics
- Recommendation systems: Recommender architectures based on textual content similarity
- Multilingual retrieval: Covering European languages supported by the Mistral AI ecosystem
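The deduplication use case above reduces to a similarity threshold over embedding vectors. A greedy sketch, again with toy vectors standing in for real Mistral Embed output (the threshold value 0.95 is an illustrative assumption to tune per corpus):

```typescript
// Greedy near-duplicate removal: keep a document only if its embedding is
// not within `threshold` cosine similarity of an already-kept document.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

function dedupe(vectors: number[][], threshold = 0.95): number[] {
  const kept: number[] = [];
  for (let i = 0; i < vectors.length; i++) {
    const isDup = kept.some((j) => cosine(vectors[i], vectors[j]) > threshold);
    if (!isDup) kept.push(i);
  }
  return kept; // indices of documents to keep
}

// Toy vectors (stand-ins for real 1024-d embeddings).
const docs = [
  [1.0, 0.0, 0.0],   // original
  [0.99, 0.05, 0.0], // near-duplicate of the first → dropped
  [0.0, 1.0, 0.0],   // distinct → kept
];
const unique = dedupe(docs);
```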
Consider Alternatives When
- Source code corpus: Use Codestral Embed, which is specialized for code
- Variable-dimension embeddings: You need configurable output dimensions to trade retrieval quality for storage cost; Mistral Embed outputs a fixed 1024 dimensions
- Domain-specific retrieval: Highly specialized text may benefit from fine-tuned embeddings
Conclusion
Mistral Embed is a general-purpose retrieval foundation for Mistral-based stacks. Its 1024-dimensional representations and benchmarked MTEB quality make it a solid choice for teams building semantic search and RAG systems that want to keep their provider footprint within the Mistral AI ecosystem.
Frequently Asked Questions
What are the embedding dimensions for Mistral Embed?
1024 dimensions per embedding vector.
What is the MTEB score for Mistral Embed?
55.26 on retrieval tasks within the Massive Text Embedding Benchmark.
What is Mistral Embed designed to optimize for?
Retrieval. The embedding space supports accurate nearest-neighbor search for semantic retrieval tasks.
Can I use Mistral Embed for non-English text?
Yes. Mistral's models broadly support European languages; check Mistral AI's documentation for specific language coverage in the embedding model.
How does Mistral Embed compare to Codestral Embed?
Mistral Embed is a general-purpose text embedding model. Codestral Embed was trained specifically for code and outperforms general models on code retrieval benchmarks. Use Codestral Embed when your corpus is source code.
Can I use Mistral Embed in a RAG pipeline with a Mistral AI generation model?
Yes. Pairing Mistral Embed for indexing with a Mistral AI instruct or reasoning model for generation is a well-supported pattern.
Is Mistral Embed available via the OpenAI Embeddings API format?
Yes. Access Mistral Embed through AI Gateway; see AI Gateway docs for the embedding API surface and request shape.
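Under the OpenAI Embeddings API shape, a request body would look roughly like the following; the exact endpoint path and response envelope depend on AI Gateway, so treat this fragment as an assumption and confirm against the AI Gateway docs:

```json
{
  "model": "mistral/mistral-embed",
  "input": ["Sunny day at the beach"]
}
```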