Text Multilingual Embedding 002

Text Multilingual Embedding 002 is an 18-language text embedding model from Google that achieves a 56.2% average score on the MIRACL (Multilingual Information Retrieval Across a Continuum of Languages) benchmark. It is designed for cross-lingual semantic search and retrieval across diverse language corpora.

index.ts
import { embed } from 'ai';

const result = await embed({
  model: 'google/text-multilingual-embedding-002',
  value: 'Sunny day at the beach',
});

console.log(result.embedding);

About Text Multilingual Embedding 002

Text-multilingual-embedding-002 is Google's embedding model purpose-built for multilingual natural language processing (NLP) applications. Released alongside text-embedding-005 at Google Cloud Next '24, it uses the same Gecko architecture but targets cross-lingual coverage rather than maximum English-language benchmark performance. Its primary evaluation benchmark is MIRACL (Multilingual Information Retrieval Across a Continuum of Languages), which covers 18 languages; the model achieves a 56.2% average score on it.

The practical value lies in vector space alignment across languages. Rather than running separate monolingual models for each language in your corpus, text-multilingual-embedding-002 embeds content from all 18 supported languages into a shared semantic space. A query submitted in one language can surface relevant documents written in any other supported language, without a translation step. For global products, international content platforms, or multilingual knowledge bases, this shared embedding space eliminates the complexity of language detection and routing.

Like its English-only sibling, text-multilingual-embedding-002 supports dynamic embedding sizes through Matryoshka Representation Learning (MRL). You can choose smaller dimension outputs to reduce vector storage and compute costs, with a minor quality tradeoff. This flexibility matters for multilingual applications where the corpus may be significantly larger than a monolingual equivalent.
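The MRL property can be illustrated locally. The sketch below assumes you already hold a full-size embedding: an MRL-trained vector can be truncated to its first k dimensions and re-normalized, trading a small amount of quality for proportionally smaller storage. (The Vertex AI embeddings API also accepts an output-dimensionality parameter so the service returns the smaller vector directly; `truncateEmbedding` here is a hypothetical client-side helper.)

```typescript
// Truncate an MRL embedding to its first k dimensions and re-normalize
// to unit length, so cosine similarity remains meaningful.
function truncateEmbedding(vec: number[], k: number): number[] {
  const head = vec.slice(0, k);
  const norm = Math.hypot(...head);
  return head.map((x) => x / norm);
}

// e.g. shrink a 768-dim vector to 256 dims before indexing:
// const small = truncateEmbedding(fullVector, 256);
```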

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider: Google Vertex (Legal: Terms, Privacy)
Input price: $0.03/M tokens
Release date: 03/01/2024
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.


What To Consider When Choosing a Provider

  • Configuration: For multilingual retrieval applications, this model maps text from all supported languages into the same vector space. That enables cross-lingual queries: for example, a user querying in Japanese can retrieve documents written in Spanish without a query translation layer.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
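A minimal authentication setup might look like the following. This is a sketch under an assumption: the environment-variable name is the convention described in the AI Gateway docs, so verify it there before relying on it.

```shell
# Assumed env var name per AI Gateway docs; treat as an assumption.
export AI_GATEWAY_API_KEY="your-gateway-key"
# AI SDK calls made from this environment then authenticate through
# the gateway; no Google Cloud credentials are needed on your side.
```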

When to Use Text Multilingual Embedding 002

Best For

  • Multilingual semantic search: Applications serving users who query in different languages than the indexed content
  • Cross-lingual document retrieval: Knowledge base search across international content corpora
  • Global customer support: Systems where user questions and knowledge base articles span multiple languages
  • Multilingual clustering and classification: Tasks that need consistent semantic representations across languages
  • International content platforms: E-commerce or media indexing product descriptions or articles in multiple languages

Consider Alternatives When

  • English-only corpus: Your corpus and users are exclusively English-language (consider google/text-embedding-005 for higher MTEB scores)
  • Unsupported language needed: You require a language not covered by the model's 18-language evaluation set; verify support in the Vertex AI documentation
  • Peak English retrieval quality: Multilingual support is not required and maximum English performance is the primary criterion

Conclusion

Text-multilingual-embedding-002 solves the core infrastructure challenge of multilingual retrieval: maintaining a single vector index that serves queries and documents across 18 languages without translation layers or per-language model management. For global applications where your user base and content corpus span multiple languages, it provides the embedding foundation that makes cross-lingual semantic search tractable.

Frequently Asked Questions

  • How many languages does text-multilingual-embedding-002 support?

    The model is evaluated on MIRACL, which covers 18 languages. Text Multilingual Embedding 002 scores 56.2% on average on this benchmark. Consult the Vertex AI documentation for the complete list of supported languages.

  • What is MIRACL and how does it differ from MTEB?

MIRACL (Multilingual Information Retrieval Across a Continuum of Languages) is a multilingual retrieval benchmark covering 18 languages, used to evaluate cross-lingual information retrieval quality. MTEB (Massive Text Embedding Benchmark) is an English-focused benchmark covering eight task categories. The two models in this family are each evaluated on the benchmark most relevant to their design target.

  • Can users query in one language and retrieve results in another?

    Yes. This is the key capability of a shared multilingual embedding space. Text from all supported languages is mapped into the same vector space, so a query in Japanese and a matching document in Arabic will have similar vector representations, enabling cross-lingual retrieval without query translation.

  • Does this model support dynamic embedding sizes?

    Yes. Like text-embedding-005, it uses Matryoshka Representation Learning to support multiple output dimension sizes. Smaller dimensions reduce vector storage and compute costs with a minor quality tradeoff.

  • When should I use this model versus text-embedding-005?

    Use text-multilingual-embedding-002 whenever your application must handle content or queries in multiple languages. Use text-embedding-005 for strictly English-language applications where maximum MTEB benchmark performance is the priority.

  • What is the pricing for this model?

    Check the pricing panel on this page for today's numbers. AI Gateway tracks rates across every provider that serves Text Multilingual Embedding 002.

  • Does this model work for cross-lingual classification tasks?

    Yes. The shared vector space means that classifiers trained on labeled data in one language can classify documents in other supported languages, which is useful for content moderation, sentiment analysis, and topic categorization across multilingual corpora.
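One simple way to exploit that shared space is a nearest-centroid classifier, sketched below with hypothetical helpers (`centroid`, `classify`). The idea: compute per-label centroids from labeled embeddings in one language, then score documents from any supported language against the same centroids by cosine similarity.

```typescript
// Average a set of embedding vectors into one centroid per label.
function centroid(vecs: number[][]): number[] {
  const dim = vecs[0].length;
  const sum = new Array(dim).fill(0);
  for (const v of vecs) for (let i = 0; i < dim; i++) sum[i] += v[i];
  return sum.map((x) => x / vecs.length);
}

// Assign the label whose centroid is most cosine-similar to the vector.
function classify(vec: number[], centroids: Record<string, number[]>): string {
  let best = '';
  let bestScore = -Infinity;
  for (const [label, c] of Object.entries(centroids)) {
    let dot = 0, nv = 0, nc = 0;
    for (let i = 0; i < vec.length; i++) {
      dot += vec[i] * c[i];
      nv += vec[i] * vec[i];
      nc += c[i] * c[i];
    }
    const score = dot / Math.sqrt(nv * nc);
    if (score > bestScore) { bestScore = score; best = label; }
  }
  return best;
}
```

In practice the centroids would be built from embeddings of, say, English training examples, and `classify` would receive embeddings of Spanish or Japanese documents produced by the same model.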

  • Do I need to detect the language of input text before embedding it?

    No. The model handles all 18 supported languages from a single endpoint. Language detection and routing are not required: submit text in any supported language and the model produces an embedding in the shared multilingual vector space.