
Text Embedding 005

Text Embedding 005 is an English-language text embedding model that averages 66.31% on the Massive Text Embedding Benchmark (MTEB) at 768 dimensions. It supports dynamic embedding sizes down to 256 dimensions, reducing storage and compute costs with a minor performance tradeoff.

index.ts

```ts
import { embed } from 'ai';

// result.embedding holds the embedding vector for the input text.
const result = await embed({
  model: 'google/text-embedding-005',
  value: 'Sunny day at the beach',
});
```

Frequently Asked Questions

  • What is the MTEB score for text-embedding-005?

    Text Embedding 005 scores 66.31% on average on MTEB at 768 dimensions. At 256 dimensions using dynamic sizing, it scores 64.37%. MTEB covers eight English NLP task categories including retrieval, reranking, clustering, classification, and semantic similarity.

  • What is dynamic embedding size and how does it work?

    The model uses Matryoshka Representation Learning to produce accurate embeddings at multiple dimension levels from a single inference pass, so users can request 256- or 768-dimension output. Because the smaller sizes are learned during training rather than produced by naive post-hoc truncation, more information is preserved per dimension at smaller sizes.
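    In practice, a Matryoshka-trained embedding's smaller size is obtained by keeping the leading dimensions and re-normalizing; the training objective is what makes those leading dimensions informative. A minimal sketch of that consumption pattern (`truncateEmbedding` is an illustrative helper, not part of any SDK; prefer the provider's native output-dimension option when one is exposed):

    ```ts
    // Keep the first `dims` values of a Matryoshka-style embedding and
    // re-normalize to unit length so cosine similarity stays meaningful.
    // Illustrative helper only; use a native output-dimension parameter
    // on the provider when available.
    function truncateEmbedding(embedding: number[], dims: number): number[] {
      const head = embedding.slice(0, dims);
      const norm = Math.hypot(...head);
      return head.map((v) => v / norm);
    }

    // Toy 4-dimension vector cut down to 2 dimensions; the result
    // is unit length: [0.6, 0.8].
    const small = truncateEmbedding([3, 4, 0.1, 0.1], 2);
    ```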

  • When should I choose 256 dimensions over 768 dimensions?

    Use 256 dimensions when vector storage costs at scale are a constraint and the approximately 2-point MTEB score difference is acceptable for your use case. Use 768 dimensions for maximum retrieval accuracy.
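    The storage tradeoff is easy to quantify: float32 vectors cost 4 bytes per dimension, so 256 dimensions need one third of the raw storage of 768. A back-of-the-envelope sketch (the 10-million-document corpus is illustrative, and real deployments add index overhead on top):

    ```ts
    // Raw float32 vector storage in bytes, ignoring vector-database
    // index overhead, which varies by engine.
    const BYTES_PER_FLOAT32 = 4;

    function storageBytes(numVectors: number, dims: number): number {
      return numVectors * dims * BYTES_PER_FLOAT32;
    }

    // Illustrative corpus of 10 million documents:
    const docs = 10_000_000;
    const at768 = storageBytes(docs, 768); // 30,720,000,000 bytes, about 28.6 GiB
    const at256 = storageBytes(docs, 256); // 10,240,000,000 bytes, about 9.5 GiB
    ```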

  • Does text-embedding-005 support multilingual content?

    No. This is an English-only model. For multilingual or non-English embedding tasks, use google/text-multilingual-embedding-002.

  • What tasks does this model perform well on?

    The model performs well on retrieval, reranking, clustering, classification, and semantic similarity across the eight MTEB task categories.
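    For the similarity-oriented tasks above, embedding vectors are usually compared with cosine similarity. A self-contained sketch (the short toy vectors stand in for real 768-dimension embeddings):

    ```ts
    // Cosine similarity between two equal-length embedding vectors:
    // dot(a, b) / (|a| * |b|). Values near 1 mean the texts that
    // produced the vectors are semantically similar.
    function cosineSimilarity(a: number[], b: number[]): number {
      let dot = 0, normA = 0, normB = 0;
      for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
      }
      return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Vectors pointing the same way score near 1; orthogonal ones near 0.
    const same = cosineSimilarity([1, 2, 3], [2, 4, 6]);
    const orthogonal = cosineSimilarity([1, 0], [0, 1]);
    ```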

  • What is the pricing for text-embedding-005?

    Check the pricing panel on this page for today's numbers. AI Gateway tracks rates across every provider that serves Text Embedding 005.

  • How does text-embedding-005 compare to larger embedding models?

    At 768 dimensions, the model competes well against MTEB entries of the same embedding size and models with significantly more parameters or higher dimensionality. This makes it an efficient choice for applications where inference cost and vector storage size matter.