Text Embedding 005
Text Embedding 005 is an English-language text embedding model with a 66.31% average Massive Text Embedding Benchmark (MTEB) score at 768 dimensions, supporting dynamic embedding sizes down to 256 dimensions to reduce storage and compute costs with minor performance tradeoffs.
import { embed } from 'ai';

const result = await embed({
  model: 'google/text-embedding-005',
  value: 'Sunny day at the beach',
});

About Text Embedding 005
Text Embedding 005 is Google's English-language text embedding model built on the Gecko architecture, which uses knowledge distillation from large language models (LLMs) to achieve competitive downstream task performance at a compact embedding size. At its full 768-dimension output, the model scores 66.31% on the MTEB benchmark, a standard evaluation suite covering eight categories including retrieval, reranking, clustering, classification, and semantic similarity.
This means Text Embedding 005 delivers competitive retrieval and similarity quality without requiring high-dimensional vector indices.
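As a rough illustration of how embeddings like these are typically used for retrieval and similarity, the sketch below ranks documents by cosine similarity to a query vector. The helper names (`cosineSimilarity`, `rankBySimilarity`) and the toy 3-dimensional vectors are assumptions made for this example. Real vectors from the model would have 768 dimensions, or fewer if a reduced size is requested.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have equal length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank documents by similarity to the query, highest score first.
function rankBySimilarity(
  query: number[],
  docs: { id: string; embedding: number[] }[],
): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.embedding) }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors stand in for real embeddings (hypothetical values).
const ranked = rankBySimilarity([1, 0, 0], [
  { id: 'far', embedding: [0, 1, 0] },
  { id: 'near', embedding: [0.9, 0.1, 0] },
]);
```

In practice the query and document vectors would come from `embed` calls, and a vector database would usually replace the in-memory sort.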
Dynamic embedding sizes are supported through Matryoshka Representation Learning (MRL), which trains the model to produce accurate representations at multiple dimension levels from a single pass. At 256 dimensions, it scores 64.37% on MTEB, a drop of 1.94 percentage points from the 768-dimension score, which may be an acceptable tradeoff when vector storage costs at scale are significant. This flexibility is built into the model through training, rather than applied afterward as post-hoc dimensionality reduction.
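To make the storage tradeoff concrete, here is a back-of-the-envelope sketch comparing raw vector storage at 768 and 256 dimensions. It assumes each value is stored as a 4-byte float32 and ignores index overhead and metadata. The corpus size of 10 million vectors and the helper name `indexSizeBytes` are hypothetical, chosen only for illustration.

```typescript
// Assumption: each embedding value is stored as a 4-byte float32.
const BYTES_PER_FLOAT32 = 4;

// Raw storage for the vectors alone (no index overhead or metadata).
function indexSizeBytes(vectorCount: number, dimensions: number): number {
  return vectorCount * dimensions * BYTES_PER_FLOAT32;
}

const vectorCount = 10_000_000; // hypothetical corpus size

const fullBytes = indexSizeBytes(vectorCount, 768); // 30,720,000,000 bytes
const reducedBytes = indexSizeBytes(vectorCount, 256); // 10,240,000,000 bytes

// Fraction of raw vector storage saved by using 256 dimensions.
const savedFraction = 1 - reducedBytes / fullBytes; // two thirds

// MTEB cost of the reduction, using the scores quoted above.
const mtebDrop = 66.31 - 64.37; // about 1.94 percentage points

console.log({ fullBytes, reducedBytes, savedFraction, mtebDrop });
```

Under these assumptions, the 256-dimension setting cuts raw vector storage by two thirds for a drop of about 1.94 MTEB points. Whether that is worthwhile depends on the workload.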