text-embedding-3-small

text-embedding-3-small delivers higher MTEB scores than ada-002 at lower cost, with a 1536-dimension default that drops into existing pipelines and a flexible dimensions parameter for further storage savings.

index.ts

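import { embed } from 'ai';

// Embed a single string through AI Gateway, addressing the model by its ID.
const result = await embed({
  model: 'openai/text-embedding-3-small',
  value: 'Sunny day at the beach',
});

// result.embedding holds the vector as an array of numbers.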


Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider	Input	Release Date	Legal
(name not captured)	$0.02/M	01/25/2024	Terms • Privacy
(name not captured)	$0.02/M	01/25/2024	Terms • Privacy

About text-embedding-3-small

OpenAI announced text-embedding-3-small on January 25, 2024, alongside its larger sibling, text-embedding-3-large. It costs less per token than ada-002 and scores 62.3% on MTEB (Massive Text Embedding Benchmark), 1.3 points higher than the model it replaces.

Multilingual retrieval improves substantially too. text-embedding-3-small reaches 44.0% on MIRACL versus ada-002's 31.4%, a 12.6-point gain that matters for any pipeline handling non-English or mixed-language content.

Like text-embedding-3-large, text-embedding-3-small is trained with a Matryoshka technique, which is what makes its dimensions parameter possible. The default 1536-dimension output matches ada-002, so schemas carry over, but you can request fewer dimensions when memory or storage costs are a concern. Semantic structure is front-loaded into the earlier dimensions, so shorter vectors still carry meaningful signal.
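To make the front-loading idea concrete, here is a minimal sketch of shortening a vector client-side: keep the leading components, then rescale to unit length so cosine and dot-product comparisons stay meaningful. The function name `truncateEmbedding` is hypothetical, and truncating after the fact is an assumption made for illustration. Requesting the smaller size through the dimensions parameter is the route this page describes.

```typescript
// Hypothetical helper: keep the first `dims` components of an embedding
// and rescale the result to unit (L2) length.
function truncateEmbedding(vector: number[], dims: number): number[] {
  if (!Number.isInteger(dims) || dims < 1 || dims > vector.length) {
    throw new RangeError(`dims must be an integer between 1 and ${vector.length}`);
  }
  const head = vector.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // A zero vector cannot be normalized, so it is returned unchanged.
  return norm === 0 ? head : head.map((x) => x / norm);
}
```

Queries and stored documents need to be shortened to the same length, since vectors of different sizes cannot be compared.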

text-embedding-3-small fits well in the query-time embedding path of Retrieval-Augmented Generation (RAG) architectures. Every user query must be embedded before retrieval, and at scale the per-query cost and latency compound. Low cost and fast inference make it a natural fit for that leg of the pipeline.
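The retrieval step that follows query embedding can be sketched as a cosine-similarity ranking over stored vectors. This is an illustrative in-memory version only: the `IndexedDoc` type and the `cosine` and `topK` helpers are hypothetical names, and a production pipeline would normally delegate this search to a vector database.

```typescript
// Hypothetical shape of a stored document: an ID plus its embedding.
type IndexedDoc = { id: string; embedding: number[] };

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every document against the query vector and return the k best matches.
function topK(
  query: number[],
  docs: IndexedDoc[],
  k: number,
): { id: string; score: number }[] {
  return docs
    .map((doc) => ({ id: doc.id, score: cosine(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

In this sketch the query vector would come from embedding the user's question with the same model that produced the stored document vectors.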

What To Consider When Choosing a Provider

Configuration: Because the default output is 1536 dimensions, identical to ada-002, you can swap models without touching your vector database schema or index configuration.
Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

When to Use text-embedding-3-small

Best For

Ada-002 drop-in replacement: No schema changes and immediate cost savings with the same 1536-dimension default
Real-time RAG pipelines: Query-time embedding where latency and cost per request matter
High-volume indexing: Document pipelines where the reduced cost yields significant infrastructure savings
Multilingual retrieval: Applications that benefit from the 12.6-point MIRACL improvement over ada-002
Budget-conscious projects: Embedding quality matters, but the application does not need the highest accuracy available
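For the high-volume indexing case, a rough cost estimate is simple arithmetic on the token count. The sketch below uses the $0.02/M input price listed under Providers on this page. The function name and the workload figures (2 million documents averaging 500 tokens each) are hypothetical, chosen only to show the calculation.

```typescript
// Input price listed under Providers on this page, in USD per million tokens.
const PRICE_PER_MILLION_TOKENS_USD = 0.02;

// Estimate the embedding cost for a given number of input tokens.
function estimateEmbeddingCostUSD(
  totalTokens: number,
  pricePerMillion: number = PRICE_PER_MILLION_TOKENS_USD,
): number {
  if (totalTokens < 0) {
    throw new RangeError('totalTokens must be non-negative');
  }
  return (totalTokens / 1e6) * pricePerMillion;
}

// Hypothetical workload: 2 million documents averaging 500 tokens each.
const indexingCostUSD = estimateEmbeddingCostUSD(2e6 * 500);
console.log(`Estimated indexing cost: $${indexingCostUSD.toFixed(2)}`);
```

Actual spend depends on how your content tokenizes and on the price in effect when you run the job.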

Consider Alternatives When

Retrieval accuracy bottleneck: The 2.3-point MTEB gap versus text-embedding-3-large is material for your use case
Heavily multilingual corpus: The larger model's 54.9% MIRACL score versus 44.0% would produce meaningfully better results
Maximum dimensionality: Specialized downstream models need the full 3072-dimension vectors from text-embedding-3-large
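Dimensionality also drives storage, which is worth weighing against the accuracy gap. The sketch below compares raw vector storage at 1536 and 3072 dimensions. It assumes each component is stored as a 4-byte float32 and ignores index overhead and metadata. The helper names and the 10-million-vector corpus are hypothetical.

```typescript
// Assumption: each vector component is stored as a 4-byte float32.
const BYTES_PER_FLOAT32 = 4;

// Raw bytes needed to store `vectorCount` vectors of `dims` components each.
function rawVectorBytes(vectorCount: number, dims: number): number {
  return vectorCount * dims * BYTES_PER_FLOAT32;
}

// Convert a byte count to gibibytes.
function toGiB(bytes: number): number {
  return bytes / (1024 * 1024 * 1024);
}

// Hypothetical corpus of 10 million vectors.
const corpusSize = 10_000_000;
const smallGiB = toGiB(rawVectorBytes(corpusSize, 1536));
const largeGiB = toGiB(rawVectorBytes(corpusSize, 3072));
console.log(`1536 dims: ${smallGiB.toFixed(1)} GiB, 3072 dims: ${largeGiB.toFixed(1)} GiB`);
```

Under these assumptions the 3072-dimension vectors take exactly twice the raw storage of the 1536-dimension default.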

Conclusion

text-embedding-3-small is the practical default embedding model for most applications on AI Gateway. It costs less than ada-002, performs better, and drops in without schema changes. Start here unless you have a specific reason to pay for the large variant's extra accuracy.

Frequently Asked Questions

Is text-embedding-3-small a direct replacement for ada-002?
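Yes, at the schema level. The default output is 1536 dimensions, the same as ada-002, so existing index configurations carry over unchanged. Vectors produced by the two models are not comparable with each other, however, so stored documents must be re-embedded with text-embedding-3-small before queries against them return sensible results. In exchange, you get a higher MTEB score and lower cost.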
How much does the multilingual retrieval improve over ada-002?
MIRACL scores go from 31.4% to 44.0%. For pipelines that handle queries or documents in multiple languages, this is a meaningful quality improvement that comes free with the model swap.
When does it make sense to pay for text-embedding-3-large instead?
When embedding accuracy is the bottleneck on your application's quality. Examples include legal search, scientific literature retrieval, and high-stakes recommendation systems, where the 2.3-point MTEB difference can translate to noticeably better results.
Can I reduce the vector dimensions below 1536?
Yes. The dimensions parameter accepts any value below the default. Matryoshka training ensures the truncated vectors retain useful semantic structure, which is helpful for reducing storage costs in large indexes.
What are typical latency characteristics?
This page shows live throughput and time-to-first-token metrics measured across real AI Gateway embedding traffic.
