Skip to content
Dashboard

Titan Text Embeddings V2

Titan Text Embeddings V2 is Amazon's text embedding model tuned for retrieval-augmented generation (RAG). Vectors only; no output token charge. You can choose 256-, 512-, or 1024-dimensional output vectors, with support for 100+ languages.

index.ts
import { embed } from 'ai';
const result = await embed({
model: 'amazon/titan-embed-text-v2',
value: 'Sunny day at the beach',
})

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
ZDR
No Training
Release Date
Amazon Bedrock
$0.02/M——
04/01/2024

More models by Amazon

Model
Context
Latency
Throughput
Input
Output
Cache
Web Search
Per Query
Capabilities
Providers
ZDR
No Training
Release Date
300K
0.3s
$0.06/M$0.24/M——
bedrock logo
12/03/2024
300K
0.3s
122tps
$0.80/M$3.20/M——
bedrock logo
12/03/2024
1M
0.4s
225tps
$0.30/M$2.50/M
Read:$0.07/M
Write:—
——
bedrock logo
12/01/2024
128K
0.3s
$0.04/M$0.14/M——
bedrock logo

About Titan Text Embeddings V2

Titan Text Embeddings V2 is a Bedrock embedding model for enterprise retrieval: RAG, semantic search, multilingual indexes, and general text or code chunks. It accepts a maximum of 8,192 tokens per input (about 50,000 English characters as a rough guide) and returns a dense vector for cosine or similar similarity search.

You choose 256-, 512-, or 1024-dimensional output. Titan Text Embeddings V2 reports that 512-dimensional vectors keep about 99% of the accuracy of 1024-dimensional vectors, and 256-dimensional vectors keep about 97%, which trims storage and often latency on large indexes.

V2 adds improved unit-vector normalization options for similarity scoring. Titan Text Embeddings V2 was pre-trained on 100+ languages and on code, so one index can cover mixed-language corpora when your evaluation says recall stays high enough.

What To Consider When Choosing a Provider

  • Configuration: Select your output dimension before deploying to production. Once vectors are stored in your vector database at a given dimension, changing the size requires reindexing your entire corpus.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

When to Use Titan Text Embeddings V2

Best For

  • Cost-efficient RAG pipelines: High-accuracy embedding model with flexible vector dimensions
  • Storage-constrained knowledge bases: The 512-dimension option retains about 99% accuracy at lower storage cost
  • Multilingual document collections: A single embedding model covers 100+ languages without separate per-language models
  • Bedrock-integrated search: Semantic search and document reranking workflows integrated with Amazon Bedrock
  • General-purpose code search: Code search and classification tasks that benefit from a dense embedding rather than a code-specific model

Consider Alternatives When

  • Multimodal embedding requirements: Amazon Nova Multimodal Embeddings may be more appropriate for text, image, and video in the same vector space
  • Code-specific retrieval: A model tuned specifically for code retrieval outperforms general-purpose text embeddings
  • Longer input token limit: English-only retrieval corpora may benefit from an embedding model with a larger token input capacity

Conclusion

Titan Text Embeddings V2 lets you trade storage and latency against accuracy by picking vector width, with normalization tuned for typical RAG similarity. It fits Bedrock-centric RAG stacks you already run through AI Gateway.