Titan Text Embeddings V2
Titan Text Embeddings V2 is Amazon's text embedding model tuned for retrieval-augmented generation (RAG). Vectors only; no output token charge. You can choose 256-, 512-, or 1024-dimensional output vectors, with support for 100+ languages.
```typescript
import { embed } from 'ai';

const result = await embed({
  model: 'amazon/titan-embed-text-v2',
  value: 'Sunny day at the beach',
});
```
About Titan Text Embeddings V2
Titan Text Embeddings V2 is a Bedrock embedding model for enterprise retrieval: RAG, semantic search, multilingual indexes, and general text or code chunks. It accepts up to 8,192 tokens per input (roughly 50,000 English characters as a rule of thumb) and returns a dense vector suited to cosine-similarity search.
You choose 256-, 512-, or 1024-dimensional output. Amazon reports that 512-dimensional vectors retain about 99% of the accuracy of 1024-dimensional vectors, and 256-dimensional vectors about 97%, which trims storage and often query latency on large indexes.
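As a rough sizing sketch of that trade-off (assuming float32 storage, which is typical for vector databases, and ignoring index overhead), halving the dimension halves raw vector storage:

```typescript
// Rough index-size estimate: float32 vectors, no index overhead included.
// The figures are illustrative arithmetic, not provider-quoted numbers.
function vectorStorageBytes(numVectors: number, dimensions: number): number {
  const BYTES_PER_FLOAT32 = 4;
  return numVectors * dimensions * BYTES_PER_FLOAT32;
}

const MILLION = 1_000_000;
for (const dim of [1024, 512, 256]) {
  const gib = vectorStorageBytes(10 * MILLION, dim) / 1024 ** 3;
  console.log(`${dim}-d, 10M vectors: ~${gib.toFixed(1)} GiB raw`);
}
```

Whether the smaller dimensions are acceptable depends on your own retrieval evaluations, not on the storage math alone.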
V2 adds improved unit-vector normalization options for similarity scoring. It was pre-trained on more than 100 languages and on code, so a single index can cover mixed-language corpora, provided your evaluations show recall stays high enough.
What To Consider When Choosing a Provider
- Configuration: Select your output dimension before deploying to production. Once vectors are stored in your vector database at a given dimension, changing the size requires reindexing your entire corpus.
- Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
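If you call Titan Text Embeddings V2 directly through Bedrock's InvokeModel API rather than through the `ai` SDK, the dimension and normalization choices travel in the request body. A minimal sketch, assuming the documented `inputText`, `dimensions`, and `normalize` fields of the Titan V2 request schema:

```typescript
// Sketch of a Bedrock InvokeModel request body for Titan Text Embeddings V2.
// Pin `dimensions` once and reuse it everywhere: an index cannot mix widths.
interface TitanEmbedRequest {
  inputText: string;
  dimensions?: 256 | 512 | 1024; // the model defaults to 1024 when omitted
  normalize?: boolean;           // unit-length output for cosine scoring
}

function buildTitanRequest(
  text: string,
  dimensions: 256 | 512 | 1024 = 1024,
): TitanEmbedRequest {
  return { inputText: text, dimensions, normalize: true };
}

const body = JSON.stringify(buildTitanRequest('Sunny day at the beach', 512));
```

Keeping the dimension in one shared constant or config value avoids accidentally writing mixed-width vectors into the same index.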
When to Use Titan Text Embeddings V2
Best For
- Cost-efficient RAG pipelines: High-accuracy embedding model with flexible vector dimensions
- Storage-constrained knowledge bases: The 512-dimension option retains about 99% accuracy at lower storage cost
- Multilingual document collections: A single embedding model covers 100+ languages without separate per-language models
- Bedrock-integrated search: Semantic search and document reranking workflows integrated with Amazon Bedrock
- General-purpose code search: Code search and classification tasks that benefit from a dense embedding rather than a code-specific model
Consider Alternatives When
- Multimodal embedding requirements: Amazon Nova Multimodal Embeddings may be more appropriate for text, image, and video in the same vector space
- Code-specific retrieval: A model tuned specifically for code retrieval outperforms general-purpose text embeddings
- Longer input token limit: English-only retrieval corpora may benefit from an embedding model with a larger token input capacity
Conclusion
Titan Text Embeddings V2 lets you trade storage and latency against accuracy by picking vector width, with normalization tuned for typical RAG similarity. It fits Bedrock-centric RAG stacks you already run through AI Gateway.
Frequently Asked Questions
What embedding dimensions are available?
256, 512, or 1024. The default is 1024. Smaller dimensions reduce storage and index size. The 512-d option retains roughly 99% accuracy versus 1024-d, and 256-d retains roughly 97%.
What is the maximum input length?
Up to 8,192 tokens or roughly 50,000 characters per input string. For retrieval tasks, segment documents into logical paragraphs or sections rather than using the full token budget per chunk.
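The paragraph-level chunking suggested above can be sketched as follows. The character budget stands in for a token budget (roughly 4 English characters per token is a crude heuristic, not a tokenizer):

```typescript
// Split a document into paragraph-aligned chunks under a character budget.
// Paragraph boundaries are blank lines; chunks never split a paragraph.
function chunkByParagraph(text: string, maxChars = 2000): string[] {
  const paragraphs = text
    .split(/\n{2,}/)
    .map((p) => p.trim())
    .filter(Boolean);

  const chunks: string[] = [];
  let current = '';
  for (const p of paragraphs) {
    if (current && current.length + p.length + 2 > maxChars) {
      chunks.push(current); // budget exceeded: close the current chunk
      current = p;
    } else {
      current = current ? `${current}\n\n${p}` : p;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

A single paragraph longer than the budget still becomes its own oversized chunk here; a production splitter would subdivide it by sentence.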
Does the model support multilingual inputs?
Yes. Titan Text Embeddings V2 supports 100+ languages, including Arabic, Chinese, French, German, Hindi, Japanese, Korean, Russian, and Spanish. You can mix languages in one index when your retrieval evaluations stay strong enough.
Should I normalize embeddings for RAG use cases?
Yes. Normalization improves cosine similarity accuracy when comparing query and document embeddings.
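With unit-length vectors, cosine similarity reduces to a plain dot product, which is why normalized output is convenient for RAG scoring. A minimal sketch with illustrative helpers (not part of any SDK):

```typescript
// Scale a vector to unit length so cosine similarity is just a dot product.
function normalize(v: number[]): number[] {
  const norm = Math.hypot(...v);
  return v.map((x) => x / norm);
}

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// For unit vectors a and b: cosine(a, b) === dot(a, b).
const score = dot(normalize([3, 4]), normalize([4, 3]));
```

If the model already returns normalized vectors, the `normalize` step is redundant but harmless; if it returns raw vectors, skipping it makes cosine and dot-product scores diverge.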
Can I mix vector dimensions in the same index?
No. All vectors in a given index must share the same dimension. Choose your dimension before initial indexing, as changing it requires reprocessing and reindexing all stored vectors.
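Because an index cannot mix widths, it is worth guarding writes with a dimension check. A hypothetical sketch (the constant and function names are illustrative):

```typescript
// Reject vectors whose width does not match the index's configured dimension.
const INDEX_DIMENSIONS = 512; // chosen once, before initial indexing

function assertDimension(vector: number[]): void {
  if (vector.length !== INDEX_DIMENSIONS) {
    throw new Error(
      `Expected ${INDEX_DIMENSIONS}-d vector, got ${vector.length}-d; ` +
        'changing dimensions requires reindexing the corpus.',
    );
  }
}
```

Failing fast at write time is cheaper than discovering a mixed-width index through degraded or erroring similarity queries later.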
How is the model priced?
This page lists the current rates. Multiple providers can serve Titan Text Embeddings V2, so AI Gateway surfaces live pricing rather than a single fixed figure.
What use cases is V2 optimized for compared to V1?
V2 is optimized for RAG, multilingual retrieval, and code embedding. Its 1024-dimension default, versus V1's fixed 1536 dimensions, cuts vector storage while retaining strong retrieval accuracy at the tested dimensions.