
Titan Text Embeddings V2

amazon/titan-embed-text-v2

Titan Text Embeddings V2 is Amazon's text embedding model tuned for retrieval-augmented generation (RAG). It returns embedding vectors only, so there is no output token charge. You can choose 256-, 512-, or 1024-dimensional output vectors, with support for 100+ languages.

index.ts
import { embed } from 'ai';

const { embedding } = await embed({
  model: 'amazon/titan-embed-text-v2',
  value: 'Sunny day at the beach',
});

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure it, see the AI Gateway documentation.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

Select your output dimension before deploying to production. Once vectors are stored in your vector database at a given dimension, changing the size requires reindexing your entire corpus.
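The storage impact of the dimension choice is easy to estimate. Assuming float32 vectors (4 bytes per component) and ignoring index overhead, a quick sketch compares raw storage at each supported width; the 1M-chunk corpus size is an illustrative assumption.

```typescript
// Estimate raw vector storage for a corpus at each supported dimension,
// assuming float32 (4 bytes per component) before any index overhead.
const BYTES_PER_FLOAT32 = 4;

function vectorStorageBytes(numVectors: number, dimension: number): number {
  return numVectors * dimension * BYTES_PER_FLOAT32;
}

const corpusSize = 1_000_000; // hypothetical corpus of 1M chunks
for (const dim of [256, 512, 1024]) {
  const gib = vectorStorageBytes(corpusSize, dim) / 1024 ** 3;
  console.log(`${dim}-d: ${gib.toFixed(2)} GiB`);
}
```

Halving the dimension halves raw vector storage, which is why the roughly 99%-accuracy 512-d option is attractive for storage-constrained indexes.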

When to Use Titan Text Embeddings V2

Best For

  • Cost-efficient RAG pipelines:

    High-accuracy embedding model with flexible vector dimensions

  • Storage-constrained knowledge bases:

    The 512-dimension option retains about 99% accuracy at lower storage cost

  • Multilingual document collections:

    A single embedding model covers 100+ languages without separate per-language models

  • Bedrock-integrated search:

    Semantic search and document reranking workflows integrated with Amazon Bedrock

  • General-purpose code search:

    Code search and classification tasks that benefit from a dense embedding rather than a code-specific model

Consider Alternatives When

  • Multimodal embedding requirements:

    Amazon Nova Multimodal Embeddings may be more appropriate for text, image, and video in the same vector space

  • Code-specific retrieval:

    A model tuned specifically for code retrieval outperforms general-purpose text embeddings

  • Longer input token limit:

    English-only retrieval corpora may benefit from an embedding model with a larger token input capacity

Conclusion

Titan Text Embeddings V2 lets you trade storage and latency against accuracy by picking vector width, with normalization tuned for typical RAG similarity. It fits Bedrock-centric RAG stacks you already run through AI Gateway.

FAQ

What output dimensions does Titan Text Embeddings V2 support?

256, 512, or 1024. The default is 1024. Smaller dimensions reduce storage and index size. The 512-d option retains roughly 99% accuracy versus 1024-d, and 256-d retains roughly 97%.

What is the maximum input length?

Up to 8,192 tokens or roughly 50,000 characters per input string. For retrieval tasks, segment documents into logical paragraphs or sections rather than using the full token budget per chunk.
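The paragraph-level segmentation described above can be sketched as follows; the blank-line heuristic and character budget are illustrative assumptions, not Titan requirements.

```typescript
// Split a document into paragraph-level chunks for embedding, rather than
// packing each request up to the 8,192-token limit. Splitting on blank
// lines is one simple heuristic; real pipelines often add overlap.
function chunkByParagraph(text: string, maxChars = 2000): string[] {
  const paragraphs = text.split(/\n\s*\n/).map(p => p.trim()).filter(Boolean);
  const chunks: string[] = [];
  let current = '';
  for (const p of paragraphs) {
    // Start a new chunk when adding this paragraph would exceed the budget.
    if (current && current.length + p.length + 2 > maxChars) {
      chunks.push(current);
      current = p;
    } else {
      current = current ? `${current}\n\n${p}` : p;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

The resulting chunks can then be embedded in one batched call with `embedMany` from the `ai` package.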

Does it support multiple languages?

Yes. Titan Text Embeddings V2 supports 100+ languages, including Arabic, Chinese, French, German, Hindi, Japanese, Korean, Russian, and Spanish. You can mix languages in a single index, provided your retrieval evaluations confirm quality holds across them.

Should embeddings be normalized?

Yes. Normalization improves cosine similarity accuracy when comparing query and document embeddings.
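For unit-normalized vectors the denominator below is 1, so cosine similarity reduces to a plain dot product. A minimal implementation (the `ai` package also exports a `cosineSimilarity` helper you can use instead):

```typescript
// Cosine similarity between two embedding vectors of equal length.
// For unit-normalized vectors this equals their dot product.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```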

Can I mix different dimensions in one index?

No. All vectors in a given index must share the same dimension. Choose your dimension before initial indexing, as changing it requires reprocessing and reindexing all stored vectors.
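A cheap guard at ingestion time catches dimension mismatches before they corrupt an index. The checker below is a generic sketch, not tied to any particular vector database:

```typescript
// Reject any vector whose dimension differs from the one the index was
// created with, before attempting an upsert.
function assertDimension(vectors: number[][], expectedDim: number): void {
  for (const [i, v] of vectors.entries()) {
    if (v.length !== expectedDim) {
      throw new Error(`vector ${i} has dimension ${v.length}, expected ${expectedDim}`);
    }
  }
}
```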

How much does Titan Text Embeddings V2 cost?

This page lists the current rates. Multiple providers can serve Titan Text Embeddings V2, so AI Gateway surfaces live pricing rather than a single fixed figure.

How does V2 differ from Titan Text Embeddings V1?

V2 is optimized for RAG, multilingual retrieval, and code embedding. It uses a 1024-dimension default versus V1's 1536 dimensions, cutting vector storage by about a third while maintaining strong retrieval accuracy at the tested dimensions.