Voyage Code 2

Voyage Code 2 is Voyage AI's code-specialized embedding model with a context window of 16K tokens. It improves code retrieval recall@5 by 14.52% over OpenAI text-embedding-3-large and is evaluated on Python, C++, Java, and documentation for major ML frameworks.

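index.ts
import { embed } from 'ai';

// Embed a single string with Voyage Code 2 through AI Gateway.
// The returned object exposes the vector as `result.embedding`.
const result = await embed({
  model: 'voyage/voyage-code-2',
  value: 'Sunny day at the beach',
});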

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider  | Input   | Output | Cache | Release Date
Voyage AI | $0.12/M | -      | -     | 01/01/2024
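
The listed input price translates into a simple cost estimate for embedding a corpus. The following is a minimal sketch of that arithmetic; the helper name and the corpus figures are hypothetical and chosen only for illustration.

```typescript
// Hypothetical helper: estimates embedding spend from a per-million-token input price.
function estimateEmbeddingCostUSD(totalTokens: number, pricePerMillionUSD: number): number {
  return (totalTokens / 1e6) * pricePerMillionUSD;
}

// Made-up corpus: 20,000 code chunks averaging 400 tokens each.
const corpusTokens = 20000 * 400; // 8,000,000 tokens
const estimatedCost = estimateEmbeddingCostUSD(corpusTokens, 0.12);

console.log(`About $${estimatedCost.toFixed(2)} to embed ${corpusTokens} tokens once`);
```

Query-time embeddings are billed at the same input rate, so a full estimate would add expected query volume to the one-time indexing cost.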

More models by Voyage AI

Model          | Context | Input   | Output | Cache | Providers | Release Date
(not captured) | 32K     | $0.06/M | -      | -     | Voyage AI | 01/15/2026
(not captured) | 32K     | $0.02/M | -      | -     | Voyage AI | 01/15/2026
(not captured) | 32K     | $0.12/M | -      | -     | Voyage AI | 01/15/2026
(not captured) | 32K     | $0.05/M | -      | -     | Voyage AI | 08/11/2025
(not captured) | 32K     | $0.02/M | -      | -     | Voyage AI | 08/11/2025
(not captured) | -       | $0.06/M | -      | -     | Voyage AI | 05/20/2025

About Voyage Code 2

Voyage Code 2 is Voyage AI's code-specialized embedding model, released January 1, 2024. It features a context window of 16K tokens and targets code retrieval, code completion, and code assistant applications. On code retrieval tasks across 11 datasets derived from HumanEval, APPS, MBPP, DS-1000, CodeChef, and LeetCode, Voyage Code 2 achieves a 14.52% improvement in recall@5 over OpenAI text-embedding-3-large.
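
For readers unfamiliar with the metric, recall@5 can be sketched as follows. This assumes one common definition (the share of a query's relevant items that appear in the top five results, averaged over queries); the exact protocol Voyage AI used is not described here, and the function names and sample IDs are hypothetical.

```typescript
// Recall@k for one query: fraction of relevant items found in the top k results.
// Assumes rankedIds contains no duplicates.
function recallAtK(rankedIds: string[], relevantIds: Set<string>, k: number): number {
  if (relevantIds.size === 0) return 0;
  const hits = rankedIds.slice(0, k).filter((id) => relevantIds.has(id)).length;
  return hits / relevantIds.size;
}

// Mean recall@k across a set of queries.
function meanRecallAtK(
  runs: { ranked: string[]; relevant: Set<string> }[],
  k: number,
): number {
  if (runs.length === 0) return 0;
  const total = runs.reduce((sum, run) => sum + recallAtK(run.ranked, run.relevant, k), 0);
  return total / runs.length;
}

// Made-up example: one of the two relevant snippets lands in the top five.
const ranked = ['f3', 'f7', 'f1', 'f9', 'f2', 'f4'];
const relevant = new Set(['f1', 'f4']);
console.log(recallAtK(ranked, relevant, 5)); // 0.5
```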

Voyage Code 2 also performs well on general-purpose text retrieval, exceeding OpenAI text-embedding-3-large by 3.03% and Cohere Embed v3 by 4.93%. You can use a single embedding model for both code and documentation retrieval rather than maintaining separate indices with different models.

Voyage AI evaluates it on Python, C++, and Java, plus documentation and usage patterns for Matplotlib, NumPy, Pandas, PyTorch, SciPy, scikit-learn, and TensorFlow. The model handles both natural language queries searching for code (text-to-code) and code snippets searching for similar code (code-to-code).
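
Because text-to-code and code-to-code queries are embedded into the same vector space, both reduce to the same nearest-neighbor lookup over one index. The sketch below illustrates that with cosine similarity over a mixed index of code and documentation. The types, function names, and three-dimensional toy vectors are hypothetical stand-ins; in practice the vectors would come from the model.

```typescript
type IndexEntry = { id: string; kind: 'code' | 'doc'; vector: number[] };

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank every entry against the query vector and keep the k best matches.
function topK(query: number[], index: IndexEntry[], k: number) {
  return index
    .map((entry) => ({ entry, score: cosineSimilarity(query, entry.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy index mixing source files and documentation (vectors are made up).
const mixedIndex: IndexEntry[] = [
  { id: 'parseJson.ts', kind: 'code', vector: [0.9, 0.1, 0.0] },
  { id: 'docs/json.md', kind: 'doc', vector: [0.8, 0.2, 0.1] },
  { id: 'renderChart.ts', kind: 'code', vector: [0.0, 0.2, 0.9] },
];

// The query vector could come from a natural language question or a code snippet.
for (const { entry, score } of topK([1, 0, 0], mixedIndex, 2)) {
  console.log(entry.kind, entry.id, score.toFixed(3));
}
```

A brute-force scan like this is only practical for small corpora; larger indices typically sit in a vector database that performs the same ranking approximately.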

What To Consider When Choosing a Provider

  • Use case: Voyage Code 2 targets code search. If you're embedding source code, function signatures, and documentation for retrieval, Voyage AI reports a 14.52% recall@5 gain over OpenAI text-embedding-3-large on code retrieval tasks.
  • Successor model: Voyage AI released voyage-code-3, which supports 300+ programming languages, a 32K context window, and Matryoshka dimensionality. Use voyage-code-3 for new deployments unless you need compatibility with existing Voyage Code 2 indices.
  • Mixed corpora: Despite its code focus, Voyage Code 2 outperforms OpenAI text-embedding-3-large and Cohere Embed v3 on general-purpose text retrieval. Use it for mixed code-and-documentation corpora without a second model.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
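
The compatibility point above matters because embeddings from different models live in different vector spaces, so a query embedded with one model cannot be meaningfully compared against an index built with another. One way to guard against accidental mixing is to store the model identifier with the index and check it at query time. This is a minimal sketch; the metadata shape, the function name, and the `voyage/voyage-code-3` slug are assumptions for illustration.

```typescript
// Hypothetical metadata stored alongside a vector index.
type IndexMeta = { model: string; dimensions: number };

// Throws if a query vector was produced by a different model or has the wrong size.
function assertQueryCompatible(meta: IndexMeta, queryModel: string, queryVector: number[]): void {
  if (meta.model !== queryModel) {
    throw new Error(
      `Index was built with ${meta.model}; re-embed the corpus before querying with ${queryModel}`,
    );
  }
  if (meta.dimensions !== queryVector.length) {
    throw new Error(`Expected ${meta.dimensions} dimensions, got ${queryVector.length}`);
  }
}

// Toy metadata with a made-up dimension count.
const existingIndex: IndexMeta = { model: 'voyage/voyage-code-2', dimensions: 4 };

assertQueryCompatible(existingIndex, 'voyage/voyage-code-2', [0.1, 0.2, 0.3, 0.4]); // passes

try {
  // Assumed slug for the successor model; mixing models should be rejected.
  assertQueryCompatible(existingIndex, 'voyage/voyage-code-3', [0.1, 0.2, 0.3, 0.4]);
} catch (error) {
  console.log((error as Error).message);
}
```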

When to Use Voyage Code 2

Best For

  • Code search engines: Retrieve relevant functions, classes, or modules from natural language queries
  • Code completion pipelines: Retrieve similar code patterns to ground retrieval-augmented generation
  • Developer documentation search: API references, library docs, and code examples
  • Mixed code and text retrieval: A single model handles both source code and natural language documentation
  • ML framework documentation: Retrieval for Python-centric data science and machine learning workflows

Consider Alternatives When

  • You need broader language coverage: voyage-code-3 supports 300+ programming languages, well beyond the Python, C++, and Java that Voyage Code 2 is evaluated on
  • You need a longer context window: voyage-code-3 offers 32K tokens versus Voyage Code 2's 16K tokens
  • Your workload is general-purpose text with no code: A general-purpose embedding model like voyage-3.5 fits better
  • You need Matryoshka dimensionality: voyage-code-3 supports 2048/1024/512/256 dimensions for flexible sizing
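
Matryoshka-style embeddings are trained so that a prefix of the vector remains a usable embedding, which lets you trade accuracy for storage. The sketch below shows the usual client-side approach of keeping the leading components and re-normalizing. The function name is hypothetical, the technique applies only to models trained this way (not to Voyage Code 2), and whether the provider can return shortened vectors directly is not covered here.

```typescript
// Keep the first `dimensions` components and rescale to unit length.
function truncateEmbedding(vector: number[], dimensions: number): number[] {
  if (!Number.isInteger(dimensions) || dimensions <= 0 || dimensions > vector.length) {
    throw new RangeError(`dimensions must be an integer between 1 and ${vector.length}`);
  }
  const head = vector.slice(0, dimensions);
  const norm = Math.sqrt(head.reduce((sum, value) => sum + value * value, 0));
  // A zero vector cannot be normalized; return it unchanged.
  return norm === 0 ? head : head.map((value) => value / norm);
}

// Toy 4-dimensional vector shortened to 2 dimensions (values are made up).
const shortened = truncateEmbedding([3, 4, 12, 84], 2);
console.log(shortened.length, shortened);
```

Queries and documents must be truncated to the same size before comparison, and all vectors in one index should share that size.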

Conclusion

Voyage Code 2 delivers a 14.52% code retrieval improvement over OpenAI text-embedding-3-large. If you have existing Voyage Code 2 indices, you can keep them and avoid a re-embed. For new deployments, use voyage-code-3 for its broader language coverage, longer context window, and Matryoshka dimensionality. Route requests through AI Gateway for unified access.