Voyage Code 3

Voyage Code 3 is Voyage AI's code-specialized embedding model with a 32,000-token context window, support for more than 300 programming languages, and Matryoshka embeddings that can be shortened to smaller output dimensions. It outperforms OpenAI text-embedding-3-large by an average of 13.80% across 32 code retrieval datasets.

index.ts

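import { embed } from 'ai';

// Request a single embedding from Voyage Code 3.
const result = await embed({
  model: 'voyage/voyage-code-3',
  value: 'Sunny day at the beach',
});

// result.embedding holds the vector as an array of numbers.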
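
Once snippets have been embedded, code retrieval usually comes down to ranking the stored vectors by their similarity to a query vector. The following is a minimal sketch of that step, assuming cosine similarity as the metric and an in-memory list of vectors. The `IndexedSnippet` type and the `rankSnippets` helper are hypothetical names used for illustration and are not part of the AI SDK.

```typescript
// Hypothetical shape for a stored snippet: a file path plus its embedding.
interface IndexedSnippet {
  path: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank snippets from most to least similar to the query embedding.
function rankSnippets(
  query: number[],
  snippets: IndexedSnippet[],
): { path: string; score: number }[] {
  return snippets
    .map((s) => ({ path: s.path, score: cosineSimilarity(query, s.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```

In practice the query vector and the snippet vectors would both come from the same model, and a vector database would likely replace the linear scan for large indexes.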


About Voyage Code 3

Voyage Code 3 is Voyage AI's code-specialized embedding model, released December 4, 2024. It supports a context window of 32,000 tokens and produces embeddings at four output dimensions: 2048, 1024, 512, and 256. Voyage AI trained it on trillions of tokens combining text, code, and mathematical content, plus real-world query-code pairs from GitHub repositories. It covers over 300 programming languages.
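
Matryoshka-style embeddings are generally designed so that the leading components of a vector remain useful on their own. A minimal sketch of shortening a vector client-side, assuming the common approach of keeping the first components and rescaling to unit length, might look like the following. The `truncateEmbedding` helper is a hypothetical name, and this example does not assume any particular API parameter for requesting a smaller dimension directly.

```typescript
// Keep the first `dimensions` components of an embedding and rescale the
// result to unit length so cosine and dot-product scores stay comparable.
function truncateEmbedding(embedding: number[], dimensions: number): number[] {
  if (
    !Number.isInteger(dimensions) ||
    dimensions <= 0 ||
    dimensions > embedding.length
  ) {
    throw new RangeError(
      `dimensions must be an integer between 1 and ${embedding.length}`,
    );
  }
  const head = embedding.slice(0, dimensions);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // Leave an all-zero vector untouched to avoid dividing by zero.
  return norm === 0 ? head : head.map((x) => x / norm);
}
```

For example, a stored 2048-dimensional vector could be passed through `truncateEmbedding(vector, 256)` before indexing, provided queries are shortened the same way.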

Across 32 code retrieval datasets, Voyage Code 3 outperforms OpenAI text-embedding-3-large by 13.80% and CodeSage-large by 16.81%. At 1024 dimensions, it retains 92.28% of its full-precision quality, compared to 77.64% for the OpenAI model at the same dimension. Because so little quality is lost, reducing the dimension is an effective way to lower cost or latency.

Thanks to quantization-aware training, the model can output embeddings in 32-bit float, int8, uint8, binary, and unsigned binary formats. Binary embeddings at 256 dimensions still outperform OpenAI text-embedding-3-large by 4.81% while using 1/384th the storage of 3072-dimensional float embeddings. These compression options make Voyage Code 3 practical for very large codebases where millions of files need indexing.
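
The 1/384 storage figure follows from simple arithmetic: a 3072-dimensional float vector needs 3072 × 32 bits, while a 256-dimensional binary vector needs 256 × 1 bits. The sketch below reproduces that calculation. The format keys and the `bytesPerVector` helper are illustrative names chosen for this example, and the figures cover raw vector data only, not index or metadata overhead.

```typescript
// Bits needed to store one dimension in each format (illustrative keys).
const BITS_PER_DIMENSION = {
  float32: 32,
  int8: 8,
  uint8: 8,
  binary: 1,
  ubinary: 1,
} as const;

type EmbeddingFormat = keyof typeof BITS_PER_DIMENSION;

// Raw storage for a single vector, in bytes.
function bytesPerVector(dimensions: number, format: EmbeddingFormat): number {
  return (dimensions * BITS_PER_DIMENSION[format]) / 8;
}

const floatBytes = bytesPerVector(3072, 'float32'); // 12288 bytes
const binaryBytes = bytesPerVector(256, 'binary'); // 32 bytes
const ratio = floatBytes / binaryBytes; // 384

console.log(`float32 @ 3072 dims: ${floatBytes} bytes per vector`);
console.log(`binary  @ 256 dims:  ${binaryBytes} bytes per vector`);
console.log(`storage ratio: ${ratio}x`);
```

At one million vectors, this works out to roughly 12.3 GB of raw float data versus 32 MB of binary data.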
