Voyage Law 2

Voyage Law 2 is Voyage AI's legal-specialized embedding model, trained on one trillion legal tokens. It outperforms OpenAI text-embedding-3-large by 6% across eight legal datasets and achieves 84.44 NDCG@10 on long-context legal retrieval versus 68.40 for text-embedding-3-large.

index.ts
import { embed } from 'ai';

// Embed a single string with Voyage Law 2.
const result = await embed({
  model: 'voyage/voyage-law-2',
  value: 'Sunny day at the beach',
});
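The NDCG@10 figures quoted above can be made concrete with a small sketch of how the metric and a relative improvement are computed. This is a minimal illustration, not Voyage AI's evaluation code: the function names are hypothetical, the gain is assumed to be linear in the relevance grade, and the ideal ranking is approximated by re-sorting the retrieved list rather than the full corpus.

```typescript
// Discounted cumulative gain over the top-k results.
// relevances[i] is the graded relevance of the result at rank i + 1.
// Assumption: linear gain (rel / log2(rank + 1)); some benchmarks use 2^rel - 1.
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

// NDCG@k: DCG of the actual ranking divided by DCG of the ideal ranking.
// Simplification: the ideal ranking is the same list sorted by relevance.
function ndcgAtK(relevances: number[], k: number): number {
  const ideal = [...relevances].sort((a, b) => b - a);
  const idealDcg = dcgAtK(ideal, k);
  return idealDcg === 0 ? 0 : dcgAtK(relevances, k) / idealDcg;
}

// Relative improvement of `score` over `baseline`, in percent.
function relativeImprovement(score: number, baseline: number): number {
  return ((score - baseline) / baseline) * 100;
}

// Hypothetical ranking: relevant documents at ranks 1 and 3 out of 4.
const exampleNdcg = ndcgAtK([1, 0, 1, 0], 10);

// The two long-context scores quoted on this page.
const gain = relativeImprovement(84.44, 68.40);
```

With the quoted scores, `gain` comes out at about 23.45, in line with the 23% relative improvement cited in the FAQ. The page's scores are presumably NDCG values scaled to a 0-100 range, whereas `ndcgAtK` returns a value between 0 and 1.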

Frequently Asked Questions

  • What legal document types does Voyage Law 2 handle?

    Contracts, congressional bills, court cases, and statutes. It is trained and evaluated across U.S., Chinese, German, and Indian legal content.

  • How does Voyage Law 2 perform on long legal documents?

    Voyage Law 2 achieves 84.44 NDCG@10 on long-context legal retrieval, compared to 68.40 for OpenAI text-embedding-3-large. That is a relative improvement of roughly 23%, which reflects its strength on the lengthy documents common in legal work.

  • Should I use Voyage Law 2 or voyage-3-large for legal retrieval?

    Voyage AI's voyage-3-large now outperforms domain-specific models on legal benchmarks. For mixed legal and non-legal content, voyage-3-large or voyage-3.5 may be the simpler choice. Use Voyage Law 2 if your workload is exclusively legal and you already have indices built with its embeddings, since switching models means re-embedding your corpus.

  • Does Voyage Law 2 work for non-legal content?

    Yes. Voyage AI intentionally trained Voyage Law 2 on mixed legal, financial, technical, and intellectual property content. It performs well on non-legal retrieval tasks while maintaining its legal specialization.

  • What jurisdictions does Voyage Law 2 cover?

    U.S., Chinese, German, and Indian legal systems. The model handles cross-jurisdictional retrieval within a single vector index.

  • How do I access Voyage Law 2 through Vercel AI Gateway?

    Add your Voyage AI API key in AI Gateway settings, then send embedding requests through AI Gateway. AI Gateway authenticates requests and records usage. You can combine Voyage Law 2 with other embedding models across providers.

  • How much legal training data was used for Voyage Law 2?

    One trillion high-quality legal tokens, combined with specially designed positive pairs and a contrastive learning algorithm. The training data spans contracts, legislation, and case law across multiple jurisdictions.
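The cross-jurisdictional retrieval mentioned in the FAQ can be pictured as a single index whose entries carry a jurisdiction tag, ranked by cosine similarity to the query embedding. The sketch below is illustrative only: the `IndexedDoc` shape, the `search` helper, and the three-dimensional toy vectors are assumptions standing in for real Voyage Law 2 embeddings and a real vector store.

```typescript
type Jurisdiction = 'US' | 'CN' | 'DE' | 'IN';

// Hypothetical index entry: one embedding per document, tagged by jurisdiction.
interface IndexedDoc {
  id: string;
  jurisdiction: Jurisdiction;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have equal length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Rank every document in the shared index, optionally restricted to one jurisdiction.
function search(
  index: IndexedDoc[],
  queryEmbedding: number[],
  topK: number,
  jurisdiction?: Jurisdiction,
): { id: string; score: number }[] {
  return index
    .filter((doc) => jurisdiction === undefined || doc.jurisdiction === jurisdiction)
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(queryEmbedding, doc.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Toy vectors; in practice these would come from embedding calls to the model.
const index: IndexedDoc[] = [
  { id: 'us-contract', jurisdiction: 'US', embedding: [0.9, 0.1, 0.0] },
  { id: 'de-statute', jurisdiction: 'DE', embedding: [0.8, 0.2, 0.1] },
  { id: 'in-case', jurisdiction: 'IN', embedding: [0.1, 0.9, 0.2] },
  { id: 'cn-bill', jurisdiction: 'CN', embedding: [0.0, 0.2, 0.9] },
];

const queryEmbedding = [1, 0, 0];
const acrossAll = search(index, queryEmbedding, 2);
const germanOnly = search(index, queryEmbedding, 2, 'DE');
```

Keeping the jurisdiction as metadata rather than splitting the corpus into separate indices lets the same query run across all jurisdictions or be narrowed to one without re-embedding anything.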