Voyage Law 2
Voyage Law 2 is Voyage AI's legal-specialized embedding model, trained on one trillion legal tokens. It outperforms OpenAI text-embedding-3-large by 6% on average across eight legal datasets and achieves 84.44 NDCG@10 on long-context legal retrieval versus 68.40 for OpenAI text-embedding-3-large.
import { embed } from 'ai';

const result = await embed({
  model: 'voyage/voyage-law-2',
  value: 'Sunny day at the beach',
});

About Voyage Law 2
Voyage Law 2 is Voyage AI's legal-specialized embedding model, released March 1, 2024. Voyage AI trained it on one trillion high-quality legal tokens using specifically designed positive pairs and a contrastive learning algorithm. The model handles diverse legal content, including contracts, congressional bills, court cases, and statutes, across U.S., Chinese, German, and Indian jurisdictions.
Across eight legal retrieval datasets, Voyage Law 2 outperforms OpenAI text-embedding-3-large by 6% on average, with improvements exceeding 10% on LeCaRDv2, LegalQuAD, and GerDaLIR. On long-context legal retrieval, Voyage Law 2 achieves 84.44 NDCG@10 compared to 68.40 for OpenAI text-embedding-3-large. That is a gain of 16.04 points, or roughly a 23% relative improvement over the 68.40 baseline, reflecting the model's strength on lengthy legal documents.
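To make the numbers above concrete, here is a minimal sketch of how the relative improvement follows from the two NDCG@10 scores, together with one common formulation of NDCG@k. The function names are illustrative, and the linear-gain NDCG variant is an assumption: the source does not state which variant the benchmark uses. The ideal ranking here is derived from the judged list itself, which is a simplification.

```typescript
// Relative improvement of `improved` over `baseline`, as a percentage.
function relativeImprovement(baseline: number, improved: number): number {
  if (baseline <= 0) throw new Error('baseline must be positive');
  return ((improved - baseline) / baseline) * 100;
}

// Discounted cumulative gain over the top k results (linear gain; assumed variant).
// relevances[i] is the graded relevance of the result at rank i + 1.
function dcgAtK(relevances: number[], k: number): number {
  return relevances
    .slice(0, k)
    .reduce((sum, rel, i) => sum + rel / Math.log2(i + 2), 0);
}

// NDCG@k: DCG of the actual ranking divided by DCG of the ideal ordering.
// Simplification: the ideal ordering is built from the same judged list.
function ndcgAtK(relevances: number[], k: number): number {
  const ideal = [...relevances].sort((a, b) => b - a);
  const idealDcg = dcgAtK(ideal, k);
  return idealDcg === 0 ? 0 : dcgAtK(relevances, k) / idealDcg;
}

// NDCG@10 scores quoted in the text (on a 0-100 scale).
const baselineScore = 68.40; // OpenAI text-embedding-3-large
const voyageLaw2Score = 84.44; // Voyage Law 2

console.log((voyageLaw2Score - baselineScore).toFixed(2)); // "16.04"
console.log(relativeImprovement(baselineScore, voyageLaw2Score).toFixed(1)); // "23.5"

// Toy ranking: relevant documents at ranks 1 and 3, an irrelevant one at rank 2.
console.log(ndcgAtK([1, 0, 1], 3).toFixed(3)); // "0.920"
```

The toy ranking scores below 1 because a relevant document sits at rank 3 instead of rank 2; a ranking with every relevant document at the top scores exactly 1.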
Voyage AI intentionally mixed legal training data with data from the finance, technology, and intellectual property domains, so that Voyage Law 2 performs well on non-legal retrieval tasks while maintaining its legal specialization. Teams with mixed legal and business content can keep both in the same index without a separate general-purpose model for the non-legal documents.
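The single-index idea can be sketched as follows: embed the query and every document, legal or not, with the same embedding function, then rank by cosine similarity. Everything here is illustrative. `EmbedTexts`, `rankDocuments`, and `toyEmbed` are hypothetical names, and `toyEmbed` is a keyword-counting stand-in so the sketch runs offline; in practice the embedding function would wrap calls to `voyage/voyage-law-2`.

```typescript
// Hypothetical embedder signature: maps texts to vectors of equal length.
// In practice this would wrap an embedding call to 'voyage/voyage-law-2'.
type EmbedTexts = (texts: string[]) => Promise<number[][]>;

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Embed the query and all documents with the same embedder, then sort
// documents by descending similarity to the query.
async function rankDocuments(
  embedTexts: EmbedTexts,
  query: string,
  documents: string[],
): Promise<{ text: string; score: number }[]> {
  const [queryVector, ...docVectors] = await embedTexts([query, ...documents]);
  return documents
    .map((text, i) => ({ text, score: cosineSimilarity(queryVector, docVectors[i]) }))
    .sort((a, b) => b.score - a.score);
}

// Toy stand-in embedder (NOT a real model): one dimension per keyword.
const keywords = ['lease', 'revenue', 'patent'];
const toyEmbed: EmbedTexts = async (texts) =>
  texts.map((t) => keywords.map((w) => (t.toLowerCase().includes(w) ? 1 : 0)));

// One index holding both legal and business documents.
const mixedIndex = [
  'Commercial lease agreement for office space',
  'Quarterly revenue report for the sales team',
  'Patent license terms and royalty schedule',
];

rankDocuments(toyEmbed, 'patent infringement claim', mixedIndex).then((ranked) => {
  console.log(ranked[0].text); // "Patent license terms and royalty schedule"
});
```

Passing the embedder in as a parameter keeps the ranking logic independent of any particular SDK, so the toy embedder can be swapped for a real one without changing `rankDocuments`.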