Qwen3 Embedding 4B

Qwen3 Embedding 4B is a mid-tier, 4-billion-parameter text embedding model. It produces 2560-dimensional vectors over a 32.8K-token context and is designed for multilingual semantic search and code retrieval, balancing quality against operational cost.

index.ts
import { embed } from 'ai';

// Embed a single string; the result's `embedding` field holds the vector.
const result = await embed({
  model: 'alibaba/qwen3-embedding-4b',
  value: 'Sunny day at the beach',
});
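
For semantic search, embedding vectors are typically compared by cosine similarity and the candidates sorted by score. The following is a minimal sketch of that ranking step, not part of the original page: `cosine` and `rankBySimilarity` are hypothetical helper names, and the toy 3-dimensional vectors stand in for the 2560-dimensional vectors the model would return in the `embedding` field.

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Assumption: treat similarity with a zero vector as 0 instead of NaN.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface Scored {
  text: string;
  score: number;
}

// Rank candidate documents against a query embedding, best match first.
function rankBySimilarity(
  query: number[],
  docs: { text: string; embedding: number[] }[],
): Scored[] {
  return docs
    .map((d) => ({ text: d.text, score: cosine(query, d.embedding) }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors (illustrative only, not real model output).
const ranked = rankBySimilarity(
  [1, 0, 0],
  [
    { text: 'Rainy night in the city', embedding: [0, 1, 0] },
    { text: 'Warm afternoon by the sea', embedding: [0.9, 0.1, 0] },
  ],
);
console.log(ranked[0].text);
```

In practice the query and document vectors would come from calls like the one in index.ts, with document embeddings computed once and stored so that only the query needs embedding at search time.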
Latency

P50 (median) time to first token (TTFT), in milliseconds, measured on live AI Gateway traffic.
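
To make the P50 figure concrete, here is a small sketch of how a 50th-percentile latency could be computed from raw samples. It assumes the nearest-rank percentile definition; the page does not say which method AI Gateway uses, and the sample values are made up for illustration.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('No samples');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b); // numeric sort, copy first
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Hypothetical latency samples in milliseconds (not real gateway data).
const latenciesMs = [120, 95, 310, 101, 88];
const p50 = percentile(latenciesMs, 50);
console.log(p50);
```

The median is less sensitive to outliers than the mean: the single 310 ms sample above does not move the P50, whereas it would pull an average noticeably upward.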