Qwen3 Embedding 4B

Qwen3 Embedding 4B is a mid-tier, 4-billion-parameter text embedding model. It produces 2560-dimensional vectors over a 32.8K-token context and is designed for multilingual semantic search and code retrieval, balancing quality against operational cost.

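import { embed } from 'ai';

// Request an embedding for a single string.
const result = await embed({
  model: 'alibaba/qwen3-embedding-4b',
  value: 'Sunny day at the beach',
});

// result.embedding holds the vector as an array of numbers.
console.log(result.embedding.length);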
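
For semantic search, embeddings like the one above are typically compared with cosine similarity. The following is a minimal sketch, not part of the original page: the helper names (cosineSimilarity, rankBySimilarity) are hypothetical, and the 3-dimensional vectors are made-up stand-ins for the 2560-dimensional vectors the model would return, so the example runs without a network call.

```typescript
// Toy 3-dimensional vectors stand in for the model's 2560-dimensional
// embeddings; the values are invented purely for illustration.
type Doc = { text: string; embedding: number[] };

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return documents sorted from most to least similar to the query vector.
function rankBySimilarity(
  query: number[],
  docs: Doc[],
): Array<Doc & { score: number }> {
  return docs
    .map((doc) => ({ ...doc, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score);
}

const docs: Doc[] = [
  { text: 'Sunny day at the beach', embedding: [0.9, 0.1, 0.0] },
  { text: 'Quarterly tax filing deadlines', embedding: [0.0, 0.2, 0.9] },
  { text: 'Warm afternoon by the ocean', embedding: [0.8, 0.3, 0.1] },
];

// In real use this would be the embedding of the search query.
const queryEmbedding = [1.0, 0.2, 0.0];

const ranked = rankBySimilarity(queryEmbedding, docs);
console.log(ranked.map((d) => d.text));
```

In practice, the document and query vectors would come from embedding calls against the model rather than hard-coded arrays; the ranking step itself stays the same.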


Latency (24 hours)

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. See the docs for more information.
