Qwen3 Embedding 8B is purpose-built for retrieval workloads where accuracy is paramount; at release it ranked first on the MTEB multilingual leaderboard with a score of 70.58. Its 4096-dimensional output space encodes fine-grained semantic distinctions that smaller embedding models flatten.
The architecture derives from the Qwen3 foundation model and employs 36 transformer layers. The resulting embeddings generalize across both in-domain and out-of-domain retrieval scenarios.
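As a concrete sketch of the retrieval workflow, the snippet below embeds a query and a few documents and scores them by cosine similarity. It assumes the checkpoint is published as `Qwen/Qwen3-Embedding-8B` on Hugging Face, loads via `sentence-transformers`, and accepts a `prompt_name="query"` instruction for the query side; treat those details as assumptions to verify against the model card.

```python
# Minimal retrieval sketch (assumes the "Qwen/Qwen3-Embedding-8B" checkpoint
# and sentence-transformers support; verify against the model card).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qwen/Qwen3-Embedding-8B")

queries = ["What is the capital of France?"]
documents = [
    "Paris is the capital and most populous city of France.",
    "The Eiffel Tower was completed in 1889.",
]

# Normalize explicitly so a dot product equals cosine similarity.
query_emb = model.encode(
    queries, prompt_name="query", normalize_embeddings=True
)  # prompt_name="query" is an assumed instruction hook for the query side
doc_emb = model.encode(documents, normalize_embeddings=True)

scores = query_emb @ doc_emb.T  # shape: (num_queries, num_documents)
print(scores)  # higher score = closer semantic match
```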
Coverage spans more than 100 natural languages plus multiple programming languages, enabling multilingual vector indexes in which documents in French, Japanese, or Python code can be retrieved by queries in any supported language. Matryoshka Representation Learning lets operators shorten vectors at inference time, which suits tiered index architectures where a coarse first pass retrieves over short vectors and a reranking stage uses the full-resolution representations (sketched below).
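The following sketch shows the tiered pattern with Matryoshka truncation: keep a prefix of each vector, re-normalize it, run a cheap first pass, then rerank the shortlist with the full 4096-dimensional vectors. The 256-dimension coarse tier and top-100 shortlist are illustrative choices, not recommendations from the model card, and random vectors stand in for real embeddings.

```python
# Tiered retrieval via Matryoshka truncation: truncated, re-normalized
# prefixes drive a coarse first pass; full vectors rerank the shortlist.
import numpy as np

def truncate(embs: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length."""
    prefix = embs[:, :dim]
    return prefix / np.linalg.norm(prefix, axis=1, keepdims=True)

# Stand-ins for real document/query embeddings (illustrative only).
rng = np.random.default_rng(0)
doc_emb = rng.standard_normal((10_000, 4096)).astype(np.float32)
doc_emb /= np.linalg.norm(doc_emb, axis=1, keepdims=True)
query_emb = doc_emb[:1] + 0.1 * rng.standard_normal((1, 4096)).astype(np.float32)
query_emb /= np.linalg.norm(query_emb, axis=1, keepdims=True)

# Tier 1: coarse retrieval over 256-dim prefixes (16x cheaper per dot product).
coarse_scores = truncate(query_emb, 256) @ truncate(doc_emb, 256).T
candidates = np.argsort(-coarse_scores[0])[:100]  # top-100 shortlist

# Tier 2: rerank the shortlist with the full 4096-dim vectors.
fine_scores = query_emb @ doc_emb[candidates].T
ranked = candidates[np.argsort(-fine_scores[0])]
print(ranked[:10])  # final top-10 document indices
```

The prefix-and-renormalize step is the essential move: Matryoshka-trained models concentrate coarse semantics in the leading dimensions, so a truncated vector remains a usable, much cheaper approximation of the full one.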