Qwen3 Max is the largest model in Alibaba's Qwen3 line, built on a mixture-of-experts (MoE) architecture with over one trillion total parameters. The MoE design routes each token to a small subset of experts, so only a fraction of those parameters is active per token, keeping inference cost well below what the total parameter count would suggest.
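Alibaba has not published Qwen3 Max's routing internals, but the mechanism behind that trade-off can be sketched generically: a learned router scores every expert for each token, and only the top-k highest-scoring experts actually run. The toy numpy sketch below is illustrative only; the expert count, dimensions, and linear experts are assumptions, not the real architecture.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route one token through the top-k experts of a generic MoE layer.

    x       : (d,) token hidden state
    gate_w  : (d, n_experts) router weights
    experts : list of callables, each mapping (d,) -> (d,)
    k       : experts activated per token

    Only k of n_experts execute, which is why total parameter count
    can far exceed per-token compute.
    """
    logits = x @ gate_w                       # router score per expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                  # softmax over selected experts only
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy demo: 8 experts, 2 active per token (all sizes are illustrative).
rng = np.random.default_rng(0)
d, n_experts = 16, 8
gate_w = rng.normal(size=(d, n_experts))
experts = [(lambda W: (lambda v: v @ W))(rng.normal(size=(d, d)) * 0.1)
           for _ in range(n_experts)]
y = moe_forward(rng.normal(size=d), gate_w, experts)
print(y.shape)  # (16,) -- same output shape, ~k/n_experts of the compute
```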
The 262,144-token (256K) context window makes it practical for tasks that earlier-generation models had to split across multiple calls: ingesting entire codebases, indexing long legal or financial documents, or tracking dependencies across extended multi-turn conversations. Context caching further reduces the cost of repeatedly processing the same long prefix, as sketched below.
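As a concrete illustration, this hypothetical sketch asks two questions against the same long document through an OpenAI-compatible client. The endpoint URL, environment-variable name, and model identifier are placeholders to check against the provider's documentation; where the provider supports implicit prefix caching, keeping the document as an identical leading message lets later calls reuse the cached prefix.

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],  # assumed env var name
    base_url="https://example-endpoint/v1",   # placeholder endpoint
)

# Stand-in for a document approaching the 256K-token limit.
long_document = "full text of a long contract or filing goes here"

def ask(question: str) -> str:
    resp = client.chat.completions.create(
        model="qwen3-max",                    # assumed model identifier
        messages=[
            # An identical leading message across calls is what allows
            # prefix caching to skip reprocessing the long document.
            {"role": "system", "content": f"Reference document:\n{long_document}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(ask("List the termination clauses."))
print(ask("Who are the counterparties?"))     # shared prefix, cheaper call
```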
Qwen3 Max performs strongly on structured-output and tool-use benchmarks, scoring 74.8 on Tau2-Bench and 79.3 on LiveBench. On software engineering tasks, it scored 69.6 on SWE-bench Verified. These results reflect a consistent emphasis on reliability for enterprise tasks: JSON generation, HTML/CSS formatting, API function calling, and multi-step agentic workflows where predictable output structure matters.
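For a sense of what those benchmarks exercise, here is a minimal, hypothetical function-calling request through an OpenAI-compatible client. The endpoint, model identifier, and `get_order_status` tool are illustrative assumptions, not a documented Qwen API; the point is the shape of a structured tool-use exchange.

```python
import json
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],  # assumed env var name
    base_url="https://example-endpoint/v1",   # placeholder endpoint
)

# One hypothetical tool described with a JSON Schema, as in the
# standard chat-completions tools format.
tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up an order by its ID.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

resp = client.chat.completions.create(
    model="qwen3-max",                        # assumed model identifier
    messages=[{"role": "user", "content": "Where is order A-1042?"}],
    tools=tools,
)

# Tool-use benchmarks score whether the model emits a valid,
# correctly-argumented call like this rather than free-form text.
call = resp.choices[0].message.tool_calls[0]
print(call.function.name, json.loads(call.function.arguments))
```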
Alibaba positions Qwen3 Max as natively bilingual in Chinese and English, with broad multilingual support beyond that. The model is available via API only; its weights are not publicly released.