Qwen3 Next 80B A3B Instruct introduces a hybrid attention architecture that alternates between Gated DeltaNet (a linear attention mechanism) and standard Gated Attention within a 48-layer, 512-expert MoE stack. The 48 layers follow a four-layer block repeated 12 times: three Gated DeltaNet + MoE layers followed by one Gated Attention + MoE layer. This design is purpose-built for ultra-long-context efficiency: linear attention handles three quarters of the layers at sub-quadratic cost, while the Gated Attention layers, interleaved every fourth layer, maintain the precision needed for complex cross-token reasoning.
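As a rough illustration of that layout (the module names and the construction are hypothetical placeholders, not Qwen3-Next's actual code), the 3:1 interleaving can be sketched as:

```python
# Sketch of the hybrid layer layout: a 4-layer block repeated 12 times.
NUM_LAYERS = 48
BLOCK = ["gated_deltanet"] * 3 + ["gated_attention"]  # 3 linear-attention layers, then 1 full-attention layer

layout = [BLOCK[i % len(BLOCK)] for i in range(NUM_LAYERS)]

assert layout.count("gated_deltanet") == 36   # 75% of layers use linear attention
assert layout.count("gated_attention") == 12  # 25% use standard gated attention
```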
With only 10 of 512 routed experts activated per token (plus one shared expert), Qwen3 Next 80B A3B Instruct keeps roughly 3B of its 80B parameters active per step, an activation ratio of about 3.75%. Combined with Multi-Token Prediction during inference, this translates to roughly 10x the throughput of comparable 32B dense models on sequences of 32K tokens or longer, a meaningful operational advantage for workloads that process long documents or transcripts at scale. The Instruct variant is tuned for direct instruction following and does not generate thinking traces; that use case belongs to the separate Qwen3-Next-80B-A3B-Thinking model.
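A minimal sketch of that routing pattern, assuming a standard top-k softmax router (the hidden size, batch shape, and module layout are illustrative assumptions, not the model's actual configuration):

```python
import torch

num_experts, top_k, hidden = 512, 10, 2048   # hidden size is illustrative

router = torch.nn.Linear(hidden, num_experts, bias=False)
x = torch.randn(4, hidden)                   # a batch of 4 token representations

probs = router(x).softmax(dim=-1)            # [4, 512] routing probabilities
weights, expert_ids = torch.topk(probs, top_k, dim=-1)
weights = weights / weights.sum(dim=-1, keepdim=True)   # renormalize over the selected 10

# Each token is processed by its 10 selected routed experts plus the always-on
# shared expert; the remaining 502 routed experts are skipped entirely.
print(expert_ids.shape)   # torch.Size([4, 10])
```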
On the RULER benchmark at a 1M-token context length, Qwen3 Next 80B A3B Instruct scores 80.3% accuracy; its native context window of 262,144 tokens is extensible to roughly one million tokens via YaRN RoPE scaling. On knowledge and alignment benchmarks, it scores 80.6 on MMLU-Pro and 82.7 on Arena-Hard v2, tracking competitively with models that require far more compute per token.
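For the YaRN extension, a configuration sketch along the lines of the Hugging Face transformers rope_scaling mechanism might look as follows (the checkpoint name, scaling factor, and loading arguments are assumptions for illustration, not a verified recipe):

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Assumed checkpoint name; adjust to the repository you are actually loading.
model_id = "Qwen/Qwen3-Next-80B-A3B-Instruct"

config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,                             # 262,144 x 4 ≈ 1.05M positions
    "original_max_position_embeddings": 262144,
}

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    torch_dtype="auto",
    device_map="auto",
)
```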