
Qwen 3 Coder 30B A3B Instruct

Qwen 3 Coder 30B A3B Instruct is a compact mixture-of-experts coding model from Alibaba, activating only 3 billion parameters per inference while delivering strong agentic coding performance for cost-sensitive deployments.

Tool Use
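index.ts
import { streamText } from 'ai'

// Stream a completion through AI Gateway, addressing the model by its gateway slug.
const result = streamText({
  model: 'alibaba/qwen3-coder-30b-a3b',
  prompt: 'Why is the sky blue?',
})
// Print the response text as it streams in.
for await (const chunk of result.textStream) process.stdout.write(chunk)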

Playground

Try out Qwen 3 Coder 30B A3B Instruct by Alibaba. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.


Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
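
A provider preference can be expressed as an ordered list of slugs. The sketch below is a minimal illustration, not the gateway's definitive API: `preferProviders` is a hypothetical helper, the `gateway.order` shape is an assumption about how the AI SDK accepts provider options, and the slugs `'novita'` and `'bedrock'` are placeholders for whatever you copy from the provider table.

```typescript
// Hypothetical helper: builds an options object expressing a provider preference.
// The { gateway: { order } } shape is an assumption; confirm it against the docs.
type GatewayRouting = { gateway: { order: string[] } }

function preferProviders(...slugs: string[]): GatewayRouting {
  if (slugs.length === 0) {
    throw new Error('at least one provider slug is required')
  }
  // Drop repeated slugs while keeping first-seen order.
  const order = slugs.filter((slug, i) => slugs.indexOf(slug) === i)
  return { gateway: { order } }
}

// Placeholder slugs; copy the real ones from the provider table.
const providerOptions = preferProviders('novita', 'bedrock', 'novita')
console.log(JSON.stringify(providerOptions))
// {"gateway":{"order":["novita","bedrock"]}}
```

If the assumed shape holds, the object would be passed alongside `model` and `prompt` in the `streamText` call.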

| Provider | Context | Latency | Throughput | Input | Output | Release Date |
|---|---|---|---|---|---|---|
| Amazon Bedrock | 262K | 0.2s | | $0.15/M | $0.60/M | 07/31/2025 |
| Novita AI | 160K | 1.2s | 120tps | $0.07/M | $0.27/M | 07/31/2025 |

No cache or web search pricing is listed for either provider.

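
The per-million-token prices in the provider table convert to a request or batch cost with simple arithmetic. The sketch below assumes a made-up workload of 2M input tokens and 500K output tokens; `estimateCostUsd` and the `ProviderPricing` type are illustrative names, and only the prices come from the table.

```typescript
// Prices in USD per million tokens, copied from the provider table.
interface ProviderPricing {
  name: string
  inputPerM: number
  outputPerM: number
}

const providers: ProviderPricing[] = [
  { name: 'Amazon Bedrock', inputPerM: 0.15, outputPerM: 0.6 },
  { name: 'Novita AI', inputPerM: 0.07, outputPerM: 0.27 },
]

// Cost = tokens x (price per million tokens) / 1,000,000, summed over input and output.
function estimateCostUsd(p: ProviderPricing, inputTokens: number, outputTokens: number): number {
  return (inputTokens * p.inputPerM + outputTokens * p.outputPerM) / 1_000_000
}

// Assumed workload: 2M input tokens and 500K output tokens.
for (const p of providers) {
  console.log(p.name, '$' + estimateCostUsd(p, 2_000_000, 500_000).toFixed(3))
}
// Amazon Bedrock $0.600
// Novita AI $0.275
```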
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.
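
Time to first token and throughput combine into a rough end-to-end estimate: TTFT plus output tokens divided by tokens per second. The sketch below assumes a constant decode rate, which live traffic will not match exactly; `estimateResponseSeconds` is an illustrative name, and the figures are the Novita AI values from the provider table (1.2s, 120tps).

```typescript
// Rough response-time estimate: TTFT + output tokens / throughput.
// Assumes steady decode throughput, so treat the result as an approximation.
function estimateResponseSeconds(
  ttftSeconds: number,
  throughputTps: number,
  outputTokens: number,
): number {
  if (throughputTps <= 0) {
    throw new RangeError('throughputTps must be positive')
  }
  return ttftSeconds + outputTokens / throughputTps
}

// Novita AI figures from the provider table, for an assumed 600-token completion.
const novitaSeconds = estimateResponseSeconds(1.2, 120, 600)
console.log(novitaSeconds.toFixed(1)) // "6.2"
```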

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by Alibaba

| Context | Latency | Throughput | Input | Output | Cache Read | Cache Write | Providers | Release Date |
|---|---|---|---|---|---|---|---|---|
| 1M | 0.8s | 370tps | $0.32/M | $1.28/M | $0.08/M | $0.5/M | alibaba, fireworks, togetherai | 06/02/2026 |
| 991K | 3.4s | 55tps | $1.25/M | $3.75/M | $0.25/M | $1.56/M | alibaba | 05/21/2026 |
| 1M | 0.8s | 109tps | $0.50/M | $3.00/M | $0.1/M | $0.63/M | alibaba, fireworks, togetherai | 04/02/2026 |
| 1M | 3.0s | 61tps | $0.10/M | $0.40/M | $0.0/M | $0.13/M | alibaba | 02/24/2026 |
| 262K | 0.5s | 59tps | $0.20/M | $0.88/M | $0.11/M | — | alibaba, deepinfra | 09/23/2025 |
| 33K | | | $0.05/M | | | | deepinfra | 06/05/2025 |

No web search pricing is listed for these models.

About Qwen 3 Coder 30B A3B Instruct

Qwen 3 Coder 30B A3B Instruct is the accessible tier of the Qwen3-Coder family. The "30B-A3B" name spells out the architecture: 30 billion total parameters in a mixture-of-experts (MoE) design, with about 3 billion activated per token during inference. That 10:1 ratio between stored and active parameters is the model's defining characteristic: it combines the capacity of a 30B model with per-token compute closer to that of a 3B model, although all 30B parameters must still be loaded to serve it.

Like its larger sibling, the 480B-A35B variant, Qwen 3 Coder 30B A3B Instruct was developed within the Qwen3-Coder framework and carries the same coding-first orientation: deep familiarity with programming languages, patterns, and developer workflows, paired with tuning for real-world coding tasks rather than benchmark patterns alone.

The 3B active parameter count translates to meaningfully faster inference than dense models of comparable quality, which matters for interactive development tools where the model is invoked frequently.

Agentic capabilities (multi-turn tool use, plan-execute-debug iteration, and environment interaction) are present in this variant given its origin in the Qwen3-Coder lineage. Teams building coding assistants, automated review pipelines, or developer-facing products can use Qwen 3 Coder 30B A3B Instruct as a cost-effective foundation without sacrificing the core agentic orientation that distinguishes the Qwen3-Coder family from general-purpose models.

What To Consider When Choosing a Provider

  • Configuration: The 30B total / 3B active parameter structure keeps serving costs tractable, which is worth factoring in when comparing tiers within the Qwen3-Coder family.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

When to Use Qwen 3 Coder 30B A3B Instruct

Best For

  • Cost-sensitive agentic coding deployments: When you need a model that understands code in depth and can handle multi-step workflows, but the per-token cost of the 480B-A35B variant isn't justified by your use case or volume, the 30B-A3B is a practical alternative.
  • Interactive coding tools with latency requirements: The 3B active parameter count yields faster token generation than larger dense or MoE models, which matters for coding assistants embedded in editors or IDEs, where response time affects user experience.
  • High-frequency automated code tasks: CI/CD pipelines, automated PR description generation, code review summarization, and similar high-volume tasks are well served by a capable but economical model.

Consider Alternatives When

  • The task requires the highest coding capability: For the most complex repository-level engineering problems, multi-file refactors with subtle dependencies, or tasks where getting it right the first time is critical, the larger Qwen3-Coder variant offers a higher performance ceiling.
  • General knowledge and reasoning matter as much as code: This model is optimized for coding scenarios. Tasks that blend heavy general-domain reasoning with code may perform better on a general-purpose Qwen3 model of equivalent or larger size.
  • Extremely long context is required: Verify the context window (262.1K tokens, or 160K on the Novita AI listing) against your specific use case, particularly for agentic tasks that accumulate long tool-call histories.
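
The context check in the last point can be done with a simple token budget. The sketch below is illustrative only: `remainingContextTokens` is a hypothetical helper, token counts are supplied by the caller rather than measured with a tokenizer, 262,144 is an assumed exact value for the advertised 262.1K window, and 160,000 is an assumed exact value for a 160K listing.

```typescript
// Tokens left for further turns after subtracting accumulated history
// and a budget reserved for the model's next reply.
function remainingContextTokens(
  contextWindow: number,
  turnTokens: number[],
  reservedOutputTokens: number,
): number {
  const used = turnTokens.reduce((sum, t) => sum + t, 0)
  return contextWindow - used - reservedOutputTokens
}

// Assumed workload: 40 tool-call turns of about 6,000 tokens each,
// with 8,192 tokens reserved for the reply.
const history: number[] = []
for (let i = 0; i < 40; i++) history.push(6000)

console.log(remainingContextTokens(262_144, history, 8_192)) // 13952
console.log(remainingContextTokens(160_000, history, 8_192)) // -88192
```

A negative result means the history no longer fits and needs to be truncated or summarized before the next call.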

Conclusion

Qwen 3 Coder 30B A3B Instruct carves out the practical middle ground in agentic coding: enough code intelligence and multi-step reasoning for real software engineering tasks, at inference costs that make high-volume deployment financially viable. Through AI Gateway, the operational complexity of managing multiple provider relationships collapses into a single endpoint with built-in reliability.