NVIDIA Nemotron 3 Super 120B A12B

NVIDIA Nemotron 3 Super 120B A12B is NVIDIA's hybrid Mamba-Transformer mixture-of-experts model with 120B total and 12B active parameters, built for complex multi-agent applications and featuring a latent MoE design and multi-token prediction.

index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'nvidia/nemotron-3-super-120b-a12b',
  prompt: 'Why is the sky blue?',
})

// Print the streamed tokens as they arrive.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}
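
Because Nemotron 3 Super targets agentic, multi-agent workloads, it is commonly paired with tool calling. Below is a minimal sketch of that pattern using the AI SDK's generateText and tool helpers, assuming the AI SDK v5 API; the getWeather tool, its schema, and the five-step limit are illustrative assumptions, not part of this model's documentation.

import { generateText, tool, stepCountIs } from 'ai'
import { z } from 'zod'

// Hypothetical tool: the model decides when to call it and with what arguments.
const getWeather = tool({
  description: 'Get the current weather for a city',
  inputSchema: z.object({ city: z.string() }),
  execute: async ({ city }) => ({ city, temperatureC: 21 }), // stubbed result for the sketch
})

const { text } = await generateText({
  model: 'nvidia/nemotron-3-super-120b-a12b',
  tools: { getWeather },
  stopWhen: stepCountIs(5), // allow up to five model/tool round-trips
  prompt: 'Should I pack an umbrella for Berlin today?',
})

console.log(text)

The model can call getWeather, receive the stubbed result, and fold it into its final answer within the step limit.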

More models by NVIDIA

Context | Latency | Throughput | Input   | Output  | Providers          | Release Date
131K    | 0.2s    | 150 tps    | $0.06/M | $0.23/M | Bedrock, DeepInfra | 08/18/2025
131K    | 0.2s    | 165 tps    | $0.20/M | $0.60/M | Bedrock, DeepInfra | 12/01/2024
262K    | 0.3s    | 67 tps     | $0.05/M | $0.24/M | DeepInfra          | 12/01/2024