Nvidia Nemotron Nano 9B V2

Nvidia Nemotron Nano 9B V2 is a dense hybrid Mamba-Transformer reasoning model that matches or exceeds Qwen3-8B accuracy at up to 6x the throughput, with built-in thinking budget control.

Reasoning, Tool Use
index.ts
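import { streamText } from 'ai'

const result = streamText({
  model: 'nvidia/nemotron-nano-9b-v2',
  prompt: 'Why is the sky blue?',
})

// Print the response as it streams in.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}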

More models by NVIDIA

Model  Context  Latency  Throughput  Input    Output   Providers           Release Date
-      256K     0.2s     154tps      $0.15/M  $0.65/M  Bedrock             03/18/2026
-      131K     0.2s     -           $0.20/M  $0.60/M  Bedrock, DeepInfra  12/01/2024
-      262K     0.2s     98tps       $0.05/M  $0.24/M  DeepInfra           12/01/2024
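
The Input and Output prices above are quoted per million tokens, so the cost of a single request is each token count divided by one million and multiplied by its price. A minimal sketch of that arithmetic follows. The helper name estimateCostUSD and the TokenPricing shape are hypothetical and not part of the AI SDK, and the sketch assumes no cache or per-query charges apply.

```typescript
// Prices in US dollars per one million tokens, as listed in the table.
interface TokenPricing {
  inputPerMillion: number
  outputPerMillion: number
}

// Hypothetical helper (not part of the AI SDK): estimates the cost of one request.
// Assumes only input and output tokens are billed.
function estimateCostUSD(
  pricing: TokenPricing,
  inputTokens: number,
  outputTokens: number,
): number {
  const inputCost = (inputTokens / 1_000_000) * pricing.inputPerMillion
  const outputCost = (outputTokens / 1_000_000) * pricing.outputPerMillion
  return inputCost + outputCost
}

// Example using the $0.05/M input, $0.24/M output row:
// a request with 2,000 input tokens and 500 output tokens.
const pricing: TokenPricing = { inputPerMillion: 0.05, outputPerMillion: 0.24 }
const cost = estimateCostUSD(pricing, 2_000, 500)
console.log(cost.toFixed(6))
```

Actual token counts for a request can be read from the usage data the SDK reports after a call, and real billing may differ from this estimate.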