Nova 2 Lite
Nova 2 Lite is a second-generation multimodal reasoning model with a context window of 1M tokens. It supports configurable extended thinking, web grounding, and code execution at a cost tier built for everyday production use.
import { streamText } from 'ai'
const result = streamText({ model: 'amazon/nova-2-lite', prompt: 'Why is the sky blue?' })
Playground
Try out Nova 2 Lite by Amazon. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
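As a rough sketch of what setting a provider preference could look like in a request: the snippet below assumes the gateway reads an ordered list of provider slugs from `providerOptions.gateway.order`, and that `bedrock` is the slug for this model's provider. Both are assumptions not stated on this page, and `providerPreference` is a hypothetical helper, so confirm the option name and slug in the docs.

```typescript
// Hypothetical helper: builds routing options from an ordered list of provider slugs.
// The `gateway.order` shape is an assumption; verify it against the AI Gateway docs.
function providerPreference(slugs: string[]) {
  return { gateway: { order: slugs } }
}

// 'bedrock' is assumed to be the slug copied from the provider list.
const routing = providerPreference(['bedrock'])

// Usage (assumes the `ai` package and gateway credentials are configured):
// const result = streamText({
//   model: 'amazon/nova-2-lite',
//   prompt: 'Why is the sky blue?',
//   providerOptions: routing,
// })
```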
About Nova 2 Lite
Nova 2 Lite is the cost-focused model in Amazon's second-generation Nova family. The context window grows from 300K tokens on first-gen Nova Lite and Nova Pro to 1M tokens. Extended thinking lets the model run deliberate multi-step reasoning before it answers. You can set it to low, medium, or high; it is off by default, so you enable it only when a task needs it.
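A minimal sketch of how the effort levels might map onto request options. The option path `providerOptions.bedrock.reasoningConfig` and the `maxReasoningEffort` field are the ones named in the FAQ on this page; the `type: 'enabled'` field and the helper name `thinkingOptions` are assumptions for illustration, so check the provider docs before relying on them.

```typescript
type Effort = 'low' | 'medium' | 'high'

// Hypothetical helper: returns provider options for a request.
// With no effort given it returns an empty object, which leaves
// extended thinking at its default (off).
function thinkingOptions(effort?: Effort) {
  if (!effort) return {}
  return {
    bedrock: {
      // `type: 'enabled'` is an assumption; `maxReasoningEffort` is named on this page.
      reasoningConfig: { type: 'enabled', maxReasoningEffort: effort },
    },
  }
}

// Usage (assumes the `ai` package and gateway credentials are configured):
// const result = streamText({
//   model: 'amazon/nova-2-lite',
//   prompt: 'Compare these two contracts clause by clause.',
//   providerOptions: thinkingOptions('medium'),
// })
```

Keeping the default path free of any reasoning config means fast extraction calls stay unchanged, and only the analytical calls opt in.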
Amazon pairs Nova 2 Lite and Nova 2 Pro with built-in web grounding and code execution: the model can pull current public information and run code so answers stay tied to fresh data, not only to training cutoffs.
Inputs include text, images, video, and documents (including PDFs) within the window of 1M tokens. That covers long documents with images, video with transcripts, and similar mixed workloads that first-generation Nova models handle with tighter limits.
What To Consider When Choosing a Provider
- Configuration: When extended thinking is on, reasoning tokens are billed at the output token rate. If you use medium or high thinking budgets often, monitor output token spend in the AI Gateway cost dashboard.
- Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
When to Use Nova 2 Lite
Best For
- Mixed-format document processing: Combine PDFs, images, and text in a single long-context request
- Current information workflows: Web grounding provides up-to-date information with citations when the provider returns them
- Agentic task orchestration: The model calls external tools across multiple steps
- Selective extended thinking: Toggle deeper reasoning on analytical tasks and use fast extraction by default
- Unified video and document analysis: Video and image analysis combined with document comprehension in a single context of 1M tokens
Consider Alternatives When
- Text-only cost-focused workloads: Nova Micro remains cheaper and faster for pure text processing
- Simple multimodal tasks: First-generation Nova Lite handles simpler vision tasks at lower cost without reasoning overhead
- Highest structured-document accuracy: Evaluate Nova Pro when second-gen reasoning features aren't required
Conclusion
Nova 2 Lite pairs a large multimodal context with optional extended thinking, web grounding, and code execution. You pay for reasoning tokens only when you turn thinking on. Teams that hit limits on first-generation Nova Lite often move here before jumping to the highest-cost tiers.
Frequently Asked Questions
How do I control extended thinking?
Under providerOptions.bedrock.reasoningConfig, set maxReasoningEffort to low, medium, or high. Extended thinking is off by default, so you avoid reasoning-token charges until you enable it.
What does web grounding do?
The model can search the public web during a request and ground answers in what it finds, with citations when the provider returns them. Use it when answers need current facts, not only the training snapshot.
How large is the context window compared to first-gen Nova models?
1M tokens. That's more than three times the 300K-token context in Nova Lite and Nova Pro, and nearly eight times Nova Micro's 128K.
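The ratios above can be checked with a few lines of arithmetic. This sketch reads "1M", "300K", and "128K" as round decimal figures (an assumption), and `fitsInWindow` is a hypothetical helper for budgeting a request against the window, not part of any SDK.

```typescript
// Window sizes in tokens, read as round decimal numbers (assumption).
const NOVA_2_LITE_WINDOW = 1_000_000
const FIRST_GEN_LITE_PRO_WINDOW = 300_000
const NOVA_MICRO_WINDOW = 128_000

// How many times larger the Nova 2 Lite window is.
const vsFirstGen = NOVA_2_LITE_WINDOW / FIRST_GEN_LITE_PRO_WINDOW // about 3.33
const vsMicro = NOVA_2_LITE_WINDOW / NOVA_MICRO_WINDOW // 7.8125

// Hypothetical helper: true when the input plus a reserved output budget
// stays within the given context window.
function fitsInWindow(
  inputTokens: number,
  reservedOutputTokens: number,
  windowTokens: number = NOVA_2_LITE_WINDOW,
): boolean {
  return inputTokens + reservedOutputTokens <= windowTokens
}

console.log(vsFirstGen.toFixed(2), vsMicro.toFixed(2))
console.log(fitsInWindow(900_000, 50_000))
```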
Can Nova 2 Lite run code during a conversation?
Yes. Amazon documents code execution for Nova 2 models so the model can run code as part of the response flow. Use it for calculations, data transforms, and similar tasks the provider supports.
How are reasoning tokens billed?
Reasoning tokens are billed at the output token rate. With thinking disabled (the default), you pay only standard input and output costs.
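To see how this plays out, here is a small cost estimate that adds reasoning tokens to the output side of the bill. The rates are placeholders chosen for easy arithmetic, not actual pricing, and the `Usage`, `Rates`, and `estimateCostUsd` names are hypothetical.

```typescript
interface Usage {
  inputTokens: number
  outputTokens: number // visible answer tokens
  reasoningTokens: number // 0 when thinking is off
}

interface Rates {
  inputPerMillion: number // USD per 1M input tokens
  outputPerMillion: number // USD per 1M output tokens
}

function estimateCostUsd(usage: Usage, rates: Rates): number {
  const inputCost = (usage.inputTokens / 1_000_000) * rates.inputPerMillion
  // Reasoning tokens are charged at the output rate.
  const billedOutputTokens = usage.outputTokens + usage.reasoningTokens
  const outputCost = (billedOutputTokens / 1_000_000) * rates.outputPerMillion
  return inputCost + outputCost
}

// Placeholder rates for illustration only, not actual pricing.
const exampleRates: Rates = { inputPerMillion: 1, outputPerMillion: 2 }

const withoutThinking = estimateCostUsd(
  { inputTokens: 10_000, outputTokens: 500, reasoningTokens: 0 },
  exampleRates,
)
const withThinking = estimateCostUsd(
  { inputTokens: 10_000, outputTokens: 500, reasoningTokens: 1_500 },
  exampleRates,
)

console.log(withoutThinking.toFixed(4), withThinking.toFixed(4))
```

With these placeholder rates, the 1,500 reasoning tokens add 0.003 USD to the request, which is the only difference between the two estimates.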
Is Nova 2 Lite a direct replacement for first-gen Nova Lite?
No, it's a successor but not a drop-in replacement. Nova 2 Lite has different pricing and a different capability profile. For simple multimodal tasks where first-gen Nova Lite performs well, upgrading may not cut costs. The value lies in the reasoning, grounding, and code execution features the original lacked.