Nova Micro

amazon/nova-micro

Nova Micro delivers text-only inference at high throughput, with per-token pricing below that of the multimodal Nova models in the same generation. It is purpose-built for latency-sensitive applications at scale.

index.ts
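import { streamText } from 'ai'
const result = streamText({
  model: 'amazon/nova-micro',
  prompt: 'Why is the sky blue?',
})

// Print each text chunk as it arrives (top-level await needs an ES module)
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}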

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
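
As a rough illustration of the two credential types named above, the sketch below picks whichever one is present in the environment. The helper `resolveGatewayCredential` is hypothetical, and the variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN` are assumptions that should be checked against the AI Gateway documentation.

```typescript
// Hypothetical helper, not part of any SDK.
// The env var names below are assumptions; confirm them in the gateway docs.
type GatewayCredential =
  | { kind: 'api-key'; token: string }
  | { kind: 'oidc'; token: string };

function resolveGatewayCredential(
  env: Record<string, string | undefined>,
): GatewayCredential {
  // Prefer an explicit API key when one is configured.
  const apiKey = env['AI_GATEWAY_API_KEY']?.trim();
  if (apiKey) return { kind: 'api-key', token: apiKey };

  // Otherwise fall back to an OIDC token, if the platform provides one.
  const oidc = env['VERCEL_OIDC_TOKEN']?.trim();
  if (oidc) return { kind: 'oidc', token: oidc };

  throw new Error('No AI Gateway credential found: set an API key or OIDC token.');
}
```

Calling `resolveGatewayCredential(process.env)` at startup turns a missing credential into an early, explicit failure rather than a rejected request later on.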

If streaming throughput matters, test Nova Micro early. It's built for high text throughput within AI Gateway.
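
One way to run that early test is to record when each streamed chunk arrives and summarize the result. The sketch below assumes you have already collected `ChunkEvent` entries (a hypothetical type) while iterating the stream; it reports characters per second as a stand-in for token throughput, which is a simplification.

```typescript
// One recorded chunk: arrival time in ms since the request started, plus its text.
interface ChunkEvent {
  atMs: number;
  text: string;
}

interface StreamStats {
  timeToFirstChunkMs: number; // NaN when no chunks were recorded
  totalChars: number;
  charsPerSecond: number;
}

function summarizeStream(events: ChunkEvent[]): StreamStats {
  let firstAtMs = NaN;
  let lastAtMs = 0;
  let totalChars = 0;

  for (const e of events) {
    if (Number.isNaN(firstAtMs)) firstAtMs = e.atMs;
    lastAtMs = e.atMs;
    totalChars += e.text.length;
  }

  // The rate covers the whole request, so time to first chunk is included.
  const charsPerSecond = lastAtMs > 0 ? (totalChars * 1000) / lastAtMs : 0;
  return { timeToFirstChunkMs: firstAtMs, totalChars, charsPerSecond };
}
```

Comparing these numbers across a few representative prompts gives a more useful picture than a single request.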

When to Use Nova Micro

Best For

  • Chatbots and conversational interfaces:

    Interactive experiences where response speed affects UX

  • Text classification at scale:

    Sentiment analysis and entity extraction across high request volumes

  • Autocomplete and suggestions:

    Inline features where generation speed matters

  • Default routing tier:

    Handle routine text requests at minimal cost in a model-routing architecture

Consider Alternatives When

  • Multimodal inputs required:

    Switch to Nova Lite or Nova Pro when inputs include images, documents, or video

  • Structured multi-step reasoning:

    Nova 2 Lite is better equipped for agentic tool use and deeper reasoning

  • Context beyond 128K tokens:

    Long documents or extended conversations require a model with a larger context window
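
The guidance in the two lists above can be sketched as a small routing function. This is a minimal illustration, not a recommended policy: only `amazon/nova-micro` appears on this page, so the other model IDs are assumed, `RequestProfile` and `pickModel` are hypothetical names, and "128K" is read as 128,000 tokens.

```typescript
// Only 'amazon/nova-micro' comes from this page; the other IDs are assumptions.
const MODELS = {
  micro: 'amazon/nova-micro',
  lite: 'amazon/nova-lite', // assumed ID
  nova2Lite: 'amazon/nova-2-lite', // assumed ID
} as const;

interface RequestProfile {
  hasMedia: boolean; // images, documents, or video attached
  needsDeepReasoning: boolean; // agentic tool use or multi-step reasoning
  estimatedInputTokens: number;
}

// "128K" interpreted as 128,000 tokens; treat the exact figure as an assumption.
const MICRO_CONTEXT_TOKENS = 128_000;

function pickModel(p: RequestProfile): string {
  // Multimodal input rules out the text-only model.
  if (p.hasMedia) return MODELS.lite;
  // Reasoning-heavy or oversized requests escalate to a larger model.
  if (p.needsDeepReasoning) return MODELS.nova2Lite;
  if (p.estimatedInputTokens > MICRO_CONTEXT_TOKENS) return MODELS.nova2Lite;
  // Everything else stays on the default, lowest-cost tier.
  return MODELS.micro;
}
```

The order of the checks is a design choice; requests that combine media with long context or deep reasoning would need a more careful rule than this sketch provides.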

Conclusion

Nova Micro is the speed specialist of the Nova family. By focusing exclusively on text, it achieves throughput and pricing that suit many text-only workloads where response latency matters. Pair it with a routing layer that escalates to multimodal or reasoning-capable models when needed.

FAQ

Nova Micro is priced below Nova Lite's multimodal rate and is tuned for speed on pure text. If you never send images or video, Micro is usually the cheaper fit.

Keep the prompt within 128K tokens. If you exceed that, split the document, summarize in chunks, or switch to Nova 2 Lite for a 1M-token window.
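
Splitting a long document can be sketched as below. The example assumes a rough heuristic of about 4 characters per token, which is not Nova's actual tokenizer, and the helper names `estimateTokens` and `chunkByTokens` are hypothetical. Use real token counts where they are available.

```typescript
// Rough heuristic: about 4 characters per token for English text.
// This is an assumption, not the model's tokenizer.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Split text into chunks whose estimated size stays within maxTokens,
// preferring paragraph boundaries and hard-splitting oversized paragraphs.
function chunkByTokens(text: string, maxTokens: number): string[] {
  if (maxTokens <= 0) throw new Error('maxTokens must be positive');
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  const chunks: string[] = [];
  let current = '';

  for (const para of text.split(/\n{2,}/)) {
    if (para.length === 0) continue;
    const candidate = current ? `${current}\n\n${para}` : para;
    if (candidate.length <= maxChars) {
      current = candidate;
      continue;
    }
    if (current) chunks.push(current);
    // A single paragraph larger than the budget is cut at fixed offsets.
    let rest = para;
    while (rest.length > maxChars) {
      chunks.push(rest.slice(0, maxChars));
      rest = rest.slice(maxChars);
    }
    current = rest;
  }
  if (current) chunks.push(current);
  return chunks;
}
```

In practice the budget passed as `maxTokens` should sit below the 128K limit, leaving headroom for instructions and for the error in the character-based estimate.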

Nova Micro follows instructions well for classification, tagging, and structured extraction. Its speed makes it especially efficient for pipelines that process many short requests.
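
For a classification pipeline, much of the work is constraining the prompt and validating the reply. The sketch below shows one possible shape; the function names and prompt wording are illustrative assumptions, not a format prescribed for Nova Micro.

```typescript
// Build a constrained prompt so the model answers with exactly one label.
function buildClassificationPrompt(text: string, labels: readonly string[]): string {
  return [
    `Classify the text into exactly one of: ${labels.join(', ')}.`,
    'Reply with the label only.',
    '',
    `Text: ${text}`,
  ].join('\n');
}

// Map raw model output back onto the allowed labels; null means "unrecognized".
function parseLabel(output: string, labels: readonly string[]): string | null {
  // Ignore case, surrounding whitespace, and stray punctuation or quotes.
  const normalized = output.trim().toLowerCase().replace(/[."'`]/g, '');
  return labels.find((l) => l.toLowerCase() === normalized) ?? null;
}
```

Treating an unrecognized reply as `null` lets the pipeline retry or escalate the request instead of silently storing a label outside the allowed set.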

Nova Micro generates up to 8.2K tokens per response.

Nova Micro isn't designed for complex reasoning. It excels at speed and cost efficiency for routine language tasks. For reasoning-heavy workloads, consider Nova 2 Lite or Nova Pro.

You do not need your own Amazon Bedrock credentials. AI Gateway handles authentication with Amazon Bedrock, so you only need a gateway API key or OIDC token.