Nova Micro

Nova Micro delivers text-only inference at high throughput, with per-token pricing below that of the multimodal Nova models in the same generation. It is purpose-built for latency-sensitive applications at scale.

index.ts

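import { streamText } from 'ai'

// Start a streaming text generation against Nova Micro.
const result = streamText({
  model: 'amazon/nova-micro',
  prompt: 'Why is the sky blue?'
})

// Print each text chunk as it arrives.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}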


About Nova Micro

Nova Micro launched alongside the rest of the first-generation Nova family, but its design philosophy differs. Nova Lite and Nova Pro layer in image and video understanding. Nova Micro drops multimodal support entirely. The result is a model that does one thing, text processing, at high speed and low cost.

The tradeoff is intentional. Dropping vision processing lets the model focus on text generation throughput within a 128K-token context window. Per-request costs stay low even in high-volume classification or tagging pipelines.

Many teams deploy Nova Micro as the default tier in a routing architecture. Straightforward text requests (classification, entity extraction, simple Q&A, short summaries) go to Micro. A request escalates to Lite, Pro, or a second-generation model only when it involves images, requires deep reasoning, or exceeds the 128K-token context window. This pattern keeps average cost per request low while covering the full range of task complexity.
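The routing pattern can be sketched as a small function that picks a model per request. This is a minimal illustration, not a prescribed implementation: the `RoutableRequest` shape, the `needsDeepReasoning` flag, the four-characters-per-token estimate, and the choice of escalation target for each case are all assumptions, and the `amazon/nova-lite` and `amazon/nova-pro` slugs are assumed to follow the same pattern as `amazon/nova-micro`.

```typescript
// Hypothetical request shape; these field names are illustrative only.
interface RoutableRequest {
  text: string
  hasImages: boolean
  needsDeepReasoning: boolean // assumed to be set by the caller or an upstream classifier
}

// Context limit stated for Nova Micro.
const MICRO_CONTEXT_TOKENS = 128_000

// Rough heuristic (about 4 characters per token); a real tokenizer will differ.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// Default to Micro and escalate only when a request needs more.
// Slugs other than 'amazon/nova-micro' are assumptions.
function pickModel(req: RoutableRequest): string {
  if (req.hasImages) return 'amazon/nova-lite'
  if (req.needsDeepReasoning) return 'amazon/nova-pro'
  if (estimateTokens(req.text) > MICRO_CONTEXT_TOKENS) return 'amazon/nova-pro'
  return 'amazon/nova-micro'
}
```

The returned string could then be passed as the `model` value in a `streamText` call, so the escalation logic stays in one place.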
