Grok Imagine
Grok Imagine is xAI's video generation model. It creates video clips from text prompts and images with motion, generated audio, and lip-sync, available through Vercel AI Gateway.
import { experimental_generateVideo as generateVideo } from 'ai';

const result = await generateVideo({
  model: 'xai/grok-imagine-video',
  prompt: 'A serene mountain lake at sunrise.',
});
Playground
Try out Grok Imagine by xAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
About Grok Imagine
Grok Imagine is xAI's video generation model, released January 28, 2026 and available through Vercel AI Gateway. It generates video clips with motion from text descriptions and static images, follows complex prompts, and accepts follow-up instructions to refine scenes.
The model supports three primary generation modes: text-to-video (creating clips from text descriptions), image-to-video (generating motion from static images), and video editing (modifying existing video content through style changes, object replacement, and scene alterations). It also generates audio timed to the video with lip-sync, so you can skip separate voice recording for many workflows. Grok Imagine produces short clips quickly enough for iterative creative workflows. You can call it from the AI SDK's generateVideo function, the AI Gateway playground at https://ai-sdk.dev/playground/xai:grok-imagine-video, or the v0 Grok Creative Studio. Video generation is currently available to Pro and Enterprise plan subscribers and paid AI Gateway users.
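The three generation modes can be modeled in application code as a discriminated union, so that an image-to-video or editing job cannot be built without its input media. This is a minimal sketch only: `VideoJob`, its field names, `requiredMedia`, and `validateJob` are hypothetical application-level helpers, not part of the AI SDK or the xAI API.

```typescript
// Hypothetical application-level model of the three generation modes.
// None of these names come from the AI SDK; they only mirror the modes described above.
type VideoJob =
  | { mode: 'text-to-video'; prompt: string }
  | { mode: 'image-to-video'; prompt: string; imageUrl: string }
  | { mode: 'video-edit'; prompt: string; videoUrl: string };

// Returns the input media a job depends on, or null for text-only jobs.
function requiredMedia(job: VideoJob): string | null {
  switch (job.mode) {
    case 'text-to-video':
      return null;
    case 'image-to-video':
      return job.imageUrl;
    case 'video-edit':
      return job.videoUrl;
  }
}

// Collects problems with a job before it is sent anywhere.
function validateJob(job: VideoJob): string[] {
  const problems: string[] = [];
  if (job.prompt.trim() === '') {
    problems.push('prompt is empty');
  }
  const media = requiredMedia(job);
  if (media !== null && !/^https?:\/\//.test(media)) {
    problems.push(`${job.mode} needs an http(s) media URL`);
  }
  return problems;
}
```

An empty array from `validateJob` means the job is well-formed by this sketch's rules; how each mode maps onto actual request parameters depends on the SDK version in use.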
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
What To Consider When Choosing a Provider
- Configuration: Video generation is currently limited to Pro and Enterprise plans, as well as paid AI Gateway users. Verify your plan supports video generation before integrating.
- Iterative prompting: Grok Imagine understands follow-up instructions to tweak scenes. Refine output over several generations rather than trying to get the perfect result in a single one.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
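The authentication point above can be sketched as a small credential lookup. This is an illustration under stated assumptions: it assumes the API key is supplied as `AI_GATEWAY_API_KEY` and the OIDC token as `VERCEL_OIDC_TOKEN`, and the `resolveGatewayAuth` helper, the `GatewayAuth` type, and the choice to prefer the API key are all hypothetical, not taken from this page.

```typescript
// Hypothetical helper: picks whichever AI Gateway credential is present.
// Assumption: credentials arrive as AI_GATEWAY_API_KEY or VERCEL_OIDC_TOKEN.
type GatewayAuth = { kind: 'api-key' | 'oidc'; token: string };

function resolveGatewayAuth(env: Record<string, string | undefined>): GatewayAuth {
  // Prefer an explicit API key when both are set (a choice made by this sketch).
  const apiKey = env['AI_GATEWAY_API_KEY']?.trim();
  if (apiKey) {
    return { kind: 'api-key', token: apiKey };
  }
  // Otherwise fall back to an OIDC token.
  const oidc = env['VERCEL_OIDC_TOKEN']?.trim();
  if (oidc) {
    return { kind: 'oidc', token: oidc };
  }
  throw new Error('No AI Gateway credential found: set an API key or provide an OIDC token.');
}
```

In a Node.js process this could be called as `resolveGatewayAuth(process.env)` at startup, so a missing credential fails early instead of at the first generation request.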
When to Use Grok Imagine
Best For
- Marketing and social media video content: Custom clips at scale without traditional video production
- Product demos and explainer videos: Generated scenes that visualize concepts, features, or workflows
- Creative prototyping and storyboarding: Fast iteration on visual concepts before committing to full production
- Content creation pipelines: Short video assets generated programmatically for personalization or A/B testing
- Lip-synced video with native audio: Talking-head content, presentations, or character-driven narratives
Consider Alternatives When
- Static image generation: Grok Imagine Image or Grok Imagine Image Pro handles the task without video overhead
- Long-form video production: Traditional editing tools provide more control over extended content
- Free-tier usage: Video generation currently requires a paid plan
Conclusion
Grok Imagine brings AI video generation into the Vercel AI Gateway ecosystem. It supports text-to-video, image-to-video, video editing, and audio in one pipeline. Iterative prompting and quick turnaround on short clips suit workflows that need custom video without full traditional production.
Frequently Asked Questions
What generation modes does Grok Imagine support?
Three modes: text-to-video (creating clips from descriptions), image-to-video (animating static images), and video editing (modifying existing videos through style changes, object replacement, and scene alterations).
Does Grok Imagine generate audio?
Yes. Grok Imagine generates audio in the clip with lip-sync, so you often don't need separate voice recording or dubbing.
How fast does Grok Imagine generate video?
Generation is fast enough for iteration in most setups; exact time depends on length, resolution, and load. Expect short clips, not long renders.
What plans support video generation?
Pro and Enterprise plan subscribers and paid AI Gateway users.
How do I authenticate with Grok Imagine through Vercel AI Gateway?
Use your Vercel AI Gateway API key with xai/grok-imagine-video as the model identifier. You can integrate through the AI SDK's generateVideo function, the AI Gateway playground at https://ai-sdk.dev/playground/xai:grok-imagine-video, or the v0 Grok Creative Studio.
Can I refine generated videos with follow-up prompts?
Yes. Grok Imagine understands follow-up instructions to tweak scenes, adjust styles, and modify content. Use iterative prompting to refine output.
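One simple way to iterate from code is to fold follow-up instructions into the prompt for the next generation. The sketch below assumes each request is stateless; this page does not say whether the API keeps context between calls. `refinePrompt` and the "Revisions:" wording are hypothetical, chosen only for illustration.

```typescript
// Hypothetical helper: folds follow-up instructions into one prompt string.
// Assumes each generation request is stateless, so earlier tweaks must be resent.
function refinePrompt(basePrompt: string, followUps: string[]): string {
  // Drop blank follow-ups and normalize whitespace around the rest.
  const cleaned = followUps.map((f) => f.trim()).filter((f) => f.length > 0);
  if (cleaned.length === 0) {
    return basePrompt.trim();
  }
  // Number the revisions so later instructions are clearly ordered.
  const revisions = cleaned.map((f, i) => `${i + 1}. ${f}`).join(' ');
  return `${basePrompt.trim()} Revisions: ${revisions}`;
}
```

The returned string would then be passed as the `prompt` of the next generation call, keeping the original scene description and the accumulated tweaks together.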
Does Vercel AI Gateway support Zero Data Retention for Grok Imagine?
Zero Data Retention is not currently available for this model. ZDR on AI Gateway applies to direct gateway requests; BYOK flows aren't covered. See https://vercel.com/docs/ai-gateway/capabilities/zdr for details.