Kling v3.0 Image-to-Video

Kling v3.0 Image-to-Video is the image-to-video model in Kling's v3.0 generation. It offers first- and last-frame control, physics-aware motion, native audio, and output up to 1080p at durations of up to 15 seconds.

image-to-video, multi-shot, audio-generation
index.ts
import { experimental_generateVideo as generateVideo } from 'ai';

const result = await generateVideo({
  model: 'klingai/kling-v3.0-i2v',
  prompt: 'A serene mountain lake at sunrise.',
});

Frequently Asked Questions

  • How long can output videos be with Kling v3.0 Image-to-Video?

    Up to 15 seconds, extended from the 10-second maximum in earlier Kling versions.

  • Can I define both the first and last frame of the generated video?

    Yes. You can supply a first-frame image, a last-frame image, or both. The model generates motion between the two endpoints.

  • Does Kling v3.0 Image-to-Video include audio generation?

    Yes. Native audio generation (speech, sound effects, and ambient audio) is included in the v3.0 generation tier.

  • What is the difference between v3.0 i2v and v2.6 i2v?

    Compared with v2.6, v3.0 extends the maximum duration to 15 seconds, improves physics-aware motion, and runs at the full v3 quality tier. v2.6 introduced audio generation but operates at the v2 quality level with a 10-second maximum.

  • What resolution does Kling v3.0 Image-to-Video support?

    Up to 1080p, in 16:9, 9:16, and 1:1 aspect ratios. Select Pro mode on the provider when you need 1080p output.

  • Is Kling v3.0 Image-to-Video generally available on AI Gateway?

    Yes, for Pro and Enterprise plans and for paid AI Gateway users, while video generation remains in beta. Check the current AI Gateway access notes before you rely on it in production.
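
The limits quoted in the answers above (a 15-second maximum duration, the 16:9, 9:16, and 1:1 aspect ratios, and Pro mode for 1080p output) can be checked locally before a request is sent. The following is a minimal sketch under stated assumptions, not part of the AI SDK or the Kling API: the `KlingI2VRequest` type, its field names, and the `validateKlingI2VRequest` helper are hypothetical names introduced for illustration only.

```typescript
// Hypothetical request description for illustration. These field names are
// assumptions and are NOT taken from the AI SDK or the Kling API.
interface KlingI2VRequest {
  durationSeconds: number;
  aspectRatio: string;
  wants1080p: boolean;
  proMode: boolean;
}

// Limits as stated in the FAQ above.
const MAX_DURATION_SECONDS = 15;
const SUPPORTED_ASPECT_RATIOS = ['16:9', '9:16', '1:1'];

// Returns a list of human-readable problems; an empty list means the
// request stays within the documented limits.
function validateKlingI2VRequest(req: KlingI2VRequest): string[] {
  const problems: string[] = [];

  // Written as a negated range check so that NaN is rejected as well.
  if (!(req.durationSeconds > 0 && req.durationSeconds <= MAX_DURATION_SECONDS)) {
    problems.push(
      'duration must be greater than 0 and at most ' + MAX_DURATION_SECONDS + ' seconds',
    );
  }

  if (SUPPORTED_ASPECT_RATIOS.indexOf(req.aspectRatio) === -1) {
    problems.push('aspect ratio must be one of ' + SUPPORTED_ASPECT_RATIOS.join(', '));
  }

  if (req.wants1080p && !req.proMode) {
    problems.push('1080p output requires Pro mode');
  }

  return problems;
}

// Usage: this request breaks all three documented limits.
const problems = validateKlingI2VRequest({
  durationSeconds: 20,
  aspectRatio: '4:3',
  wants1080p: true,
  proMode: false,
});
console.log(problems);
```

Returning a list of problems rather than throwing on the first one lets a caller report every violation at once. How these values map onto actual request options depends on the provider, so check the current AI Gateway and Kling documentation for the real parameter names.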