FLUX.2 [klein] 9B

FLUX.2 [klein] 9B is Black Forest Labs's quality-focused small image model in the Klein family. It pairs a 9B flow model with an 8B Qwen3 text embedder and distills inference to four steps for sub-half-second latency.

Image Gen

index.ts

import { experimental_generateImage as generateImage } from 'ai';

const result = await generateImage({
  model: 'bfl/flux-2-klein-9b',
  prompt: 'A red balloon on a wooden table.'
});


Playground

Try out FLUX.2 [klein] 9B by Black Forest Labs. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.


Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider	Price	Release Date	Legal
Black Forest Labs	$0.015/MP	01/15/2026	Terms • Privacy
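Per-megapixel pricing can be turned into a rough per-image estimate. The sketch below uses the $0.015/MP figure listed for this model; the helper names are hypothetical, and it assumes one megapixel means 1,000,000 pixels with linear, unrounded billing, which the provider's actual rules may not match.

```typescript
// Price listed on this page for FLUX.2 [klein] 9B via Black Forest Labs.
const PRICE_PER_MEGAPIXEL_USD = 0.015;

// Assumption: 1 MP = 1,000,000 pixels, billed linearly with no rounding.
const PIXELS_PER_MEGAPIXEL = 1_000_000;

// Hypothetical helper: estimated cost in USD of one generated image.
function estimateImageCostUsd(
  widthPx: number,
  heightPx: number,
  pricePerMegapixelUsd: number = PRICE_PER_MEGAPIXEL_USD,
): number {
  if (!(widthPx > 0) || !(heightPx > 0)) {
    throw new RangeError('width and height must be positive numbers');
  }
  const megapixels = (widthPx * heightPx) / PIXELS_PER_MEGAPIXEL;
  return megapixels * pricePerMegapixelUsd;
}

// Hypothetical helper: whole images of a given size that fit in a budget.
function imagesPerBudget(budgetUsd: number, widthPx: number, heightPx: number): number {
  return Math.floor(budgetUsd / estimateImageCostUsd(widthPx, heightPx));
}

const cost = estimateImageCostUsd(1024, 1024);
console.log(cost.toFixed(4)); // 0.0157
```

Under the same assumptions, `imagesPerBudget(5, 1000, 1000)` gives a rough sense of how far a $5 credit goes at one megapixel per image.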

More models by Black Forest Labs

Model	Price	Release Date	Context
bfl/flux-2-klein-4b	$0.014/MP	01/15/2026
bfl/flux-2-max	$0.07/MP	12/16/2025	67K
bfl/flux-2-flex	$0.06/MP	11/25/2025
bfl/flux-kontext-max	$0.08/img	05/29/2025	512
bfl/flux-kontext-pro	$0.04/img	05/29/2025	512
bfl/flux-pro-1.0-fill	$0.05/img	10/01/2024

About FLUX.2 [klein] 9B

FLUX.2 [klein] 9B, released on January 15, 2026, sits at the quality-focused end of Black Forest Labs's Klein family. Black Forest Labs describes it as defining the Pareto frontier for quality versus latency across text-to-image, single-reference editing, and multi-reference generation.

The architecture starts with the text encoder. Klein 9B pairs the flow model with an 8B Qwen3 embedder, a large-language-model-class text encoder that brings substantially better prompt understanding than smaller CLIP-based encoders. This Qwen3 integration is the primary reason Klein 9B understands complex, multi-clause prompts more accurately than competing small models.

The 9B flow model is step-distilled to four inference steps. Step distillation compresses the full diffusion trajectory into a small number of high-quality denoising passes. The result is a model that generates images in under half a second while retaining the quality of a much larger, slower undistilled counterpart.
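To make the step count concrete, here is a toy sketch of few-step sampling for a flow model: start from noise and take a fixed number of Euler updates along a velocity field. This is not Black Forest Labs's sampler. The `VelocityField` type, `sampleWithEuler`, and the straight-line `toyVelocity` are all illustrative assumptions standing in for the learned network.

```typescript
// Velocity field: (current sample, time t in [0, 1)) -> direction of travel.
// In a real flow model this would be one forward pass of the network.
type VelocityField = (x: number[], t: number) => number[];

// Integrate dx/dt = v(x, t) from t = 0 to t = 1 using `steps` Euler updates.
function sampleWithEuler(noise: number[], velocity: VelocityField, steps: number): number[] {
  if (!Number.isInteger(steps) || steps < 1) {
    throw new RangeError('steps must be a positive integer');
  }
  const dt = 1 / steps;
  let x = [...noise];
  for (let i = 0; i < steps; i++) {
    const t = i * dt;
    const v = velocity(x, t); // one "network evaluation" per step
    x = x.map((xi, j) => xi + dt * v[j]);
  }
  return x;
}

// Toy stand-in for a learned network: always points straight at a fixed target.
const target = [0.2, 0.8, 0.5];
const toyVelocity: VelocityField = (x, t) =>
  x.map((xi, j) => (target[j] - xi) / (1 - t));

// Four updates, mirroring the four-step setting described above.
const fourStep = sampleWithEuler([0, 0, 0], toyVelocity, 4);
console.log(fourStep);
```

Because the toy field traces a straight line, any step count lands on the target. An undistilled model generally follows a curved trajectory and needs many more updates for comparable accuracy. Step distillation trains the network so that a handful of updates is enough.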

Klein 9B handles text-to-image synthesis, image editing with a single reference, and multi-reference generation (combining multiple input images to blend concepts) within the same weights.

What To Consider When Choosing a Provider

Configuration: Klein 9B uses the FLUX Non-Commercial License (NCL). If you're building a commercial production application, consider Klein 4B (Apache 2.0) or FLUX.2 Pro instead. See the Providers section for current API pricing.
Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
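The authentication point can be sketched as a small credential lookup: prefer an API key, fall back to an OIDC token, and fail early if neither is present. The helper is hypothetical. The environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN` and the precedence order are assumptions, not something this page confirms; check the gateway documentation for the names your setup uses.

```typescript
type GatewayCredential =
  | { kind: 'api-key'; token: string }
  | { kind: 'oidc'; token: string };

// Hypothetical helper. Variable names and precedence are assumptions.
function resolveGatewayCredential(
  env: Record<string, string | undefined>,
): GatewayCredential {
  const apiKey = env['AI_GATEWAY_API_KEY']?.trim();
  if (apiKey) {
    return { kind: 'api-key', token: apiKey };
  }
  const oidcToken = env['VERCEL_OIDC_TOKEN']?.trim();
  if (oidcToken) {
    return { kind: 'oidc', token: oidcToken };
  }
  throw new Error('No AI Gateway credential found: set an API key or an OIDC token');
}

// Example with an in-memory environment rather than real secrets.
const credential = resolveGatewayCredential({ AI_GATEWAY_API_KEY: 'example-key' });
console.log(credential.kind); // api-key
```

Passing the environment in as a parameter keeps the lookup testable and avoids reading real secrets in examples.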

When to Use FLUX.2 [klein] 9B

Best for

Quality-latency sweet spot: Klein 9B fits when Klein 4B quality falls short but FLUX.2 Pro latency is too high
Multi-reference generation: Workflows that combine multiple concept images into a coherent output at interactive speed
Agentic visual systems: Sub-second latency enables rapid iteration loops that generate and evaluate images many times per session
Prompt-heavy workflows: Complex, descriptive instructions benefit from the Qwen3 text embedder's language understanding

Consider alternatives when

Unrestricted commercial use: Klein 9B uses the FLUX Non-Commercial License, so choose Klein 4B (Apache 2.0) or FLUX.2 Pro instead
Absolute maximum quality: FLUX.2 Pro and FLUX.2 Max deliver higher fidelity when latency is not the priority
Masked region inpainting: FLUX.1 Fill Pro is specialized for mask-based fill rather than generative or instruction-based editing
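The guidance above can be condensed into a small decision helper. This is a sketch, not an official selection rule: the `ImageWorkload` shape, the function name, and the order of the checks are assumptions. The model slugs come from the model list on this page; mapping FLUX.1 Fill Pro to `bfl/flux-pro-1.0-fill` is also an assumption, and FLUX.2 Pro is omitted because its slug is not listed here.

```typescript
// Hypothetical description of what a workload needs.
interface ImageWorkload {
  commercialUse: boolean;
  needsMaskInpainting: boolean;
  priority: 'latency' | 'quality';
}

// Hypothetical helper: suggest a model slug from the guidance above.
function suggestModel(workload: ImageWorkload): string {
  // Mask-based fill is a specialized task, so it is checked first.
  if (workload.needsMaskInpainting) {
    return 'bfl/flux-pro-1.0-fill';
  }
  // When fidelity matters more than latency, prefer the larger model.
  if (workload.priority === 'quality') {
    return 'bfl/flux-2-max';
  }
  // Klein 9B is under a non-commercial license; Klein 4B is Apache 2.0.
  if (workload.commercialUse) {
    return 'bfl/flux-2-klein-4b';
  }
  return 'bfl/flux-2-klein-9b';
}

const suggestion = suggestModel({
  commercialUse: false,
  needsMaskInpainting: false,
  priority: 'latency',
});
console.log(suggestion); // bfl/flux-2-klein-9b
```

A real selection would also weigh price and licensing terms for each model, which this sketch leaves out.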

Conclusion

FLUX.2 [klein] 9B combines an 8B Qwen3 text embedder, four-step distillation, and multi-reference support in a model that responds in under half a second. For teams building interactive visual applications or agent loops that iterate rapidly, Klein 9B delivers multi-reference editing at latencies many earlier fast image models did not match.
