
FLUX.2 [klein] 9B

FLUX.2 [klein] 9B is Black Forest Labs's quality-focused small image model in the Klein family. It pairs a 9B flow model with an 8B Qwen3 text embedder and distills inference to four steps for sub-half-second latency.

Image Gen
index.ts
import { experimental_generateImage as generateImage } from 'ai';

const result = await generateImage({
  model: 'bfl/flux-2-klein-9b',
  prompt: 'A red balloon on a wooden table.',
});

Playground

Try out FLUX.2 [klein] 9B by Black Forest Labs. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.


Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider: Black Forest Labs
Release Date: 01/15/2026

The remaining columns (Context, Latency, Throughput, Input, Output, Cache, Web Search, Per Query, Capabilities, ZDR, No Training) show no values for this provider.


About FLUX.2 [klein] 9B

FLUX.2 [klein] 9B sits at the quality-focused end of Black Forest Labs's Klein family, released on January 15, 2026. Black Forest Labs describes it as setting the Pareto frontier for quality versus latency across text-to-image, single-reference editing, and multi-reference generation.

The architecture starts with the text encoder. Klein 9B pairs the flow model with an 8B Qwen3 embedder, a text encoder in the large-language-model class that brings substantially better prompt understanding than smaller CLIP-based encoders. This Qwen3 integration is the primary reason Klein 9B follows complex, multi-clause prompts more accurately than competing small models.

The 9B flow model is step-distilled to four inference steps. Step distillation compresses the full diffusion trajectory into a small number of high-quality denoising passes, so the parameter count stays the same while the number of passes drops. The result is a model that generates images in under half a second while aiming to retain the quality of its slower, many-step undistilled counterpart.
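
To make "four inference steps" concrete, the sketch below integrates a toy flow from noise (t = 1) to a sample (t = 0) with a fixed number of Euler updates. It is an illustration under stated assumptions, not Black Forest Labs's sampler: `toyVelocity`, `sampleWithEuler`, and the three-element latent are hypothetical stand-ins for the real 9B network and its latent space.

```typescript
// Toy illustration only: a real flow model predicts velocity with a neural network.
type Velocity = (x: number[], t: number) => number[];

// Hypothetical stand-in for the network: the exact velocity of a straight path
// from `target` (at t = 0) to noise (at t = 1), which is v = (x - target) / t.
function toyVelocity(target: number[]): Velocity {
  return (x, t) => x.map((xi, i) => (xi - target[i]) / t);
}

// Integrate from t = 1 (noise) down to t = 0 (sample) using `steps` Euler updates.
// Each update costs one velocity evaluation, i.e. one forward pass in a real model.
function sampleWithEuler(
  velocity: Velocity,
  noise: number[],
  steps: number,
): { sample: number[]; evaluations: number } {
  const dt = 1 / steps;
  let x = [...noise];
  let evaluations = 0;
  for (let i = 0; i < steps; i++) {
    const t = 1 - i * dt; // current time, always > 0 inside the loop
    const v = velocity(x, t);
    evaluations++;
    x = x.map((xi, j) => xi - dt * v[j]); // step toward t = 0
  }
  return { sample: x, evaluations };
}

const target = [0.2, -0.5, 0.9];
const noise = [1.3, 0.4, -2.1];
const fourStep = sampleWithEuler(toyVelocity(target), noise, 4);
console.log(fourStep.sample, fourStep.evaluations);
```

Because the toy path is perfectly straight, four Euler updates already land on the target. Roughly speaking, step distillation trains a real model so that its few-step output stays close to what a many-step sampler would produce.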

Klein 9B handles text-to-image synthesis, image editing with a single reference, and multi-reference generation (combining multiple input images to blend concepts) within the same weights.

What To Consider When Choosing a Provider

  • Configuration: Klein 9B uses the FLUX Non-Commercial License (NCL). If you're building a commercial production application, consider Klein 4B (Apache 2.0) or FLUX.2 Pro instead.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
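
As a rough sketch of the authentication point above, the helper below picks whichever credential is available in an environment map. The variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN`, the API-key-first precedence, and the helper `resolveGatewayCredential` itself are assumptions made for illustration; check the AI Gateway documentation for the names your setup actually uses.

```typescript
// Which kind of credential was found, plus its value.
type GatewayCredential =
  | { kind: 'api-key'; token: string }
  | { kind: 'oidc'; token: string };

// Hypothetical helper: look for an API key first, then fall back to an OIDC token.
// Both environment variable names are assumptions, not confirmed by this page.
function resolveGatewayCredential(
  env: Record<string, string | undefined>,
): GatewayCredential {
  const apiKey = env['AI_GATEWAY_API_KEY']?.trim();
  if (apiKey) return { kind: 'api-key', token: apiKey };

  const oidcToken = env['VERCEL_OIDC_TOKEN']?.trim();
  if (oidcToken) return { kind: 'oidc', token: oidcToken };

  throw new Error('No AI Gateway credential found: set an API key or provide an OIDC token.');
}

const credential = resolveGatewayCredential({ AI_GATEWAY_API_KEY: 'example-key' });
console.log(credential.kind);
```

In a Node.js application you would typically pass `process.env` instead of the literal object. Either way, no provider-specific credentials appear in the code, which matches the point that you do not manage them directly.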

When to Use FLUX.2 [klein] 9B

Best For

  • Quality-latency sweet spot: Klein 9B fits when Klein 4B quality falls short but FLUX.2 Pro latency is too high
  • Multi-reference generation: Workflows that combine multiple concept images into a coherent output at interactive speed
  • Agentic visual systems: Sub-second latency enables rapid iteration loops that generate and evaluate images many times per session
  • Prompt-heavy workflows: Complex, descriptive instructions benefit from the Qwen3 text embedder's language understanding
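
The agentic use case above can be sketched as a generate-and-evaluate loop. This is a minimal illustration, not a prescribed pattern: `generateUntilGoodEnough`, the `GenerateFn` and `ScoreFn` types, and the stub generator and scorer are all hypothetical. In practice `generate` would wrap a call such as `generateImage` with `model: 'bfl/flux-2-klein-9b'`, and `score` would be whatever evaluator your agent uses.

```typescript
// Injected dependencies, so the loop is independent of any particular SDK.
type GenerateFn = (prompt: string) => Promise<Uint8Array>;
type ScoreFn = (image: Uint8Array) => Promise<number>;

interface LoopResult {
  image: Uint8Array; // best image seen so far
  score: number;     // its score
  attempts: number;  // total generations performed
}

// Generate repeatedly until a score meets `threshold` or `maxAttempts` is reached,
// keeping the best-scoring image along the way.
async function generateUntilGoodEnough(
  generate: GenerateFn,
  score: ScoreFn,
  prompt: string,
  threshold: number,
  maxAttempts: number,
): Promise<LoopResult> {
  if (maxAttempts < 1) throw new Error('maxAttempts must be at least 1');
  let bestImage = await generate(prompt);
  let bestScore = await score(bestImage);
  let attempts = 1;
  while (bestScore < threshold && attempts < maxAttempts) {
    const candidate = await generate(prompt);
    const candidateScore = await score(candidate);
    attempts++;
    if (candidateScore > bestScore) {
      bestImage = candidate;
      bestScore = candidateScore;
    }
  }
  return { image: bestImage, score: bestScore, attempts };
}

// Stubs for demonstration: each "image" is one byte, and its score is byte / 10.
let stubCounter = 0;
const stubGenerate: GenerateFn = async () => Uint8Array.of(++stubCounter);
const stubScore: ScoreFn = async (image) => image[0] / 10;

const loopDone = generateUntilGoodEnough(
  stubGenerate,
  stubScore,
  'A red balloon on a wooden table.',
  0.3,
  5,
);
loopDone.then((r) => console.log(r.attempts, r.score));
```

The attempt cap matters because every iteration is a billed generation. Lower per-image latency only shortens each pass through the loop.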

Consider Alternatives When

  • Unrestricted commercial use: Klein 9B uses the FLUX Non-Commercial License, so choose Klein 4B (Apache 2.0) or FLUX.2 Pro instead
  • Absolute maximum quality: FLUX.2 Pro and FLUX.2 Max deliver higher fidelity when latency is not the priority
  • Masked region inpainting: FLUX.1 Fill Pro is specialized for mask-based fill rather than generative or instruction-based editing
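
The guidance in the two lists above can be condensed into a small decision helper. This is one reading of that guidance rather than an official selection rule: `Requirements`, `suggestModel`, and the order of the checks are assumptions, and the function returns display names instead of API model slugs because this page does not list the slugs for the other models.

```typescript
interface Requirements {
  commercialUse: boolean;       // will the output ship in a commercial product?
  needsMaskInpainting: boolean; // is mask-based fill required?
  latencySensitive: boolean;    // is interactive speed a priority?
}

// Hypothetical helper that encodes the "Best For" and "Consider Alternatives When" lists.
function suggestModel(req: Requirements): string {
  // Mask-based fill is a specialized task handled by a dedicated model.
  if (req.needsMaskInpainting) return 'FLUX.1 Fill Pro';
  // Klein 9B is under the FLUX Non-Commercial License, so commercial use rules it out.
  if (req.commercialUse) return req.latencySensitive ? 'Klein 4B' : 'FLUX.2 Pro';
  // Non-commercial: Klein 9B when speed matters, otherwise a higher-fidelity model.
  return req.latencySensitive ? 'FLUX.2 [klein] 9B' : 'FLUX.2 Pro';
}

const suggestion = suggestModel({
  commercialUse: false,
  needsMaskInpainting: false,
  latencySensitive: true,
});
console.log(suggestion);
```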

Conclusion

FLUX.2 [klein] 9B combines an 8B Qwen3 text embedder, four-step distillation, and multi-reference support in a model that responds in under half a second. For teams building interactive visual applications or agent loops that iterate rapidly, Klein 9B delivers multi-reference editing at latencies that many earlier fast image models could not reach. Its Non-Commercial License remains the main constraint to check before adopting it.