FLUX.2 [klein] 9B

bfl/flux-2-klein-9b

FLUX.2 [klein] 9B is Black Forest Labs's quality-focused small image model in the Klein family. It pairs a 9B flow model with an 8B Qwen3 text embedder and distills inference to four steps for sub-half-second latency.

Image Gen
index.ts
import { experimental_generateImage as generateImage } from 'ai';

// The model ID string is resolved through AI Gateway.
const result = await generateImage({
  model: 'bfl/flux-2-klein-9b',
  prompt: 'A red balloon on a wooden table.',
});

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
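
As a rough illustration of the two credential types, an application might check at startup which one is available. This is a minimal sketch under stated assumptions: the environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN` are assumed rather than taken from this page, `pickCredential` is a hypothetical helper, and preferring the API key over the OIDC token is an arbitrary choice made for the example.

```typescript
// Hypothetical helper: not part of the AI SDK or AI Gateway.
type GatewayCredential =
  | { kind: "api-key"; value: string }
  | { kind: "oidc"; value: string };

// The variable names below are assumptions; confirm them in the AI Gateway docs.
function pickCredential(
  env: Record<string, string | undefined>,
): GatewayCredential {
  const apiKey = env["AI_GATEWAY_API_KEY"];
  if (apiKey) return { kind: "api-key", value: apiKey };

  const oidcToken = env["VERCEL_OIDC_TOKEN"];
  if (oidcToken) return { kind: "oidc", value: oidcToken };

  // Fail fast instead of letting the first image request fail later.
  throw new Error("No AI Gateway credential found in the environment");
}
```

Calling `pickCredential(process.env)` once at startup would surface a missing credential before any image request is made.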

Klein 9B uses the FLUX Non-Commercial License (NCL). If you're building a commercial production application, consider Klein 4B (Apache 2.0) or FLUX.2 Pro instead. Current API pricing is shown on this page.

When to Use FLUX.2 [klein] 9B

Best For

  • Quality-latency sweet spot:

    Klein 9B fits when Klein 4B quality falls short but FLUX.2 Pro latency is too high

  • Multi-reference generation:

    Workflows that combine multiple concept images into a coherent output at interactive speed

  • Agentic visual systems:

    Sub-second latency enables rapid iteration loops that generate and evaluate images many times per session

  • Prompt-heavy workflows:

    Complex, descriptive instructions benefit from the Qwen3 text embedder's language understanding
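
The agentic iteration loop mentioned above can be sketched as a generate-and-score loop. This is a minimal sketch, not an AI Gateway or Black Forest Labs API: `iterateUntilGoodEnough`, `Generate`, and `Score` are hypothetical names, and the caller supplies both functions (for example, a wrapper around `generateImage` and an evaluator of its choice).

```typescript
// Hypothetical function types: the caller decides how images are produced and judged.
type Generate = (prompt: string) => Promise<Uint8Array>;
type Score = (image: Uint8Array) => Promise<number>;

interface LoopResult {
  image: Uint8Array;
  score: number;
  attempts: number; // total generations performed
}

// Generate up to `maxAttempts` candidates, keep the best-scoring one,
// and stop early once a candidate reaches `targetScore`.
async function iterateUntilGoodEnough(
  prompt: string,
  generate: Generate,
  score: Score,
  targetScore: number,
  maxAttempts: number,
): Promise<LoopResult> {
  if (maxAttempts < 1) throw new RangeError("maxAttempts must be at least 1");

  let best: { image: Uint8Array; score: number } | undefined;
  let attempts = 0;

  while (attempts < maxAttempts) {
    attempts++;
    const image = await generate(prompt);
    const candidate = { image, score: await score(image) };
    if (best === undefined || candidate.score > best.score) best = candidate;
    if (candidate.score >= targetScore) break;
  }

  return { ...best!, attempts };
}
```

Capping the loop with `maxAttempts` keeps cost bounded even when no candidate reaches the target score.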

Consider Alternatives When

  • Unrestricted commercial use:

    Klein 9B uses the FLUX Non-Commercial License, so choose Klein 4B (Apache 2.0) or FLUX.2 Pro instead

  • Absolute maximum quality:

    FLUX.2 Pro and FLUX.2 Max deliver higher fidelity when latency is not the priority

  • Masked region inpainting:

    FLUX.1 Fill Pro is specialized for mask-based fill rather than generative or instruction-based editing

Conclusion

FLUX.2 [klein] 9B combines an 8B Qwen3 text embedder, four-step distillation, and multi-reference support in a model that responds in under half a second. For teams building interactive visual applications or agent loops that iterate rapidly, Klein 9B delivers multi-reference editing at latencies that many earlier fast image models did not match.

FAQ

Qwen3 is a large-language-model-class text encoder that understands complex, multi-clause natural language prompts at a level that smaller CLIP-based encoders cannot match. This gives Klein 9B substantially better prompt adherence, particularly for detailed or compositionally complex image descriptions.

Step distillation compresses the full denoising trajectory of a diffusion model into fewer inference steps without proportional quality loss. Klein 9B uses four steps, compared to the 20-50 steps a standard diffusion model might require. This enables sub-0.5-second generation while retaining high output quality.
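
The step counts above translate into a simple ratio. The sketch below assumes every denoising step costs the same and ignores text encoding and image decoding, so it is only a rough guide to relative compute, not a latency prediction. `stepReduction` is a hypothetical helper written for this example.

```typescript
// Ratio of denoising steps between a baseline sampler and a distilled one.
// Assumption: each step costs the same; encoding and decoding overhead is ignored.
function stepReduction(baselineSteps: number, distilledSteps: number): number {
  if (baselineSteps <= 0 || distilledSteps <= 0) {
    throw new RangeError("step counts must be positive");
  }
  return baselineSteps / distilledSteps;
}

const kleinSteps = 4; // four-step distillation, as stated above
const low = stepReduction(20, kleinSteps); // 5
const high = stepReduction(50, kleinSteps); // 12.5

console.log(`${low}x to ${high}x fewer denoising steps`);
// prints "5x to 12.5x fewer denoising steps"
```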

Yes, Klein 9B supports multi-reference generation. Multiple input images can be provided to blend concepts and guide composition at sub-second speed, a capability Black Forest Labs describes as uncommon for a model operating at this latency.

Klein 9B uses the FLUX Non-Commercial License (NCL). Black Forest Labs renamed it from the FLUX [dev] Non-Commercial License and states there were no material changes to the license terms. Commercial production applications require Klein 4B (Apache 2.0) or a commercial Black Forest Labs offering.

In image quality benchmarks, Klein 9B matches or exceeds models 5x its size, a result attributed to the combination of the Qwen3 embedder, the 9B flow architecture, and step distillation training.

Current pricing is shown on this page. AI Gateway routes across providers, and rates may vary by provider.