FLUX.2 [klein] 9B
FLUX.2 [klein] 9B is Black Forest Labs's quality-focused small image model in the Klein family. It pairs a 9B flow model with an 8B Qwen3 text embedder and distills inference to four steps for sub-half-second latency.
import { experimental_generateImage as generateImage } from 'ai';
const result = await generateImage({
  model: 'bfl/flux-2-klein-9b',
  prompt: 'A red balloon on a wooden table.',
});
Playground
Try out FLUX.2 [klein] 9B by Black Forest Labs. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
About FLUX.2 [klein] 9B
FLUX.2 [klein] 9B sits at the quality-focused end of Black Forest Labs's Klein family. Black Forest Labs describes it as setting the Pareto frontier for quality versus latency across text-to-image, single-reference editing, and multi-reference generation.
The architecture starts with the text encoder. Klein 9B pairs the flow model with an 8B Qwen3 embedder, a large-language-model-class text encoder that brings substantially better prompt understanding than smaller CLIP-based encoders. This Qwen3 integration is the primary reason Klein 9B understands complex, multi-clause prompts more accurately than competing small models.
The 9B flow model is step-distilled to four inference steps. Step distillation compresses the full diffusion trajectory into a small number of high-quality denoising passes. The result is a model that generates images in under half a second while retaining the quality of a much larger, slower undistilled counterpart.
Klein 9B handles text-to-image synthesis, image editing with a single reference, and multi-reference generation (combining multiple input images to blend concepts) within the same weights.
What To Consider When Choosing a Provider
- Licensing: Klein 9B uses the FLUX Non-Commercial License (NCL). If you're building a commercial production application, consider Klein 4B (Apache 2.0) or FLUX.2 Pro instead. Current API pricing is shown on this page.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
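The authentication point above can be illustrated with a small credential lookup. This is a minimal sketch, not the gateway's own logic: the helper name `resolveGatewayCredential` is hypothetical, and the environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN` are assumptions that should be checked against the AI Gateway documentation. Preferring the API key over the OIDC token is an arbitrary choice made for this example.

```typescript
// Hypothetical helper: the function name and the environment variable names
// are assumptions for illustration, not taken from this page.
type GatewayCredential =
  | { kind: 'api-key'; token: string }
  | { kind: 'oidc'; token: string };

function resolveGatewayCredential(
  env: Record<string, string | undefined>,
): GatewayCredential {
  // Assumed variable name for a gateway API key.
  const apiKey = env['AI_GATEWAY_API_KEY']?.trim();
  if (apiKey) return { kind: 'api-key', token: apiKey };

  // Assumed variable name for an OIDC token.
  const oidcToken = env['VERCEL_OIDC_TOKEN']?.trim();
  if (oidcToken) return { kind: 'oidc', token: oidcToken };

  throw new Error('No AI Gateway credential found: set an API key or an OIDC token.');
}
```

In a Node.js process this would be called as `resolveGatewayCredential(process.env)` at startup, so a missing credential fails fast instead of surfacing on the first image request.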
When to Use FLUX.2 [klein] 9B
Best For
- Quality-latency sweet spot: Klein 9B fits when Klein 4B quality falls short but FLUX.2 Pro latency is too high
- Multi-reference generation: Workflows that combine multiple concept images into a coherent output at interactive speed
- Agentic visual systems: Sub-second latency enables rapid iteration loops that generate and evaluate images many times per session
- Prompt-heavy workflows: Complex, descriptive instructions benefit from the Qwen3 text embedder's language understanding
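The generate-and-evaluate loop mentioned under agentic visual systems can be sketched as follows. This is an illustrative outline under stated assumptions: `bestOfN`, `generate`, and `score` are hypothetical names, where `generate` stands in for an image model call and `score` for whatever evaluator the agent uses. Nothing here is specific to Klein 9B beyond the idea that low latency makes several iterations affordable.

```typescript
// Both callbacks are hypothetical stand-ins supplied by the caller.
type Candidate<T> = { prompt: string; image: T; score: number };

async function bestOfN<T>(
  prompts: string[],
  generate: (prompt: string) => Promise<T>,
  score: (image: T) => Promise<number>,
  goodEnough: number,
): Promise<Candidate<T> | undefined> {
  let best: Candidate<T> | undefined;
  for (const prompt of prompts) {
    const image = await generate(prompt);
    const candidate: Candidate<T> = { prompt, image, score: await score(image) };
    // Keep the highest-scoring candidate seen so far.
    if (best === undefined || candidate.score > best.score) best = candidate;
    // Stop early once a candidate clears the threshold.
    if (candidate.score >= goodEnough) break;
  }
  return best;
}
```

The loop runs sequentially so that it can stop as soon as one result is acceptable. Generating all candidates in parallel would trade extra requests for lower wall-clock time.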
Consider Alternatives When
- Unrestricted commercial use: Klein 9B uses the FLUX Non-Commercial License, so choose Klein 4B (Apache 2.0) or FLUX.2 Pro instead
- Absolute maximum quality: FLUX.2 Pro and FLUX.2 Max deliver higher fidelity when latency is not the priority
- Masked region inpainting: FLUX.1 Fill Pro is specialized for mask-based fill rather than generative or instruction-based editing
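The guidance in the two lists above can be condensed into a small decision helper. This is a sketch only: the `Needs` fields and the function name `suggestModel` are hypothetical, the returned strings are model names as written on this page rather than API slugs, and the order of the checks is a choice made for this example.

```typescript
// Illustrative only: field names are assumptions; the returned labels are
// model names from this page, not API slugs.
type Needs = {
  commercialUse: boolean;
  maskedInpainting: boolean;
  maxQualityOverLatency: boolean;
};

function suggestModel(needs: Needs): string {
  // Mask-based fill is a specialized task.
  if (needs.maskedInpainting) return 'FLUX.1 Fill Pro';
  // Highest fidelity when latency is not the priority.
  if (needs.maxQualityOverLatency) {
    return needs.commercialUse ? 'FLUX.2 Pro' : 'FLUX.2 Pro or FLUX.2 Max';
  }
  // Klein 9B ships under a non-commercial license.
  if (needs.commercialUse) return 'Klein 4B or FLUX.2 Pro';
  return 'FLUX.2 [klein] 9B';
}
```

For example, a non-commercial prototype that needs neither masking nor maximum fidelity resolves to FLUX.2 [klein] 9B, while the same project with `commercialUse: true` resolves to Klein 4B or FLUX.2 Pro.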
Conclusion
FLUX.2 [klein] 9B combines an 8B Qwen3 text embedder, four-step distillation, and multi-reference support in a model that responds in under half a second. For teams building interactive visual applications or agent loops that iterate rapidly, Klein 9B delivers multi-reference editing at latencies that many earlier fast image models did not match.
Frequently Asked Questions
Why does Klein 9B use a Qwen3 text embedder instead of a CLIP-based encoder?
Qwen3 is a large-language-model-class text encoder that understands complex, multi-clause natural language prompts at a level that smaller CLIP-based encoders cannot match. This gives Klein 9B substantially better prompt adherence, particularly for detailed or compositionally complex image descriptions.
What is four-step distillation and why does it matter?
Step distillation compresses the full denoising trajectory of a diffusion model into fewer inference steps without proportional quality loss. Klein 9B uses four steps, compared to the 20-50 steps a standard diffusion model might require. This enables sub-0.5-second generation while retaining high output quality.
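The step counts above translate into a simple ratio of denoising passes. The snippet below is a back-of-the-envelope sketch: the function name `stepReduction` is hypothetical, and it assumes every denoising pass costs roughly the same. It ignores text encoding, image decoding, and network time, so it bounds the reduction in sampling work rather than predicting end-to-end latency.

```typescript
// Ratio of denoising passes between a baseline sampler and a distilled one.
// Assumes equal cost per pass, which is a simplification.
function stepReduction(baselineSteps: number, distilledSteps: number): number {
  if (!(baselineSteps > 0) || !(distilledSteps > 0)) {
    throw new RangeError('step counts must be positive');
  }
  return baselineSteps / distilledSteps;
}

const lowEnd = stepReduction(20, 4);  // 5
const highEnd = stepReduction(50, 4); // 12.5
```

Under that assumption, four steps means 5x to 12.5x fewer denoising passes than a 20 to 50 step schedule.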
Does Klein 9B support multi-reference generation?
Yes. Multiple input images can be provided to blend concepts and guide composition at sub-second speed, a capability Black Forest Labs describes as uncommon for a model operating at this latency.
What license covers FLUX.2 [klein] 9B?
Klein 9B uses the FLUX Non-Commercial License (NCL). Black Forest Labs renamed it from the FLUX [dev] Non-Commercial License and states there were no material changes to the license terms. Commercial production applications require Klein 4B (Apache 2.0) or a commercial Black Forest Labs offering.
How does Klein 9B's quality compare to much larger models?
Klein 9B matches or exceeds models 5x its size in image quality benchmarks, a result attributed to the combination of the Qwen3 embedder, the 9B flow architecture, and step distillation training.
What does FLUX.2 [klein] 9B cost per image?
Current pricing is shown on this page. AI Gateway routes across providers, and rates may vary by provider.