FLUX.2 [klein] 9B
FLUX.2 [klein] 9B is Black Forest Labs's quality-focused small image model in the Klein family. It pairs a 9B flow model with an 8B Qwen3 text embedder and distills inference to four steps for sub-half-second latency.
import { experimental_generateImage as generateImage } from 'ai';
const result = await generateImage({
  model: 'bfl/flux-2-klein-9b',
  prompt: 'A red balloon on a wooden table.',
});
Playground
Try out FLUX.2 [klein] 9B by Black Forest Labs. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.
Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
About FLUX.2 [klein] 9B
FLUX.2 [klein] 9B sits at the quality-focused end of Black Forest Labs's Klein family, released on January 15, 2026. Black Forest Labs describes it as setting the Pareto frontier for quality versus latency across text-to-image, single-reference editing, and multi-reference generation.
The architecture starts with the text encoder. Klein 9B pairs the flow model with an 8B Qwen3 embedder, a large-language-model-class text encoder that brings substantially better prompt understanding than smaller CLIP-based encoders. This Qwen3 integration is the primary reason Klein 9B understands complex, multi-clause prompts more accurately than competing small models.
The 9B flow model is step-distilled to four inference steps. Step distillation compresses the full diffusion trajectory into a small number of high-quality denoising passes. The result is a model that generates images in under half a second while retaining the quality of a much larger, slower undistilled counterpart.
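To make the four-step idea concrete, here is a minimal toy sketch of few-step sampling for a flow model. It is not Black Forest Labs's sampler: `sampleWithEuler` and `straightPathField` are hypothetical names, and the closed-form velocity field stands in for a learned network. It only illustrates why a model whose trajectories are close to straight lines can be integrated accurately in a handful of Euler steps, which is the property step distillation aims for.

```typescript
// Toy illustration of few-step sampling for a flow model.
// Assumption: `velocity` stands in for a learned network; here it is a
// closed-form field whose trajectories are straight lines toward `target`.
type Vec = number[];
type VelocityField = (x: Vec, t: number) => Vec;

// Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-size Euler steps.
function sampleWithEuler(x0: Vec, velocity: VelocityField, steps: number): Vec {
  if (!Number.isInteger(steps) || steps < 1) {
    throw new RangeError('steps must be a positive integer');
  }
  const dt = 1 / steps;
  let x = x0.slice();
  for (let i = 0; i < steps; i++) {
    const t = i * dt; // time at the start of this step, always < 1
    const v = velocity(x, t);
    x = x.map((xi, k) => xi + dt * v[k]);
  }
  return x;
}

// Hypothetical straight-path field: points at `target`, scaled so the
// remaining distance is covered in the remaining time (1 - t).
function straightPathField(target: Vec): VelocityField {
  return (x, t) => x.map((xi, k) => (target[k] - xi) / (1 - t));
}

const noise: Vec = [0.8, -1.2, 0.3];
const target: Vec = [1, 2, 3];

// Four Euler steps, mirroring the four inference steps described above.
const fourStepSample = sampleWithEuler(noise, straightPathField(target), 4);
console.log(fourStepSample);
```

Because the toy field follows a straight path, four steps land on the target up to floating-point error. A real model's trajectories are curved, which is why an undistilled sampler typically needs many more steps.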
Klein 9B handles text-to-image synthesis, image editing with a single reference, and multi-reference generation (combining multiple input images to blend concepts) within the same weights.
What to Consider When Choosing a Provider
- Configuration: Klein 9B uses the FLUX Non-Commercial License (NCL). If you're building a commercial production application, consider Klein 4B (Apache 2.0) or FLUX.2 Pro instead.
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
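The authentication point above can be sketched as a small credential resolver. This is an illustration only: `resolveGatewayAuth` is a hypothetical helper, the environment variable names `AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN` are assumptions to verify against your own deployment, and preferring the API key over the OIDC token is a design choice made here, not documented gateway behavior.

```typescript
// Hypothetical helper: pick an AI Gateway credential from an env-like object.
// Assumption: the variable names below match your deployment; verify them.
type GatewayAuth =
  | { kind: 'api-key'; token: string }
  | { kind: 'oidc'; token: string };

type Env = Record<string, string | undefined>;

function resolveGatewayAuth(env: Env): GatewayAuth {
  // Assumed ordering: an explicit API key wins over an ambient OIDC token.
  const apiKey = env['AI_GATEWAY_API_KEY']?.trim();
  if (apiKey) return { kind: 'api-key', token: apiKey };

  const oidcToken = env['VERCEL_OIDC_TOKEN']?.trim();
  if (oidcToken) return { kind: 'oidc', token: oidcToken };

  throw new Error('No AI Gateway credential found: set an API key or an OIDC token');
}

// Example with a literal object; in Node you would typically pass process.env.
const auth = resolveGatewayAuth({ AI_GATEWAY_API_KEY: 'example-key' });
console.log(auth.kind);
```

Either way, only the gateway credential is handled by your code; no provider-specific key appears anywhere in the sketch.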
When to Use FLUX.2 [klein] 9B
Best For
- Quality-latency sweet spot: Klein 9B fits when Klein 4B quality falls short but FLUX.2 Pro latency is too high
- Multi-reference generation: Workflows that combine multiple concept images into a coherent output at interactive speed
- Agentic visual systems: Sub-second latency enables rapid iteration loops that generate and evaluate images many times per session
- Prompt-heavy workflows: Complex, descriptive instructions benefit from the Qwen3 text embedder's language understanding
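The agentic iteration loop mentioned above might look roughly like the following sketch. Everything here is hypothetical scaffolding: `refineImage`, `GenerateFn`, `ScoreFn`, and `ReviseFn` are illustrative names, and the generator, scorer, and prompt reviser are injected so the loop does not depend on any particular SDK.

```typescript
// Hypothetical generate-and-evaluate loop with injected dependencies.
type GenerateFn = (prompt: string) => Promise<Uint8Array>;
type ScoreFn = (image: Uint8Array, prompt: string) => Promise<number>;
type ReviseFn = (prompt: string, score: number) => string;

interface Candidate {
  image: Uint8Array;
  prompt: string;
  score: number;
}

interface LoopResult extends Candidate {
  iterations: number; // number of generate/score rounds actually run
}

async function refineImage(options: {
  prompt: string;
  generate: GenerateFn;
  score: ScoreFn;
  revise: ReviseFn;
  threshold: number;
  maxIterations: number;
}): Promise<LoopResult> {
  const { generate, score, revise, threshold, maxIterations } = options;
  if (!Number.isInteger(maxIterations) || maxIterations < 1) {
    throw new RangeError('maxIterations must be a positive integer');
  }

  let prompt = options.prompt;
  let best: Candidate | undefined;
  let iterations = 0;

  while (iterations < maxIterations) {
    iterations += 1;
    const image = await generate(prompt);
    const current = await score(image, prompt);

    // Keep the best candidate seen so far, not just the latest one.
    if (best === undefined || current > best.score) {
      best = { image, prompt, score: current };
    }
    if (current >= threshold) break; // good enough, stop early
    prompt = revise(prompt, current);
  }

  if (best === undefined) throw new Error('unreachable: no candidate produced');
  return { ...best, iterations };
}
```

In practice, `generate` would wrap a call such as `generateImage` with the `bfl/flux-2-klein-9b` model and return the image bytes, while `score` could be a vision model or a heuristic. With sub-second generation, the scorer rather than the generator is likely to dominate each round's latency.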
Consider Alternatives When
- Unrestricted commercial use: Klein 9B uses the FLUX Non-Commercial License, so choose Klein 4B (Apache 2.0) or FLUX.2 Pro instead
- Absolute maximum quality: FLUX.2 Pro and FLUX.2 Max deliver higher fidelity when latency is not the priority
- Masked region inpainting: FLUX.1 Fill Pro is specialized for mask-based fill rather than generative or instruction-based editing
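The guidance in the two lists above can be condensed into a small decision helper. This is a sketch, not an official selector: `recommendModels` and the `Requirements` fields are hypothetical, the returned strings are display names rather than API slugs, and the order in which the rules are checked is an assumption.

```typescript
// Hypothetical decision helper encoding the guidance above.
interface Requirements {
  commercialUse: boolean;      // needs unrestricted commercial licensing
  maskedInpainting: boolean;   // needs mask-based fill
  maxQuality: boolean;         // quality matters more than latency
}

function recommendModels(req: Requirements): string[] {
  // Mask-based fill is a specialized task, so it is checked first (assumed ordering).
  if (req.maskedInpainting) return ['FLUX.1 Fill Pro'];

  // FLUX.2 Pro is the only model named under both the quality and commercial alternatives.
  if (req.maxQuality && req.commercialUse) return ['FLUX.2 Pro'];
  if (req.maxQuality) return ['FLUX.2 Pro', 'FLUX.2 Max'];

  // Klein 9B is under the FLUX Non-Commercial License.
  if (req.commercialUse) return ['Klein 4B', 'FLUX.2 Pro'];

  return ['FLUX.2 [klein] 9B'];
}

const choice = recommendModels({
  commercialUse: false,
  maskedInpainting: false,
  maxQuality: false,
});
console.log(choice.join(', '));
```

The helper returns candidates rather than a single answer because several rules above name more than one alternative. Confirm each candidate's license terms before relying on it.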
Conclusion
FLUX.2 [klein] 9B combines an 8B Qwen3 text embedder, four-step distillation, and multi-reference support in a model that responds in under half a second. For teams building interactive visual applications or agent loops that iterate rapidly, Klein 9B delivers multi-reference editing at latencies many earlier fast image models did not match.