Mercury Coder Small Beta
Mercury Coder Small Beta is Inception's compact diffusion coding model. It scores 90.0 on HumanEval and 84.8 on fill-in-the-middle (FIM).
import { streamText } from 'ai'

const result = streamText({
  model: 'inception/mercury-coder-small',
  prompt: 'Why is the sky blue?',
})

What To Consider When Choosing a Provider
Zero Data Retention
AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.

Authentication
AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
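With the AI SDK, the key can also be supplied explicitly rather than relying on ambient credentials. A minimal sketch, assuming the @ai-sdk/gateway package's createGateway helper and an AI_GATEWAY_API_KEY environment variable (check the Gateway docs for the current names):

import { streamText } from 'ai'
import { createGateway } from '@ai-sdk/gateway'

// Assumption: the gateway provider can also pick up AI_GATEWAY_API_KEY on its
// own; passing apiKey explicitly just makes the wiring visible.
const gateway = createGateway({
  apiKey: process.env.AI_GATEWAY_API_KEY,
})

const result = streamText({
  model: gateway('inception/mercury-coder-small'),
  prompt: 'Write a binary search in TypeScript.',
})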
Mercury Coder Small Beta's diffusion generation pattern differs from autoregressive streaming. Factor that in when you design editor integrations or autocomplete pipelines that depend on incremental token delivery.
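One way to absorb that difference is to treat each streamed chunk as an update to the whole suggestion rather than a strict token-by-token append. A minimal sketch, where completeAtCursor and render are hypothetical editor-side names:

import { streamText } from 'ai'

// Hypothetical editor hook: accumulate the stream and re-render the full
// ghost-text suggestion on every chunk, so the UI never assumes tokens
// arrive one at a time in final form.
async function completeAtCursor(
  prompt: string,
  render: (suggestion: string) => void,
): Promise<void> {
  const result = streamText({ model: 'inception/mercury-coder-small', prompt })
  let suggestion = ''
  for await (const chunk of result.textStream) {
    suggestion += chunk
    render(suggestion) // replace the whole suggestion, not append a token
  }
}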
When to Use Mercury Coder Small Beta
Best For
IDE inline autocomplete:
Low response latency for keystroke-level completions
Fill-in-the-middle completions:
Editor completions surrounded by existing code context on both sides (see the sketch after this list)
High-volume coding pipelines:
Per-call cost is a significant factor at scale
CI/CD test generation:
Fast unit test and docstring generation triggered inside pipelines
Lightweight agent loops:
Coding agents that make many small inference calls per task
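A FIM request sends the code before and after the cursor and asks the model to fill the gap. The exact request format is provider-specific; the sketch below simply frames a hypothetical prefix and suffix in the prompt, so treat the markers as illustrative rather than Mercury's native FIM API:

import { generateText } from 'ai'

// Hypothetical FIM framing: prefix is the code before the cursor,
// suffix the code after it.
const prefix = 'function clamp(x: number, lo: number, hi: number) {\n  '
const suffix = '\n}'

const { text } = await generateText({
  model: 'inception/mercury-coder-small',
  prompt: `Complete the code between the prefix and suffix. Return only the middle.\n<prefix>${prefix}</prefix>\n<suffix>${suffix}</suffix>`,
})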
Consider Alternatives When
Deep multi-file reasoning:
Tasks span very large codebases and demand cross-file analysis
Competitive programming benchmarks:
LiveCodeBench-style problems are the primary use case
Broad domain knowledge:
Workload includes long-form prose or complex math proofs beyond code
Maximum context window:
Context length is the binding constraint for the task
Conclusion
Mercury Coder Small Beta brings diffusion-based code generation to contexts where speed and throughput matter most. In Inception's published benchmarks it scores 90.0 on HumanEval and 84.8 on FIM, making it a fit for teams balancing quality against cost and latency in IDE experiences and high-frequency agent loops.
FAQ
How does Mercury Coder Small Beta generate code differently from standard models?
It generates a full draft, then refines all token positions in parallel over iterative passes. Standard code models generate tokens left to right, one at a time. This parallel approach enables higher throughput on the same hardware.
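As a toy contrast between the two decoding patterns (not Mercury's actual algorithm), assuming hypothetical sample and refine functions:

// Autoregressive: one token per step, each conditioned on everything so far.
function autoregressive(steps: number, sample: (ctx: string[]) => string): string[] {
  const tokens: string[] = []
  for (let i = 0; i < steps; i++) tokens.push(sample(tokens))
  return tokens
}

// Diffusion-style: start from a full draft and refine every position in
// parallel for a fixed number of passes.
function diffusionStyle(draft: string[], passes: number, refine: (t: string[]) => string[]): string[] {
  let tokens = draft
  for (let p = 0; p < passes; p++) tokens = refine(tokens) // all positions at once
  return tokens
}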
What is Mercury Coder Small Beta's fill-in-the-middle score?
84.8 on FIM benchmarks. FIM measures how well a model generates code that fits between an existing prefix and suffix, which maps directly to editor autocomplete.
What does Mercury Coder Small Beta score on HumanEval?
90.0 on HumanEval in Inception's published Mercury Coder tables.
How fast is Mercury Coder Small Beta in production?
Live throughput metrics appear on this page.
Does Mercury Coder Small Beta support languages other than Python?
Yes. Its MultiPL-E score is 76.2 across multiple programming languages beyond Python, with Python-centric benchmarks showing its strongest results.
How does Mercury Coder Small Beta compare to Mercury 2?
Mercury Coder Small Beta is a smaller, coding-focused model from an earlier generation of the Mercury diffusion family. Mercury 2 is a later, broader reasoning model with a larger context window and tunable reasoning depth.
Where are Mercury Coder Small Beta's benchmark figures published?
Inception published HumanEval, FIM, MBPP, and MultiPL-E figures for Mercury Coder in its Mercury announcement. See https://platform.inceptionlabs.ai.
What does Mercury Coder Small Beta cost?
Pricing appears on this page and updates as providers adjust their rates. AI Gateway routes traffic through the configured provider.