
Mercury Coder Small Beta

inception/mercury-coder-small

Mercury Coder Small Beta is Inception's compact diffusion coding model. Mercury Coder Small Beta scores 90.0 on HumanEval and 84.8 on fill-in-the-middle (FIM).

Usage

index.ts

import { streamText } from 'ai';

const result = streamText({
  model: 'inception/mercury-coder-small',
  prompt: 'Why is the sky blue?',
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
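With key-based authentication, the key is typically supplied through the environment before running your app. The variable name below is the one conventionally used by AI Gateway, but confirm it against the gateway documentation for your deployment:

```shell
# Export the gateway key so the SDK can pick it up at runtime.
# No per-provider (e.g. Inception) credentials are needed.
export AI_GATEWAY_API_KEY="your-api-key"
```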

Mercury Coder Small Beta's diffusion generation pattern differs from autoregressive streaming. Factor that in when you design editor integrations or autocomplete pipelines that depend on incremental token delivery.
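Concretely, a diffusion stream may deliver revised full drafts rather than strictly append-only deltas. A minimal editor-side sketch that handles both cases, assuming a hypothetical chunk shape (this is not an AI SDK type):

```typescript
// Hypothetical chunk shapes for illustration — not part of the AI SDK.
// 'delta' models autoregressive delivery (append-only token pieces);
// 'draft' models diffusion delivery (a full refined draft that replaces
// everything rendered so far).
type Chunk =
  | { kind: 'delta'; text: string }
  | { kind: 'draft'; text: string };

function applyChunk(buffer: string, chunk: Chunk): string {
  // Deltas extend the buffer; a refined draft replaces it wholesale.
  return chunk.kind === 'delta' ? buffer + chunk.text : chunk.text;
}

// Usage: the editor must be able to repaint earlier lines after a draft
// arrives, not just extend the tail of the current completion.
let view = '';
view = applyChunk(view, { kind: 'delta', text: 'function add(' });
view = applyChunk(view, { kind: 'draft', text: 'function add(a, b) { return a + b; }' });
```

The design point is that the renderer owns a replaceable buffer instead of writing tokens directly to the screen, so either delivery pattern produces a consistent display.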

When to Use Mercury Coder Small Beta

Best For

  • IDE inline autocomplete:

    Low response latency for keystroke-level completions

  • Fill-in-the-middle completions:

    Editor completions surrounded by existing code context on both sides

  • High-volume coding pipelines:

    Per-call cost is a significant factor at scale

  • CI/CD test generation:

    Fast unit test and docstring generation triggered inside pipelines

  • Lightweight agent loops:

    Coding agents that make many small inference calls per task
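For the fill-in-the-middle cases above, the editor supplies both the code before and after the cursor. A sketch of assembling such a prompt — the `<PRE>`/`<SUF>`/`<MID>` sentinels are hypothetical placeholders, not Mercury's documented format, so check Inception's docs for the real one:

```typescript
// Build a FIM prompt from editor context. The sentinel strings here are
// illustrative assumptions; actual FIM formats are model-specific.
function buildFimPrompt(prefix: string, suffix: string): string {
  return `<PRE>${prefix}<SUF>${suffix}<MID>`;
}

// The model is asked to emit the code that belongs between the two spans.
const prompt = buildFimPrompt(
  'function isEven(n: number): boolean {\n  return ',
  ';\n}'
);
```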

Consider Alternatives When

  • Deep multi-file reasoning:

    Tasks span very large codebases and demand cross-file analysis

  • Competitive programming benchmarks:

    LiveCodeBench-style problems are the primary use case

  • Broad domain knowledge:

    Workload includes long-form prose or complex math proofs beyond code

  • Maximum context window:

    Context length is the binding constraint for the task

Conclusion

Mercury Coder Small Beta brings diffusion-based code generation to contexts where speed and throughput matter most. With a FIM score of 84.8 and a HumanEval score of 90.0 in Inception's published benchmarks, it suits teams balancing quality against cost and latency in IDE experiences and high-frequency agent loops.

FAQ

How does diffusion-based generation differ from standard code models?

It generates a full draft, then refines all token positions in parallel over iterative passes. Standard code models generate tokens left to right, one at a time. This parallel approach enables higher throughput on the same hardware.

What is its fill-in-the-middle (FIM) score?

84.8 on FIM benchmarks. FIM measures how well a model generates code that fits between an existing prefix and suffix, which maps directly to editor autocomplete.

What is its HumanEval score?

90.0 on HumanEval in Inception's published Mercury Coder tables.

How fast is Mercury Coder Small Beta?

Live throughput metrics appear on this page.

Does it support programming languages other than Python?

Yes. Its MultiPL-E score is 76.2 across multiple programming languages beyond Python, with Python-centric benchmarks showing its strongest results.

How does Mercury Coder Small Beta relate to Mercury 2?

Mercury Coder Small Beta is a smaller, coding-focused model from an earlier generation of the Mercury diffusion family. Mercury 2 is a later, broader reasoning model with a larger context window and tunable reasoning depth.

Where are the benchmark figures published?

Inception published HumanEval, FIM, MBPP, and MultiPL-E figures for Mercury Coder in its Mercury announcement. See https://platform.inceptionlabs.ai.

How is it priced?

Pricing appears on this page and updates as providers adjust their rates. AI Gateway routes traffic through the configured provider.