
Mercury Coder Small Beta

Mercury Coder Small Beta is Inception's compact diffusion coding model. It scores 90.0 on HumanEval and 84.8 on fill-in-the-middle (FIM).

Tool Use
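index.ts

    import { streamText } from 'ai'

    // Request a streamed completion through AI Gateway using the model slug.
    const result = streamText({
      model: 'inception/mercury-coder-small',
      prompt: 'Why is the sky blue?'
    })

    // Print the response as chunks arrive (top-level await requires an ES module).
    for await (const chunk of result.textStream) {
      process.stdout.write(chunk)
    }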

Playground

Try out Mercury Coder Small Beta by Inception. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider: Inception (Legal: Terms, Privacy)

  • Context: 32K
  • Latency: 0.4s
  • Input: $0.25/M
  • Output: $1.00/M
  • Release Date: 02/26/2025

No values are shown for Throughput, Cache, Web Search, Per Query, Capabilities, ZDR, or No Training.

Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.
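
As a rough illustration of what a P50 figure means, the sketch below computes the median of a handful of TTFT samples. The `p50` helper and the sample values are hypothetical, and this is not a description of how AI Gateway computes its own metrics.

```typescript
// Hypothetical helper: the P50 (median) of a list of samples.
function p50(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error('p50 requires at least one sample')
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...samples].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2
}

// Made-up TTFT samples in milliseconds, for illustration only.
const ttftMs = [380, 420, 395, 910, 405]
console.log(p50(ttftMs)) // 405
```

Half of the requests are at or below the P50 value, so a single slow outlier (910 ms here) does not move it the way it would move an average.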

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by Inception

Model: (name not shown)

  • Context: 128K
  • Latency: 0.9s
  • Input: $0.25/M
  • Output: $0.75/M
  • Cache: Read $0.03/M; Write not listed
  • Providers: Inception
  • Release Date: 02/24/2026

No values are shown for Throughput, Web Search, Per Query, Capabilities, ZDR, or No Training.

About Mercury Coder Small Beta

Mercury Coder Small Beta belongs to the Mercury family of diffusion large language models (dLLMs) from Inception Labs. Unlike autoregressive code models, which emit tokens one at a time, it uses a coarse-to-fine generation process: it produces a rough complete draft, then refines all positions in parallel over a small number of passes. This lets it run faster than autoregressive alternatives at comparable quality tiers; the live metrics on this page show current latency and throughput.
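
To make the contrast with one-token-at-a-time decoding concrete, here is a toy sketch of parallel refinement. It is not Inception's algorithm: the `target` array stands in for what a real model would predict, the masking schedule is invented for illustration, and `refineInParallel` is a hypothetical name.

```typescript
// Toy illustration only: `target` stands in for a real model's predictions.
const MASK = '_'

function refineInParallel(target: string[], passes: number): string[][] {
  // Start from a fully masked draft of the final length.
  let draft: string[] = target.map(() => MASK)
  const steps: string[][] = [draft]
  for (let pass = 0; pass < passes; pass++) {
    // Every position is visited in the same pass; a subset is resolved each time.
    draft = draft.map((tok, i) => (i % passes === pass ? target[i] : tok))
    steps.push(draft)
  }
  return steps
}

const tokens = ['const', 'x', '=', 'a', '+', 'b', ';']
for (const step of refineInParallel(tokens, 3)) {
  console.log(step.join(' '))
}
```

A left-to-right decoder would need one sequential step per token (7 here), while this toy schedule finishes in 3 passes. The speedup of a real diffusion model depends on how many passes it needs and how expensive each pass is.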

Mercury Coder Small Beta scores 90.0 on HumanEval and 84.8 on fill-in-the-middle (FIM) tasks. FIM maps directly to IDE autocomplete, where the model completes code surrounded by existing context on both sides. Its MBPP score of 76.6 and MultiPL-E score of 76.2 reflect results across Python-centric and multi-language coding evaluations.

The model targets high-frequency, latency-sensitive coding applications: inline completions, documentation generation triggered on keystrokes, and fast unit test synthesis. At $0.25 per million input tokens and $1.00 per million output tokens, Mercury Coder Small Beta suits developers who need reliable code quality without paying the cost or latency overhead of frontier-scale models on every request.
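
The per-token rates translate into a workload cost with simple arithmetic. The sketch below uses the rates quoted on this page; the `estimateCostUsd` helper and the workload numbers are hypothetical, and actual billing may differ if rates change.

```typescript
// Rates quoted on this page, in USD per million tokens.
const INPUT_USD_PER_M = 0.25
const OUTPUT_USD_PER_M = 1.0

// Hypothetical helper: estimated cost in USD for a given token volume.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1_000_000) * INPUT_USD_PER_M +
    (outputTokens / 1_000_000) * OUTPUT_USD_PER_M
}

// Assumed workload: 1,000,000 completions, each with
// 2,000 input tokens of context and 100 output tokens.
const requests = 1_000_000
const total = estimateCostUsd(2_000 * requests, 100 * requests)
console.log(total.toFixed(2)) // 600.00
```

In this assumed workload the input side accounts for $500 of the $600, so trimming the context sent with each completion matters more than shortening the output.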

What to Consider When Choosing a Provider

  • Configuration: Mercury Coder Small Beta's diffusion generation pattern differs from autoregressive streaming. Factor that in when you design editor integrations or autocomplete pipelines that depend on incremental token delivery.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
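
One way to act on the configuration point is to avoid assuming that chunks arrive token by token. The sketch below buffers text chunks of any size and yields complete lines, so the consumer behaves the same whether a stream delivers many small pieces or a few large ones. `linesFrom` and `fakeStream` are hypothetical names, and `fakeStream` is a stand-in for a real model stream, not a recording of this model's output.

```typescript
// Collects text chunks of any size and yields complete lines.
async function* linesFrom(chunks: AsyncIterable<string>): AsyncGenerator<string> {
  let buffer = ''
  for await (const chunk of chunks) {
    buffer += chunk
    let newline = buffer.indexOf('\n')
    while (newline !== -1) {
      yield buffer.slice(0, newline)
      buffer = buffer.slice(newline + 1)
      newline = buffer.indexOf('\n')
    }
  }
  // Emit any trailing text that did not end with a newline.
  if (buffer.length > 0) yield buffer
}

// Stand-in for a model stream that delivers a few large chunks.
async function* fakeStream(): AsyncGenerator<string> {
  yield 'function add(a, b) {\n  return a'
  yield ' + b\n}\n'
}

async function main(): Promise<string[]> {
  const lines: string[] = []
  for await (const line of linesFrom(fakeStream())) lines.push(line)
  return lines
}

main().then((lines) => console.log(lines))
```

Any async iterable of strings can be passed to `linesFrom` in place of `fakeStream()`.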

When to Use Mercury Coder Small Beta

Best For

  • IDE inline autocomplete: Low response latency for keystroke-level completions
  • Fill-in-the-middle completions: Editor completions surrounded by existing code context on both sides
  • High-volume coding pipelines: Per-call cost is a significant factor at scale
  • CI/CD test generation: Fast unit test and docstring generation triggered inside pipelines
  • Lightweight agent loops: Coding agents that make many small inference calls per task
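
For the fill-in-the-middle case, an editor integration typically splits the document at the cursor into a prefix and a suffix, then splices the model's completion between them. The sketch below shows only that bookkeeping. `FimContext`, `splitAtCursor`, and `applyCompletion` are hypothetical names, the completion string is hard-coded, and no particular request format is implied.

```typescript
// Hypothetical shape for fill-in-the-middle context; not a documented API.
interface FimContext {
  prefix: string
  suffix: string
}

// Split the document at the cursor offset, clamped to the document bounds.
function splitAtCursor(source: string, cursor: number): FimContext {
  const at = Math.max(0, Math.min(cursor, source.length))
  return { prefix: source.slice(0, at), suffix: source.slice(at) }
}

// Splice a completion between the prefix and the suffix.
function applyCompletion(ctx: FimContext, completion: string): string {
  return ctx.prefix + completion + ctx.suffix
}

const source = 'function area(r) {\n  return \n}\n'
const cursor = source.indexOf('return ') + 'return '.length
const ctx = splitAtCursor(source, cursor)

// The completion is hard-coded here; in practice it would come from the model.
console.log(applyCompletion(ctx, 'Math.PI * r * r'))
```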

Consider Alternatives When

  • Deep multi-file reasoning: Tasks span very large codebases and demand cross-file analysis
  • Competitive programming benchmarks: LiveCodeBench-style problems are the primary use case
  • Broad domain knowledge: Workload includes long-form prose or complex math proofs beyond code
  • Maximum context window: Context length is the binding constraint for the task

Conclusion

Mercury Coder Small Beta brings diffusion-based code generation to contexts where speed and throughput matter most. It scores 84.8 on FIM and 90.0 on HumanEval in Inception's published benchmarks, which makes it a fit for teams balancing quality against cost and latency in IDE experiences and high-frequency agent loops.

Frequently Asked Questions

  • How does Mercury Coder Small Beta's diffusion approach differ from standard code models?

    It generates a full draft, then refines all token positions in parallel over iterative passes. Standard autoregressive code models generate tokens left to right, one at a time. The parallel approach enables higher throughput on the same hardware.

  • What is Mercury Coder Small Beta's fill-in-the-middle score?

    84.8 on FIM benchmarks. FIM measures how well a model generates code that fits between an existing prefix and suffix, which maps to editor autocomplete.

  • How does Mercury Coder Small Beta perform on HumanEval?

    90.0 on HumanEval in Inception's published Mercury Coder tables.

  • What throughput does Mercury Coder Small Beta achieve?

    Live throughput metrics appear on this page.

  • Is Mercury Coder Small Beta suitable for multi-language coding tasks?

    Yes. Its MultiPL-E score is 76.2 across multiple programming languages beyond Python, with Python-centric benchmarks showing its strongest results.

  • How does Mercury Coder Small Beta relate to Mercury 2?

    Mercury Coder Small Beta is a smaller, coding-focused model from an earlier generation of the Mercury diffusion family. Mercury 2 is a later, broader reasoning model with a larger context window and tunable reasoning depth.

  • Where are the benchmark numbers published?

    Inception published HumanEval, FIM, MBPP, and MultiPL-E figures for Mercury Coder in its Mercury announcement. See https://platform.inceptionlabs.ai.

  • What does Mercury Coder Small Beta cost?

    Pricing appears on this page and updates as providers adjust their rates. AI Gateway routes traffic through the configured provider.