Trinity Mini

Trinity Mini is a 26B-parameter sparse MoE from Arcee AI with 3B active parameters per forward pass. It handles function calling and multi-step agent workflows at low per-token cost, trained end-to-end in the United States.

index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'arcee-ai/trinity-mini',
  prompt: 'Why is the sky blue?',
})

// Print the response as it streams in
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}

Playground

Try out Trinity Mini by Arcee AI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider: Arcee AI (Legal: Terms, Privacy)
Context: 131K
Latency: 0.4s
Throughput: 285 tps
Input / Output: $0.04/M / $0.15/M
Release Date: 12/01/2025
Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.
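The P50 figures above are medians over live traffic samples. As a minimal sketch of how such a percentile could be computed from a window of per-request measurements (the sample values below are made up, not real gateway data):

```typescript
// Compute the p-th percentile (nearest-rank method) of a sample window.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b)
  const rank = Math.ceil((p / 100) * sorted.length) - 1
  return sorted[Math.max(0, rank)]
}

// Hypothetical TTFT samples in seconds for five requests
const ttft = [0.3, 0.5, 0.4, 1.2, 0.4]
console.log(percentile(ttft, 50)) // → 0.4 (the median TTFT)
```

Using the median rather than the mean keeps one slow outlier (the 1.2s sample here) from skewing the headline number.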

More models by Arcee AI

Model | Context | Latency | Throughput | Input | Output | Providers | Release Date
— | 262K | 0.3s | 240 tps | $0.25/M | $0.90/M | Arcee AI | 04/01/2026
— | 131K | 0.3s | 130 tps | $0.25/M | $1.00/M | Arcee AI | 01/01/2025

About Trinity Mini

Trinity Mini is a sparse mixture-of-experts model with 26B total parameters and 3B active per forward pass. The compact active footprint keeps inference costs low while the full parameter set provides enough capacity for function calling and multi-step agent workflows.
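As a back-of-envelope illustration of why the active footprint matters: per-token compute scales roughly with active rather than total parameters (a simplification that ignores router and attention overhead):

```typescript
// Rough per-token compute ratio for a sparse MoE vs. a dense model
// of the same total size. Ignores routing and attention overhead.
const totalParamsB = 26 // total parameters, billions
const activeParamsB = 3 // parameters activated per token, billions

const activeFraction = activeParamsB / totalParamsB
console.log(activeFraction.toFixed(3)) // → "0.115": ~11.5% of a dense 26B pass
```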

Arcee AI trained the model end-to-end in the United States, which teams cite for sovereignty and sourcing reviews.

See https://docs.arcee.ai/language-models/trinity-mini-26b for weights and licensing details.

What To Consider When Choosing a Provider

  • Configuration: MoE routing keeps active parameters low per token, which helps cost at scale. At $0.045 per million input tokens and $0.15 per million output tokens, stress-test cost against quality on your traffic.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
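To stress-test cost against your own traffic, a quick back-of-envelope sketch using the rates quoted above ($0.045/M input, $0.15/M output); the example volumes are hypothetical:

```typescript
// Back-of-envelope spend estimate at the quoted Trinity Mini rates.
// Adjust the constants if pricing changes.
const INPUT_USD_PER_M = 0.045 // USD per million input tokens
const OUTPUT_USD_PER_M = 0.15 // USD per million output tokens

function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1e6) * INPUT_USD_PER_M +
    (outputTokens / 1e6) * OUTPUT_USD_PER_M
  )
}

// Example: 100M input + 20M output tokens in a month
console.log(estimateCostUSD(100e6, 20e6)) // → 7.5 (USD: 4.5 input + 3.0 output)
```

Run the same volumes through candidate models' rates to compare cost-per-quality on your actual traffic mix.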

When to Use Trinity Mini

Best For

  • High-volume reasoning routes: Deployments where cost per token is a hard constraint
  • Structured inference tasks: Reverse engineering or deduction from partial observations
  • U.S. training provenance: Teams that need domestic-only training for policy or procurement

Consider Alternatives When

  • Deepest large-model reasoning: Trinity Large Preview offers a larger parameter space at higher cost
  • Long-term enterprise SLA: This tier does not offer a fixed enterprise support contract
  • Tight latency budgets: Some workloads rule out even a compact MoE path

Conclusion

Trinity Mini pairs MoE efficiency with U.S. training provenance for teams balancing cost, control, and reasoning depth. Benchmark spend against the $0.045/M input and $0.15/M output rates, then scale what works.

Frequently Asked Questions

  • What does "26B parameters, 3B active" mean in practice?

    The stack is mixture-of-experts with 26B total parameters. Each token activates roughly 3B parameters through routed experts, so cost and latency stay closer to a 3B-class forward pass than a dense 26B run.

  • What does "trained end-to-end in the U.S." mean?

    Arcee AI ran the full training pipeline in the United States. Buyers who care about geography for compliance or sourcing can use that fact in reviews.

  • How is Trinity Mini different from Trinity Large Preview?

    Trinity Mini is the 26B / 3B active open-weight MoE built for efficient volume inference. Trinity Large Preview is the 400B-parameter (13B active) large MoE aimed at heavier long-context reasoning. They sit at different cost-versus-capability points.

  • Do I need a separate Arcee AI account to access Trinity Mini on AI Gateway?

    No. Use your AI Gateway API key or an OIDC token. You don't need a separate provider account.

  • What reasoning style does Trinity Mini use?

    It supports chain-of-thought style traces, including the long-form machine-inference example from the model's AI Gateway announcement. Use that pattern when you need stepwise causal analysis from partial evidence.

  • Can I use Trinity Mini with the AI SDK?

    Yes. Set model to arcee-ai/trinity-mini in the AI SDK's streamText or generateText call. AI Gateway also exposes OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, and OpenResponses-compatible interfaces.
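For the OpenAI Chat Completions-compatible path mentioned above, a minimal sketch of the request shape; the gateway base URL shown in the comment is an assumption — confirm it against the AI Gateway docs for your deployment:

```typescript
// Sketch: an OpenAI Chat Completions-style request body for Trinity Mini.
const body = {
  model: 'arcee-ai/trinity-mini',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
  stream: false,
}

// To send it (not executed here; endpoint URL is an assumption):
// await fetch('https://ai-gateway.vercel.sh/v1/chat/completions', {
//   method: 'POST',
//   headers: {
//     'Content-Type': 'application/json',
//     Authorization: `Bearer ${process.env.AI_GATEWAY_API_KEY}`,
//   },
//   body: JSON.stringify(body),
// })
console.log(body.model) // → "arcee-ai/trinity-mini"
```

The same model slug works across the SDK and the compatible HTTP interfaces, so you can switch integration styles without changing routing.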