
Trinity Mini

arcee-ai/trinity-mini

Trinity Mini is a 26B-parameter sparse mixture-of-experts (MoE) model from Arcee AI that activates roughly 3B parameters per forward pass. It handles function calling and multi-step agent workflows at low per-token cost, and its training pipeline ran end-to-end in the United States.

index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'arcee-ai/trinity-mini',
  prompt: 'Why is the sky blue?',
})

// Print the response as it streams.
for await (const text of result.textStream) process.stdout.write(text)
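Because the model is pitched at function calling and multi-step agent workflows, a minimal tool-calling sketch in AI SDK v5 style may help; the getWeather tool, its stubbed result, and the five-step cap are illustrative assumptions rather than part of the model card.

import { generateText, tool, stepCountIs } from 'ai'
import { z } from 'zod'

const { text } = await generateText({
  model: 'arcee-ai/trinity-mini',
  tools: {
    // Hypothetical tool with a stubbed result, for illustration only.
    getWeather: tool({
      description: 'Get the current weather for a city',
      inputSchema: z.object({ city: z.string() }),
      execute: async ({ city }) => ({ city, tempC: 21 }),
    }),
  },
  // Let the model call tools and then answer, up to five steps.
  stopWhen: stepCountIs(5),
  prompt: 'What is the weather in Boston right now?',
})

console.log(text)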

What To Consider When Choosing a Provider

  • Zero Data Retention

    AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.

  • Authentication

    AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly; a short sketch follows this list.
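Here is a minimal sketch of explicit authentication, assuming the @ai-sdk/gateway provider package; by default the AI SDK reads AI_GATEWAY_API_KEY from the environment, and on Vercel deployments an OIDC token can stand in for a key.

import { streamText } from 'ai'
import { createGateway } from '@ai-sdk/gateway'

// Explicit key-based auth; on Vercel an OIDC token can stand in for the key.
const gateway = createGateway({
  apiKey: process.env.AI_GATEWAY_API_KEY,
})

const result = streamText({
  model: gateway('arcee-ai/trinity-mini'),
  prompt: 'Why is the sky blue?',
})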

MoE routing keeps the number of active parameters per token low, which helps cost at scale. At $0.045 per million input tokens and $0.15 per million output tokens, stress-test cost against quality on your own traffic.
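As a back-of-envelope check, the listed rates translate directly into a cost model; the daily traffic figures below are made up for illustration.

// Listed Trinity Mini rates, in USD per million tokens.
const INPUT_USD_PER_M = 0.045
const OUTPUT_USD_PER_M = 0.15

function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_USD_PER_M +
    (outputTokens / 1_000_000) * OUTPUT_USD_PER_M
  )
}

// Hypothetical day of traffic: 10M input + 2M output tokens ≈ $0.75.
console.log(estimateCostUSD(10_000_000, 2_000_000))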

When to Use Trinity Mini

Best For

  • High-volume reasoning routes:

    Deployments where cost per token is a hard constraint

  • Structured inference tasks:

    Reverse engineering or deduction from partial observations

  • U.S. training provenance:

    Teams that need domestic-only training for policy or procurement

Consider Alternatives When

  • Deepest large-model reasoning:

    Trinity Large Preview offers a larger parameter space at higher cost

  • Long-term enterprise SLA:

    This tier does not offer a fixed enterprise support contract

  • Tight latency budgets:

Hard real-time workloads can rule out even a compact MoE forward pass

Conclusion

Trinity Mini pairs MoE efficiency with U.S. training provenance for teams that balance cost, control, and reasoning depth. Model your spend against the $0.045-per-million-input and $0.15-per-million-output rates, then scale what works.

FAQ

What architecture does Trinity Mini use?

The architecture is a mixture of experts with 26B total parameters. Each token activates roughly 3B parameters through routed experts, so cost and latency stay closer to a 3B-class forward pass than a dense 26B run.

Where was Trinity Mini trained?

Arcee AI ran the full training pipeline in the United States. Buyers who care about geography for compliance or sourcing can use that fact in reviews.

How does Trinity Mini differ from Trinity Large Preview?

Trinity Mini is the 26B-total / 3B-active open-weight MoE built for efficient high-volume inference. Trinity Large Preview is the 400B-parameter (13B active) MoE aimed at heavier long-context reasoning. They sit at different cost-versus-capability points.

Do I need my own Arcee AI credentials?

No. Use your AI Gateway API key or an OIDC token. You don't need a separate provider account.

Does Trinity Mini support step-by-step reasoning?

It supports chain-of-thought style traces, including the long-form machine-inference example from the model's AI Gateway announcement. Use that pattern when you need stepwise causal analysis from partial evidence.

Can I use Trinity Mini with the AI SDK?

Yes. Set model to arcee-ai/trinity-mini in the AI SDK's streamText or generateText call. AI Gateway also exposes OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, and OpenResponses-compatible interfaces.
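For clients outside the AI SDK, the Chat Completions-compatible surface can be reached with the openai npm package; the base URL below is an assumption, so confirm the endpoint in the AI Gateway documentation.

import OpenAI from 'openai'

// Assumed OpenAI-compatible endpoint; verify against the AI Gateway docs.
const client = new OpenAI({
  apiKey: process.env.AI_GATEWAY_API_KEY,
  baseURL: 'https://ai-gateway.vercel.sh/v1',
})

const completion = await client.chat.completions.create({
  model: 'arcee-ai/trinity-mini',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})

console.log(completion.choices[0].message.content)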