Trinity Mini
Trinity Mini is a 26B-parameter sparse mixture-of-experts (MoE) model from Arcee AI with 3B active parameters per forward pass. It handles function calling and multi-step agent workflows at low per-token cost, and was trained end-to-end in the United States.
```ts
import { streamText } from 'ai'

const result = streamText({
  model: 'arcee-ai/trinity-mini',
  prompt: 'Why is the sky blue?',
})

// Consume the stream as it arrives.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}
```

Frequently Asked Questions
What does "26B parameters, 3B active" mean in practice?
The stack is mixture-of-experts with 26B total parameters. Each token activates roughly 3B parameters through routed experts, so cost and latency stay closer to a 3B-class forward pass than a dense 26B run.
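As a back-of-envelope illustration of that claim (illustrative arithmetic only, not a measured benchmark), the active-parameter ratio can be sketched as:

```typescript
// Per-token compute scales roughly with the number of parameters
// activated per forward pass, not the total parameter count.
const totalParams = 26e9  // total parameters in the MoE stack
const activeParams = 3e9  // parameters routed per token

// Fraction of the stack each token actually exercises.
const activeFraction = activeParams / totalParams

// Rough compute advantage versus a dense model of the same total size.
const denseToActiveRatio = totalParams / activeParams

console.log(activeFraction.toFixed(3))      // ~0.115
console.log(denseToActiveRatio.toFixed(1))  // ~8.7
```

Real latency and cost also depend on routing overhead and memory bandwidth, so treat the ratio as a ceiling rather than a guarantee.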
What does "trained end-to-end in the U.S." mean?
Arcee AI ran the full training pipeline in the United States. Buyers who care about geography for compliance or sourcing can use that fact in reviews.
How is Trinity Mini different from Trinity Large Preview?
Trinity Mini is the 26B / 3B active open-weight MoE built for efficient volume inference. Trinity Large Preview is the 400B-parameter (13B active) large MoE aimed at heavier long-context reasoning. They sit at different cost-versus-capability points.
Do I need a separate Arcee AI account to access Trinity Mini on AI Gateway?
No. Use your AI Gateway API key or an OIDC token. You don't need a separate provider account.
What reasoning style does Trinity Mini use?
It supports chain-of-thought style traces, including the long-form machine-inference example from the model's AI Gateway announcement. Use that pattern when you need stepwise causal analysis from partial evidence.
Can I use Trinity Mini with the AI SDK?
Yes. Set `model` to `arcee-ai/trinity-mini` in the AI SDK's `streamText` or `generateText` call. AI Gateway also exposes OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, and OpenResponses-compatible interfaces.
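For the OpenAI Chat Completions-compatible surface, a minimal sketch follows. The base URL is an assumption (check your AI Gateway dashboard for the exact value), and `buildChatRequest` is a hypothetical helper used here to make the payload shape explicit:

```typescript
// Sketch of calling Trinity Mini through an OpenAI Chat Completions-
// compatible endpoint. GATEWAY_BASE_URL is assumed, not authoritative.
const GATEWAY_BASE_URL = 'https://ai-gateway.vercel.sh/v1'

// Hypothetical helper: builds the Chat Completions request body.
function buildChatRequest(prompt: string) {
  return {
    model: 'arcee-ai/trinity-mini',
    messages: [{ role: 'user' as const, content: prompt }],
  }
}

async function ask(prompt: string, apiKey: string): Promise<string> {
  const res = await fetch(`${GATEWAY_BASE_URL}/chat/completions`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      // AI Gateway API key or OIDC token, per the FAQ above.
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(buildChatRequest(prompt)),
  })
  const data = await res.json()
  return data.choices[0].message.content
}
```

The same request body works unchanged against any Chat Completions-compatible endpoint; only the base URL and credential differ.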