Trinity Large Thinking
Trinity Large Thinking is a reasoning-focused variant in Arcee AI's Trinity Large family: a 398B-parameter sparse mixture-of-experts model with about 13B active parameters per token, built on Trinity Large Base and emphasizing extended chain-of-thought reasoning.
```typescript
import { streamText } from 'ai'

const result = streamText({
  model: 'arcee-ai/trinity-large-thinking',
  prompt: 'Why is the sky blue?',
})
```

Frequently Asked Questions
What's the difference between Trinity Large Thinking and Trinity Large Preview?
Thinking emits extended chain-of-thought reasoning; Preview does not emphasize trace output. Thinking runs as a 398B sparse MoE with about 13B active parameters per token. Preview is a 400B-parameter (13B active) MoE aimed at long-context reasoning workloads. Choose Thinking when you need explicit reasoning traces; choose Preview when you do not.
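The choice between the two variants can be captured in a small helper. A minimal sketch, assuming the `arcee-ai/trinity-large-preview` slug for the Preview variant (the Thinking slug is documented above; the Preview slug is an assumption, so check your gateway's model list):

```typescript
// Pick a Trinity Large variant based on whether explicit
// reasoning traces are needed. 'arcee-ai/trinity-large-preview'
// is an assumed slug; verify it against your model list.
function pickTrinityModel(needReasoningTraces: boolean): string {
  return needReasoningTraces
    ? 'arcee-ai/trinity-large-thinking'
    : 'arcee-ai/trinity-large-preview'
}

console.log(pickTrinityModel(true)) // 'arcee-ai/trinity-large-thinking'
```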
Does chain-of-thought reasoning affect token usage?
Yes. Intermediate reasoning steps count toward output tokens, so expect higher token usage than a short answer from the base preview model. Factor that into cost and latency planning.
When should I use this instead of Trinity Mini?
When you need detailed reasoning traces more than Mini's cost profile. Trinity Mini uses 26B total parameters with 3B active and fits high-volume, budget-sensitive inference. Trinity Large Thinking fits heavier reasoning and audit-style review, not minimal token use.
Do I need an Arcee AI account to access Trinity Large Thinking on AI Gateway?
No. Use your AI Gateway API key or an OIDC token. You don't need a separate provider account.
Can I use Trinity Large Thinking with the AI SDK?
Yes. Set `model` to `arcee-ai/trinity-large-thinking` in the AI SDK's `streamText` or `generateText` call. AI Gateway also exposes OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, and OpenResponses-compatible interfaces.

Is the reasoning trace useful for compliance or audit purposes?
It can help. The model surfaces intermediate steps that you can log alongside the final answer. You still own retention, access control, and policy for those logs.
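One way to log intermediate steps next to the final answer is a structured record. A minimal sketch; the field names and the `toAuditRecord` helper are illustrative, not a gateway schema:

```typescript
// Illustrative audit record pairing a reasoning trace with the
// final answer. Field names are hypothetical, not a gateway schema.
interface AuditRecord {
  model: string
  reasoning: string // intermediate steps emitted by the model
  answer: string    // final answer shown to the user
  loggedAt: string  // ISO 8601 timestamp
}

function toAuditRecord(reasoning: string, answer: string): AuditRecord {
  return {
    model: 'arcee-ai/trinity-large-thinking',
    reasoning,
    answer,
    loggedAt: new Date().toISOString(),
  }
}
```

Retention, access control, and redaction policy for these records remain your responsibility.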
Does AI Gateway provide observability for Trinity Large Thinking requests?
Yes. Token usage, latency, and cost show in your AI Gateway dashboard for each request without extra instrumentation.