
Command A

Command A is an open-weights enterprise model from Cohere built for tool use, agentic workflows, retrieval-augmented generation (RAG), and multilingual tasks, with a context window of 256K tokens.

Tool Use
index.ts
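import { streamText } from 'ai'

const result = streamText({
  model: 'cohere/command-a',
  prompt: 'Why is the sky blue?'
})

// Print the response as it streams in
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}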

Playground

Try out Command A by Cohere. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
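A provider preference can travel with the request itself. The sketch below only builds the options object one would hand to `streamText`. It assumes the gateway reads the preference from `providerOptions.gateway.order` and that the slug copied from the table is `cohere`. This page states neither, and `withProviderOrder` is a hypothetical helper, so check the docs before relying on it.

```typescript
// Hypothetical helper: builds options for streamText() from the 'ai' package.
// Assumption: provider preference is read from providerOptions.gateway.order.
type GatewayRequest = {
  model: string
  prompt: string
  providerOptions: { gateway: { order: string[] } }
}

function withProviderOrder(prompt: string, order: string[]): GatewayRequest {
  return {
    model: 'cohere/command-a',
    prompt,
    // Copy the array so later changes by the caller do not leak into the request
    providerOptions: { gateway: { order: [...order] } },
  }
}

// 'cohere' is an assumed slug; use the one copied from the provider table
const request = withProviderOrder('Why is the sky blue?', ['cohere'])
console.log(JSON.stringify(request, null, 2))
```

Keeping the preference in a plain object makes it easy to log or unit-test before any request is sent.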

Provider: Cohere (Legal: Terms, Privacy)
Context: 256K
Latency: 0.2s
Throughput: 79tps
Input: $2.50/M
Output: $10.00/M
Release Date: 03/13/2025

Throughput

P50 throughput on live AI Gateway traffic, in tokens per second (TPS). Visit the docs for more info.

Latency

P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.

Uptime

Direct request success rate on AI Gateway and per-provider. Visit the docs for more info.

More models by Cohere

Context: 32K · Price: $2/K · Provider: Cohere · Release Date: 12/11/2025
Context: 32K · Price: $2.5/K · Provider: Cohere · Release Date: 12/11/2025
Price: $0.12/M · Provider: Cohere · Release Date: 04/15/2025
Context: 4K · Price: $2/K · Provider: Bedrock · Release Date: 12/02/2024

About Command A

Command A is a 111-billion-parameter open-weights language model released March 13, 2025. Its transformer stack interleaves two kinds of attention: three sliding-window layers (4,096-token window each) for every one global attention layer. The sliding-window layers model local context cheaply, while the global layer lets tokens interact across the full 256K-token context. This hybrid design lets one model handle both local detail and long-range dependencies.
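The difference between the two layer types comes down to which earlier tokens a given token may attend to. The following is a minimal sketch of that visibility rule, not Cohere's implementation. It assumes layers repeat in groups of four with the global layer last, and that attention is causal. The names `isGlobalLayer` and `canAttend` are made up for illustration.

```typescript
// Assumptions (not stated on this page): layers repeat in groups of four,
// the fourth layer of each group is global, and attention is causal.
const WINDOW = 4096

function isGlobalLayer(layerIndex: number): boolean {
  return layerIndex % 4 === 3
}

// True when the token at queryPos may attend to the token at keyPos.
function canAttend(layerIndex: number, queryPos: number, keyPos: number): boolean {
  if (keyPos > queryPos) return false // causal: no looking ahead
  if (isGlobalLayer(layerIndex)) return true // global: the whole prefix is visible
  return queryPos - keyPos < WINDOW // sliding window: the most recent 4,096 tokens
}

console.log(canAttend(0, 10_000, 9_000)) // true: 1,000 tokens back, inside the window
console.log(canAttend(0, 10_000, 100)) // false: 9,900 tokens back, outside the window
console.log(canAttend(3, 10_000, 100)) // true: global layer sees the full prefix
```

Under these assumptions, three of every four layers do work proportional to the window size rather than the full sequence length, which is where the efficiency of the hybrid design comes from.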

Tool use and multi-step ReAct agent behavior are core strengths. According to Cohere, Command A delivers 150% higher throughput than Command R+ 08-2024 on comparable setups. Structured output, citation generation, safety modes, and RAG are built-in API features rather than prompt-only patterns. The knowledge cutoff is June 2024.

Command A covers 23 languages: English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, Chinese (Mandarin), Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian. This breadth supports multilingual enterprise deployments without separate regional models.
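An application can check a request's language against this list before routing it to the model. The sketch below is one way to do that. The ISO 639-1 codes are an assumption (the page gives language names only), and `isSupportedLanguage` is a hypothetical helper.

```typescript
// Assumption: ISO 639-1 codes; the page lists language names, not codes.
const SUPPORTED_LANGUAGES = new Map<string, string>([
  ['en', 'English'], ['fr', 'French'], ['es', 'Spanish'], ['it', 'Italian'],
  ['de', 'German'], ['pt', 'Portuguese'], ['ja', 'Japanese'], ['ko', 'Korean'],
  ['ar', 'Arabic'], ['zh', 'Chinese (Mandarin)'], ['ru', 'Russian'], ['pl', 'Polish'],
  ['tr', 'Turkish'], ['vi', 'Vietnamese'], ['nl', 'Dutch'], ['cs', 'Czech'],
  ['id', 'Indonesian'], ['uk', 'Ukrainian'], ['ro', 'Romanian'], ['el', 'Greek'],
  ['hi', 'Hindi'], ['he', 'Hebrew'], ['fa', 'Persian'],
])

// Accepts plain codes ('pt') and region-tagged locales ('pt-BR', 'pt_BR').
function isSupportedLanguage(locale: string): boolean {
  const base = locale.toLowerCase().replace(/[-_].*$/, '')
  return SUPPORTED_LANGUAGES.has(base)
}

console.log(isSupportedLanguage('pt-BR')) // true
console.log(isSupportedLanguage('sv')) // false: Swedish is not in the list
```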

See https://docs.cohere.com/v2/docs/command-a for API details. Where the API enforces an output limit, a single generation is capped at 8K tokens.
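These two limits suggest a simple budgeting step when assembling a long prompt. The sketch below is illustrative only. It reads "256K" and "8K" as 256,000 and 8,000 tokens and assumes the reserved output counts against the same window, both of which are assumptions. `selectDocuments` and the token counts are hypothetical.

```typescript
// Assumptions: 256K = 256,000 tokens, 8K = 8,000 tokens, and output tokens
// share the context window with the prompt.
const CONTEXT_WINDOW = 256_000
const MAX_OUTPUT = 8_000

type PromptParts = {
  instructions: number // token count of the system instructions
  question: number // token count of the user question
  documents: number[] // token counts of retrieved documents, best first
}

// Returns the indexes of the documents that fit, keeping ranked order and
// stopping at the first document that would overflow the budget.
function selectDocuments(parts: PromptParts, reservedOutput = MAX_OUTPUT): number[] {
  let remaining = CONTEXT_WINDOW - reservedOutput - parts.instructions - parts.question
  let full = false
  const kept: number[] = []
  parts.documents.forEach((size, index) => {
    if (full || size > remaining) {
      full = true
      return
    }
    kept.push(index)
    remaining -= size
  })
  return kept
}

const kept = selectDocuments({
  instructions: 2_000,
  question: 500,
  documents: [100_000, 120_000, 40_000, 5_000],
})
console.log(kept) // [ 0, 1 ]: the third document would exceed the remaining 25,500 tokens
```

Stopping at the first overflow, rather than skipping ahead to smaller documents, keeps the retrieval ranking intact. Whether that trade-off is right depends on the application.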

What To Consider When Choosing a Provider

  • Configuration: For enterprise deployments across regions or language markets, confirm that your provider choice meets data residency requirements. Command A covers 23 languages, so a single deployment may serve several markets, each with its own residency and language policy.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.

When to Use Command A

Best For

  • Multi-step ReAct agents: Tool use and decision chaining spanning many turns
  • Large-corpus RAG: Context window of 256K tokens fits retrieval results, source documents, and instructions in one prompt
  • Multilingual applications: Single model covers any of the 23 supported training languages without separate regional variants
  • Citation workflows: Legal review, regulatory compliance, and research summarization where traceable sourcing is required
  • High-throughput production: Higher throughput than Command R+ 08-2024 can lower infrastructure cost at scale
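The multi-step ReAct pattern in the first item can be sketched as a loop that alternates model decisions with tool calls. The example below is a toy under stated assumptions, not how Command A or AI Gateway implement agents. `stubModel` stands in for a real model call, and the `check_stock` tool and its data are invented for illustration.

```typescript
// A step is either a tool call or a final answer.
type Step =
  | { kind: 'tool'; name: string; input: string }
  | { kind: 'final'; answer: string }

type Tool = (input: string) => string

// Invented tool with toy data
const tools = new Map<string, Tool>([
  ['check_stock', (sku) => (sku === 'widget-42' ? '17 units' : 'unknown')],
])

// Stub standing in for the model: asks for a tool first, then answers
// from the observation once one is in the transcript.
function stubModel(transcript: string[]): Step {
  const observation = transcript.find((line) => line.startsWith('Observation: '))
  if (observation === undefined) {
    return { kind: 'tool', name: 'check_stock', input: 'widget-42' }
  }
  const value = observation.slice('Observation: '.length)
  return { kind: 'final', answer: `Stock for widget-42: ${value}` }
}

// Reason, act, observe, repeat, with a step limit as a safety stop.
function runAgent(question: string, maxSteps = 5): string {
  const transcript = [`Question: ${question}`]
  for (let step = 0; step < maxSteps; step++) {
    const next = stubModel(transcript)
    if (next.kind === 'final') return next.answer
    const tool = tools.get(next.name)
    const result = tool ? tool(next.input) : `no tool named ${next.name}`
    transcript.push(`Action: ${next.name}(${next.input})`, `Observation: ${result}`)
  }
  return 'Stopped: step limit reached'
}

console.log(runAgent('How many widget-42 units are in stock?'))
// prints "Stock for widget-42: 17 units"
```

In a real agent the stub would be replaced by a model call that receives the transcript and the tool definitions. The step limit guards against loops that never reach a final answer.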

Consider Alternatives When

  • Creative generation focus: Workload centers on open-ended creative writing or conversation rather than structured enterprise tasks
  • Reasoning specialist needed: Mathematical or formal logic chains need a reasoning-tuned model
  • Smaller context suffices: Your prompts fit comfortably in a shorter context, so a smaller model would reduce per-token cost at your volume
  • Unsupported language: Your target language falls outside the 23 supported languages

Conclusion

Command A targets enterprise teams that need an open-weights foundation for agent systems, multilingual deployments, and retrieval-heavy pipelines, with unified billing through AI Gateway. It brings a 256K-token context window, coverage of 23 languages, and the throughput gains Cohere cites over Command R+ 08-2024.

Frequently Asked Questions

  • What is Command A's hybrid attention architecture?

    It combines three sliding-window attention layers (4,096-token window each) with one global attention layer that is not restricted to a window. The sliding windows handle local context efficiently. The global layer models long-range dependencies across the full context window of 256K tokens.

  • How does Command A compare to Command R+ 08-2024?

    Command A delivers higher throughput than Command R+ 08-2024. Cohere positions it as the successor tuned for agentic performance and enterprise task execution.

  • Which languages does Command A support?

    23 languages: English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, Chinese (Mandarin), Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian.

  • Does Command A support citation generation natively?

    Yes. Citation generation is a built-in feature, so you can ground outputs in retrieved or cited sources without bolting on a separate citation stack.

  • What is the knowledge cutoff date for Command A?

    June 2024.

  • Is Command A open-weights?

    Yes. Cohere released it as an open-weights model. On Hugging Face it lives under CohereLabs/c4ai-command-a-03-2025.

  • How much does Command A cost on AI Gateway?

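    At the time of this listing, the Cohere provider is priced at $2.50 per million input tokens and $10.00 per million output tokens. See the pricing section on this page for current rates. AI Gateway exposes each provider's pricing for Command A.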
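A per-request cost estimate follows directly from per-million-token rates. The sketch below uses the $2.50/M input and $10.00/M output figures shown in the provider table on this page. Rates can change, and `estimateCostUsd` is a hypothetical helper rather than part of any SDK.

```typescript
// Rates copied from the provider table on this page; they may change.
const INPUT_USD_PER_MILLION = 2.5
const OUTPUT_USD_PER_MILLION = 10

function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION
  return inputCost + outputCost
}

// A long RAG prompt (200,000 tokens in) with a short answer (2,000 tokens out)
console.log(estimateCostUsd(200_000, 2_000).toFixed(4)) // "0.5200"
```

Because output tokens cost four times as much as input tokens at these rates, long prompts with short answers are comparatively cheap per token.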