
DeepSeek R1 0528


DeepSeek R1 0528 is DeepSeek's open-source reasoning model, released May 28, 2025 as an updated checkpoint of DeepSeek R1. It scores 79.8% Pass@1 on AIME 2024 and 97.3% on MATH-500. Weights ship under the MIT License for commercial use.

Capabilities: Reasoning · Implicit Caching
index.ts

```typescript
import { streamText } from 'ai'

const result = streamText({
  model: 'deepseek/deepseek-r1',
  prompt: 'Why is the sky blue?',
})

// Consume the stream; reasoning models can produce long outputs.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}
```

What To Consider When Choosing a Provider

  • Configuration: DeepSeek R1 0528 generates verbose reasoning traces before final answers. Budget output tokens generously and account for variable response length when estimating costs.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
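The output-budgeting point above can be made concrete with a rough cost estimator. This is a sketch with placeholder rates, not published pricing; substitute your provider's actual numbers:

```typescript
// Rough cost estimator for reasoning-model calls.
// Rates are PLACEHOLDER assumptions, not DeepSeek or gateway pricing.
function estimateCostUSD(
  inputTokens: number,
  outputTokens: number,
  inputPerMTok = 0.5,   // $ per 1M input tokens (placeholder)
  outputPerMTok = 2.0,  // $ per 1M output tokens (placeholder)
): number {
  return (inputTokens / 1e6) * inputPerMTok + (outputTokens / 1e6) * outputPerMTok
}

// Reasoning traces inflate output tokens well past the visible answer length.
console.log(estimateCostUSD(2_000, 8_000).toFixed(4))
```

Because output tokens usually dominate for reasoning models, allowing a longer reasoning budget raises per-request cost roughly in proportion.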

When to Use DeepSeek R1 0528

Best For

  • Competitive mathematics: Formal proof construction and quantitative reasoning where AIME 2024 and MATH-500 benchmark results match your task
  • Code generation and debugging: Algorithm design where RL-derived problem-solving patterns produce self-correcting chains before final output
  • Complex analytical reasoning: Multi-step reasoning in finance, science, and engineering where showing work and self-verification build trust

Consider Alternatives When

  • Conversation or summarization: Extended reasoning traces add unnecessary output token cost for content generation workloads
  • Hybrid thinking modes: DeepSeek-V3.1 or later supports both thinking and non-thinking modes through the same endpoint
  • Strict latency requirements: Variable response times from long reasoning chains are not acceptable when latency is a hard constraint
  • Pure creative writing: Structured reasoning adds no quality benefit for open-ended generation tasks
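One way to act on these tradeoffs is to route by workload. The helper below is a hypothetical sketch; the V3.1 model identifier is an assumption for illustration, not taken from this page:

```typescript
type Workload = 'math' | 'code-review' | 'chat' | 'summarization'

// Route reasoning-heavy tasks to R1 0528; send latency-sensitive or
// content-generation work to a general-purpose model instead.
function pickModel(workload: Workload): string {
  const needsDeepReasoning = workload === 'math' || workload === 'code-review'
  // 'deepseek/deepseek-v3.1' is an assumed identifier for illustration.
  return needsDeepReasoning ? 'deepseek/deepseek-r1' : 'deepseek/deepseek-v3.1'
}

console.log(pickModel('math'))
console.log(pickModel('summarization'))
```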

Conclusion

DeepSeek R1 0528 matches closed-source models on published benchmarks while shipping weights under the MIT License. For math, code, and formal reasoning workloads, it is a strong fit for teams that need open weights.

Frequently Asked Questions

  • How was DeepSeek R1 0528 trained differently from other reasoning models?

    DeepSeek applied reinforcement learning directly to the base model, bypassing the conventional step of training on human-written reasoning traces. Reasoning patterns like self-verification and reflection emerged from RL exploration rather than curated data.

  • What are DeepSeek R1 0528's benchmark scores on mathematics?

    It scores 79.8% Pass@1 on AIME 2024, on par with OpenAI o1 at its release, and 97.3% on MATH-500.

  • What does the MIT License mean for using DeepSeek R1 0528 outputs commercially?

    The MIT License permits commercial use, modification, and redistribution, with the sole condition of preserving the license notice. Many proprietary reasoning models impose stricter restrictions on how outputs may be used.

  • What is the context window and architecture of DeepSeek R1 0528?

    The context window is roughly 160K tokens. The architecture is a Mixture-of-Experts (MoE) with 671B total parameters, of which 37B are activated per forward pass.
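    The sparsity those figures imply can be checked with quick arithmetic (the parameter counts are the ones stated above):

    ```typescript
    // 37B of 671B parameters active per forward pass.
    const totalParams = 671e9
    const activeParams = 37e9
    const activeFraction = activeParams / totalParams
    // Only a small slice of the network participates in any single token's computation.
    console.log(`${(activeFraction * 100).toFixed(1)}% of parameters active per token`)
    ```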

  • When should I use DeepSeek R1 0528 versus DeepSeek-V3 or V3.1?

    DeepSeek R1 0528 specializes in deep reasoning with extended chain-of-thought. DeepSeek-V3 and later variants are general-purpose models that balance reasoning with faster, lower-cost completions and suit mixed-workload deployments better.

  • Does the reasoning trace appear in the API response?

    Yes. The chain-of-thought trace appears in the response. This helps with debugging and with applications that display the model's reasoning to end users.
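    Since reasoning and answer content arrive together, applications often need to separate them. Below is a minimal sketch; the part shapes are assumptions loosely modeled on streamed response parts, not the AI SDK's exact types:

    ```typescript
    // Hypothetical part shapes (an assumption, not the AI SDK's exact types).
    type StreamPart =
      | { type: 'reasoning'; text: string }
      | { type: 'text'; text: string }

    // Accumulate the chain-of-thought separately from the final answer.
    function splitTrace(parts: StreamPart[]): { reasoning: string; answer: string } {
      let reasoning = ''
      let answer = ''
      for (const part of parts) {
        if (part.type === 'reasoning') reasoning += part.text
        else answer += part.text
      }
      return { reasoning, answer }
    }

    const demo: StreamPart[] = [
      { type: 'reasoning', text: 'Rayleigh scattering favors short wavelengths. ' },
      { type: 'text', text: 'The sky looks blue because shorter wavelengths scatter more.' },
    ]
    console.log(splitTrace(demo).answer)
    ```

    This lets a UI show the trace in a collapsible panel while rendering only the final answer by default.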