DeepSeek R1 0528 is DeepSeek's open-source reasoning model, released May 28, 2025 as an update to the original R1 (released January 20, 2025). It scores 79.8% Pass@1 on AIME 2024 and 97.3% on MATH-500. Weights ship under the MIT License for commercial use.
import { streamText } from 'ai'

const result = streamText({
  model: 'deepseek/deepseek-r1',
  prompt: 'Why is the sky blue?',
})

What To Consider When Choosing a Provider
- Configuration: DeepSeek R1 0528 generates verbose reasoning traces before final answers. Budget output tokens generously and account for variable response length when estimating costs.
- Zero Data Retention: AI Gateway supports Zero Data Retention for this model on direct gateway requests; bring-your-own-key (BYOK) traffic is not covered. See the AI Gateway documentation to configure it.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
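Because the final answer arrives only after a reasoning trace of unpredictable length, it helps to budget output tokens explicitly rather than sizing limits to the answer alone. A minimal sketch of that budgeting arithmetic; the multiplier and safety margin are illustrative assumptions, not published figures:

```typescript
// Estimate a maxOutputTokens budget for a reasoning model whose
// responses prepend a verbose chain-of-thought to the final answer.
// The default multiplier and margin below are assumed values for
// illustration; calibrate them against your own traffic.
function outputTokenBudget(
  expectedAnswerTokens: number,
  reasoningMultiplier = 4, // assumed: reasoning tokens per answer token
  safetyMargin = 1.25, // assumed headroom for long outliers
): number {
  return Math.ceil(
    expectedAnswerTokens * (1 + reasoningMultiplier) * safetyMargin,
  )
}

// A 500-token answer with ~4x as many reasoning tokens:
console.log(outputTokenBudget(500)) // 3125
```

The resulting number can be passed as the generation's output-token limit; the same arithmetic feeds cost estimates, since billing counts reasoning tokens as output.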
When to Use DeepSeek R1 0528
Best For
- Competitive mathematics: Formal proof construction and quantitative reasoning where AIME 2024 and MATH-500 benchmark results match your task
- Code generation and debugging: Algorithm design where RL-derived problem-solving patterns produce self-correcting chains before final output
- Complex analytical reasoning: Multi-step reasoning in finance, science, and engineering where showing work and self-verification build trust
Consider Alternatives When
- Conversation or summarization: Extended reasoning traces add unnecessary output token cost for content generation workloads
- Hybrid thinking modes: DeepSeek-V3.1 or later supports both thinking and non-thinking modes through the same endpoint
- Strict latency requirements: Variable response times from long reasoning chains are not acceptable when latency is a hard constraint
- Pure creative writing: Structured reasoning adds no quality benefit for open-ended generation tasks
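One practical consequence of the trade-offs above is routing: send reasoning-heavy tasks to R1 and everything else to a cheaper general-purpose model. A hypothetical sketch; the task categories and the fallback slug `deepseek/deepseek-v3` are illustrative assumptions, not part of any gateway API:

```typescript
// Hypothetical router mirroring the guidance above: deep-reasoning
// tasks go to R1, conversational and creative work to a faster,
// cheaper general-purpose model (fallback slug is an assumption).
type Task = 'math' | 'code' | 'analysis' | 'chat' | 'summarize' | 'creative'

function pickModel(task: Task): string {
  const reasoningTasks: Task[] = ['math', 'code', 'analysis']
  return reasoningTasks.includes(task)
    ? 'deepseek/deepseek-r1'
    : 'deepseek/deepseek-v3'
}

console.log(pickModel('math')) // deepseek/deepseek-r1
console.log(pickModel('summarize')) // deepseek/deepseek-v3
```

The returned slug can be passed directly as the `model` string in the earlier `streamText` example.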
Conclusion
DeepSeek R1 0528 matches closed-source models on published benchmarks while shipping weights under the MIT License. For math, code, and formal reasoning workloads, it is a strong fit for teams that need open weights.
Frequently Asked Questions
How was DeepSeek R1 0528 trained differently from other reasoning models?
DeepSeek applied reinforcement learning directly to the base model, bypassing the conventional step of training on human-written reasoning traces. Reasoning patterns like self-verification and reflection emerged from RL exploration rather than curated data.
What are DeepSeek R1 0528's benchmark scores on mathematics?
It scores 79.8% Pass@1 on AIME 2024, on par with OpenAI o1 at release, and 97.3% on MATH-500.
What does the MIT License mean for using DeepSeek R1 0528 outputs commercially?
The MIT License permits commercial use of the model and places no restrictions on what you do with its outputs. Many proprietary reasoning models impose stricter terms.
What is the context window and architecture of DeepSeek R1 0528?
DeepSeek R1 0528 has a context window of 160K tokens. The architecture is a Mixture-of-Experts (MoE) with 671B total parameters, of which 37B are activated per forward pass.
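The MoE figures imply that only a small fraction of the network runs per token, which is why a 671B-parameter model can serve completions at a 37B-model's compute cost. A quick sanity check on that ratio:

```typescript
// Fraction of DeepSeek R1 0528's parameters active per forward pass,
// using the totals stated above (671B total, 37B activated).
const totalParams = 671e9
const activeParams = 37e9
const activeFraction = activeParams / totalParams

console.log((activeFraction * 100).toFixed(1) + '%') // 5.5%
```

Roughly one parameter in eighteen participates in any given forward pass.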
When should I use DeepSeek R1 0528 versus DeepSeek-V3 or V3.1?
DeepSeek R1 0528 specializes in deep reasoning with extended chain-of-thought. DeepSeek-V3 and later variants are general-purpose models that balance reasoning with faster, lower-cost completions and suit mixed-workload deployments better.
Does the reasoning trace appear in the API response?
Yes. The chain-of-thought trace appears in the response. This helps with debugging and with applications that display the model's reasoning to end users.
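When consuming raw completions (for example, through an OpenAI-compatible endpoint rather than SDK-structured response parts), R1's trace is commonly delimited by `<think>` tags ahead of the final answer. A minimal splitter, assuming that delimiter format; the sample string is illustrative:

```typescript
// Separate a <think>...</think> reasoning trace from the final answer
// in a raw R1-style completion. Assumes the trace, when present,
// appears once at the start of the output.
function splitReasoning(raw: string): { reasoning: string; answer: string } {
  const match = raw.match(/<think>([\s\S]*?)<\/think>/)
  if (!match) return { reasoning: '', answer: raw.trim() }
  return {
    reasoning: match[1].trim(),
    answer: raw.slice(match.index! + match[0].length).trim(),
  }
}

const sample =
  '<think>Rayleigh scattering favors short wavelengths.</think>' +
  'The sky is blue because shorter wavelengths scatter more.'
const { reasoning, answer } = splitReasoning(sample)
console.log(answer)
```

Keeping the split explicit lets an application log or display the trace separately from the answer shown to end users.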