Google introduced Gemini 2.5 Pro on March 25, 2025 as the flagship of the Gemini 2.5 generation of thinking models. Reasoning is its headline capability: Gemini 2.5 models reason through their thoughts before responding, and Google attributes the resulting performance to the combination of a significantly enhanced base model and improved post-training. On reasoning benchmarks, 2.5 Pro posts strong results in math and science (including GPQA and AIME 2025) without majority voting or other test-time techniques that increase cost. On Humanity's Last Exam, a dataset designed by hundreds of subject matter experts to capture the human frontier of knowledge and reasoning, 2.5 Pro scores 18.8% without tool use.
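To make concrete what "without majority voting" rules out, here is a minimal sketch of that test-time technique: the same question is sampled several times and the most common answer is kept. The `sample_answer` callable is a hypothetical stand-in for one model call, not part of any Gemini API, and the sketch only illustrates why the approach multiplies inference cost by the number of samples.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(answers: Sequence[str]) -> str:
    """Return the most frequent answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    return Counter(answers).most_common(1)[0][0]


def answer_with_voting(sample_answer: Callable[[], str], k: int) -> str:
    """Sample k answers and keep the majority.

    `sample_answer` is a hypothetical callable standing in for a single
    model call, so this costs k calls instead of one.
    """
    return majority_vote([sample_answer() for _ in range(k)])


if __name__ == "__main__":
    # Stubbed sampler that replays fixed answers, for illustration only.
    canned = iter(["42", "41", "42", "42", "40"])
    print(answer_with_voting(lambda: next(canned), k=5))
```

A single-attempt score, as reported for 2.5 Pro, corresponds to `k = 1`, where no voting takes place.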
Coding performance received particular attention. Gemini 2.5 Pro represents a significant leap over the 2.0 generation in creating web apps and agentic code applications, along with code transformation and editing. On SWE-Bench Verified, the industry-standard benchmark for agentic code evaluation, it scores 63.8% with a custom agent setup. It can generate a playable video game from a single-line prompt.
Gemini 2.5 Pro ships with a context window of 1 million tokens, the largest among Gemini 2.5 models, and accepts text, audio, images, video, and entire code repositories as input. Tool use, including Google Search and code execution, is available.
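As a rough illustration of what a 1 million token window means for a code repository, the sketch below estimates whether a set of files would fit. The four-characters-per-token ratio is an assumed rule of thumb, not the Gemini tokenizer, and the function names are hypothetical; an exact count would have to come from the model's own token-counting facility.

```python
from typing import Mapping

CONTEXT_WINDOW_TOKENS = 1_000_000  # window size quoted in the text
CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(files: Mapping[str, str], reserved_tokens: int = 0) -> bool:
    """Check whether all file contents plus a reserved budget fit the window.

    `files` maps a path to its contents; `reserved_tokens` sets aside room
    for the prompt and the response.
    """
    total = sum(estimate_tokens(body) for body in files.values())
    return total + reserved_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    repo = {"main.py": "print('hello')\n" * 1000, "README.md": "# Demo\n"}
    print(fits_in_context(repo, reserved_tokens=8_000))
```

Under this heuristic the window holds roughly 4 MB of source text, so the estimate is best treated as a quick sanity check before sending a large repository.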