Having set a benchmark for AI chatbots with the launch of Gemini 3.0 last year, Google parent Alphabet Inc. has now announced another upgrade to its reasoning mode, Gemini 3 Deep Think, pitching the tool as a primary collaborator for scientists and engineers tackling "messy" real-world data.
In an update shared by CEO Sundar Pichai, the company said Gemini 3 Deep Think pushes AI into territory previously reserved for doctoral-level reasoning.
According to benchmarks shared by the company, the updated Deep Think model scored 84.6% on the ARC-AGI-2 test, which is widely regarded as a gold standard for measuring a model's ability to learn new tasks.
"We've refined Deep Think in close partnership with scientists and researchers to tackle tough, real-world challenges," Pichai said in a post on X.
The post, dated February 12, 2026, added that the model is "pushing the frontier across the most challenging benchmarks."
The updated Deep Think model has also set a new high-water mark on Humanity's Last Exam, an academic reasoning benchmark that gauges how an AI model performs without the help of any external tools. The model scored 48.4%, well ahead of the competition. The recently launched Claude Opus 4.6 comes closest with 40%.
How To Use Google Gemini 3 Deep Think
The updated Deep Think model is now available to Google AI Ultra subscribers. Enterprises and researchers can also apply for early API access, which would be a first for Google.
This strategic pivot signals Google's intent to capture the high-end enterprise market, specifically in fields like semiconductor crystal growth and pharmaceutical research, where problems often lack a single correct solution.
The upgrade comes as competition in AI reasoning intensifies, with OpenAI's "o1" and Anthropic's Claude Opus 4.6 making big strides in logic.