Google's Gemini 3 brings improved reasoning and multimodal understanding

On Tuesday, Google officially launched Gemini 3, described by CEO Sundar Pichai as the company's "most intelligent model". According to the official announcement, Gemini 3 brings improvements in four key areas: reasoning skills, multimodal understanding, agentic workflows, and vibe coding capabilities.

Benchmarks continue to be the all-important measure of model performance, and as expected, Google reports impressive benchmark results for Gemini 3 Pro: it topped the LMArena Leaderboard with an Elo score of 1501 and achieved 91.9% on GPQA Diamond for PhD-level reasoning. The model also sets new standards in mathematics (23.4% on MathArena Apex), multimodal tasks (81% on MMMU-Pro and 87.6% on Video-MMMU), and factual accuracy (72.1% on SimpleQA Verified).

Gemini 3 Deep Think mode, which will be available soon to Google AI Ultra subscribers, displays even stronger benchmark performance, with scores of 41% on Humanity's Last Exam and 45.1% on ARC-AGI-2. Google AI Ultra subscribers can also try out Gemini Agent's improved planning and decision-making capabilities. Gemini 3's agentic capabilities led the model to top the Vending-Bench 2 leaderboard, which tests a model's ability to manage a simulated vending machine business.

Google reports that Gemini 3 was able to maintain consistency in the decisions it made and the tools it used during a full simulated year of operation, and expects these capabilities to translate into the ability to support users by taking over multi-step workflows, such as booking services or organizing an email inbox.

Beyond raw performance, Google emphasizes that Gemini 3 delivers more nuanced, direct responses. The company describes it as a "true thought partner" with an improved ability to understand context and intent behind user requests. The official announcement even stresses how Gemini 3 has traded "cliche and flattery" for "genuine insight", claiming that Gemini 3 "[tells] you what you need to hear, not just what you want to hear."

For developers, Google is positioning Gemini 3 as its best "vibe coding" model yet. The company claims that Gemini 3 is adept at zero-shot generation and at processing complex prompts to deliver rich and interactive web interfaces. Alongside the model release, Google introduced Antigravity, a new agentic development platform where AI agents can autonomously plan and execute complex software tasks.

To mark the beginning of this new chapter in the Gemini saga, Google announced it is previewing Gemini 3 Pro and making it available across several of its products, including AI Mode in Search, the Gemini app, AI Studio, Vertex AI, and Google Antigravity. To help users make the most of the model, Gemini 3 is also available in third-party coding and agentic platforms, including Cursor, GitHub, Manus, and Replit.
