Gemini 3 Pro: Day 10 Brings More Tempered Enthusiasm

No, Gemini 3 Pro isn’t everywhere in Google’s ecosystem. But still…

The American company has been remarkably quick to integrate this model into its services, all the way to its search engine, in “AI mode.” The rollout starts in the United States, for Google AI Pro and Ultra subscribers, who will also get a preview of automatic routing of queries to the appropriate model.

A model with more interactive responses

Gemini 3 Pro brings “generative UIs.” In response to a query, the model can display a magazine-style view (visual layout) or even code an interactive canvas (dynamic view).

This capability isn’t limited to Google Search. It’s also available in the Gemini app. The model is accessible to all users there. It is accompanied by a new Gemini Agent feature, currently reserved for AI Ultra subscribers. Inspired by Project Mariner (an autonomous agent for web navigation), it orchestrates multi-step tasks linked to Google services.

Antigravity, a showroom for agentic coding

Google has also made room for Gemini 3 Pro in its developer tools*. Among them is a newcomer: Antigravity. This IDE is available in preview on Windows, Mac, and Linux. It pairs the code-editing interface with a second one: an agent-control center, structured into workspaces, with a centralized messaging system. This second UI shows no code: the agents produce “artifacts” (task lists, implementation plans, summaries of actions taken) on which the user can give feedback without interrupting execution. Gemini 3 Pro can serve as the main model, as can Claude Sonnet 4.5 and GPT-OSS, with two thinking modes: dynamic/high or low.

Vision levels in addition to thinking levels

The same setting is found on the Gemini API, pending an additional medium option, through the thinking_level parameter. It isn’t specific to Gemini 3 Pro, unlike the media_resolution parameter, which determines the maximum number of tokens allocated to vision. It can be set for each input media item or globally. If it is not set, default values apply (1120 tokens per image, 560 per PDF page, 70 per video frame, or 280 per frame for videos that contain a lot of text).
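To make the default values concrete, here is a minimal sketch that estimates the maximum vision token budget of a request when media_resolution is left unset. It only does arithmetic on the figures quoted above; the function name, the dictionary keys, and the sample request are hypothetical, and nothing here calls the Gemini API.

```python
# Default per-item vision token budgets quoted in the article
# (applied when media_resolution is not set). Keys are our own labels.
DEFAULT_VISION_TOKENS = {
    "image": 1120,
    "pdf_page": 560,
    "video_frame": 70,
    "video_frame_text_heavy": 280,
}


def estimate_vision_tokens(items: dict[str, int]) -> int:
    """Return the maximum vision tokens for a request.

    `items` maps a media kind (a key of DEFAULT_VISION_TOKENS) to a count.
    """
    total = 0
    for kind, count in items.items():
        if kind not in DEFAULT_VISION_TOKENS:
            raise ValueError(f"unknown media kind: {kind}")
        if count < 0:
            raise ValueError("count must be non-negative")
        total += DEFAULT_VISION_TOKENS[kind] * count
    return total


if __name__ == "__main__":
    # Hypothetical request: 3 images, a 10-page PDF, 60 video frames.
    request = {"image": 3, "pdf_page": 10, "video_frame": 60}
    print(estimate_vision_tokens(request))
```

With these defaults, a single image weighs as much as 16 ordinary video frames, which is why lowering the resolution per media item can matter on video-heavy requests.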

The pricing for Gemini 3 Pro on the Gemini API:

Input: $2 per million tokens for requests under 200,000 tokens ($4 otherwise)
Output: $12 per million tokens for requests under 200,000 tokens ($18 otherwise)
Context caching: $0.20 per million tokens for requests under 200,000 tokens ($0.40 otherwise); storage: $4.50 per million tokens per hour
Grounding with Google Search (not yet available): 5,000 free requests, then $14 per 1,000

As a reminder, Gemini 2.5 Pro is priced at $1.25 and $2.50 for input; at $10 and $15 for output.
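The price difference is easier to judge on a concrete request. The sketch below compares the two models using only the input and output rates quoted above. It assumes that the 200,000-token threshold is tested against the input size and that Gemini 2.5 Pro uses the same threshold; both points are assumptions, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

# Assumption: the threshold applies to the number of input tokens.
LONG_CONTEXT_THRESHOLD = 200_000


@dataclass(frozen=True)
class Pricing:
    """Rates in dollars per million tokens, as quoted in the article."""
    input_short: float
    input_long: float
    output_short: float
    output_long: float


GEMINI_3_PRO = Pricing(2.00, 4.00, 12.00, 18.00)
GEMINI_2_5_PRO = Pricing(1.25, 2.50, 10.00, 15.00)


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in dollars of one request (caching and grounding excluded)."""
    is_long = input_tokens >= LONG_CONTEXT_THRESHOLD
    in_rate = pricing.input_long if is_long else pricing.input_short
    out_rate = pricing.output_long if is_long else pricing.output_short
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 100,000 input tokens, 10,000 output tokens.
    for name, pricing in (("Gemini 3 Pro", GEMINI_3_PRO), ("Gemini 2.5 Pro", GEMINI_2_5_PRO)):
        print(f"{name}: ${request_cost(pricing, 100_000, 10_000):.3f}")
```

For that sample request, the estimate comes to about $0.32 on Gemini 3 Pro versus about $0.225 on Gemini 2.5 Pro.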

At tier 1 of the API, the limits are 50 requests per minute (RPM), 1 million tokens per minute (TPM), and 1,000 requests per day (RPD).
At tier 2 (at least $250 spent), they rise to 1,000 RPM, 5 million TPM, and 50,000 RPD. At tier 3 (at least $1,000), they go to 2,000 RPM and 8 million TPM, with no daily request cap.
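As a rough illustration of what these request limits mean in practice, the sketch below derives, for each threshold, the minimum even spacing between requests and how long sustained maximum-rate traffic would take to exhaust the daily cap. It uses only the per-minute and per-day request figures quoted above, keyed by the number used in the text; the names are hypothetical and token limits are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RateLimit:
    rpm: int             # requests per minute
    rpd: Optional[int]   # requests per day; None means no daily cap


# Request limits as quoted in the article.
REQUEST_LIMITS = {
    1: RateLimit(rpm=50, rpd=1_000),
    2: RateLimit(rpm=1_000, rpd=50_000),
    3: RateLimit(rpm=2_000, rpd=None),
}


def min_interval_seconds(limit: RateLimit) -> float:
    """Smallest even spacing between requests that respects the per-minute cap."""
    return 60.0 / limit.rpm


def minutes_to_daily_cap(limit: RateLimit) -> Optional[float]:
    """Minutes of sustained maximum-rate traffic before the daily cap is reached."""
    if limit.rpd is None:
        return None
    return limit.rpd / limit.rpm


if __name__ == "__main__":
    for number, limit in REQUEST_LIMITS.items():
        print(number, min_interval_seconds(limit), minutes_to_daily_cap(limit))
```

At the first threshold, a client running flat out at 50 requests per minute would use up its 1,000 daily requests in 20 minutes, so the daily cap is the binding constraint there rather than the per-minute one.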

Gemini 3 Pro also has an image mode, at $2 per million tokens for input (text/image) and, for output, $12 (text/thinking) or $120 (images). It is distributed in Google products under the Nano Banana Pro brand (following Nano Banana, built on Gemini 2.5 Flash).

Praise… especially for coding

Nano Banana Pro seems to have impressed Andrej Karpathy, OpenAI co-founder and former head of AI at Tesla. He says he generally has a positive impression of Gemini 3 Pro, across personality, humor, writing, and vibe coding.

Gemini Nano Banana Pro can solve exam questions in the exam page image. With doodles, diagrams, all that.

ChatGPT thinks these solutions are all correct except Se_2P_2 should be “diselenium diphosphide” and a spelling mistake (should be “thiocyanic acid” not “thoicyanic”)

:O pic.twitter.com/15oUx8FIqJ

— Andrej Karpathy (@karpathy) November 23, 2025

Marc Benioff, the head of Salesforce, was more emphatic, as usual: he says he is “not going back.”

Holy shit. I’ve used ChatGPT every day for 3 years. Just spent 2 hours on Gemini 3. I’m not going back. The leap is insane — reasoning, speed, images, video… everything is sharper and faster. It feels like the world just changed, again. ❤️ 🤖 https://t.co/HruXhc16Mq

— Marc Benioff (@Benioff) November 23, 2025

Among the positive impressions, many concern the agentic coding capabilities.

I asked Gemini 3 Pro to create a 3D LEGO editor.
In one shot it nailed the UI, complex spatial logic, and all the functionality.

We’re entering a new era. pic.twitter.com/Y7OndCB8CK

— Pietro Schirano (@skirano) November 18, 2025

Gemini 3 created this playable maze in just three prompts 🤯🤯🤯

First, it created a top down Gemini maze, and then we asked it to build an app that allows me to upload a pixel maze, and turn it into a playable Three JS scene.

Vibing coding my way through mazes from now on pic.twitter.com/9o3vJhPf4I

— Tulsee Doshi (@tulseedoshi) November 18, 2025

“Useful… when it listens to you”

These capabilities are not beyond debate, however, as testimonials in the Cursor community illustrate. They point in particular to a high rate of hallucinations and difficulty following instructions, despite notable planning abilities, including in comparison with OpenAI Codex. According to some, the phenomenon is less pronounced in Antigravity.

Various posts on the Gemini subreddit also highlight Gemini 3 Pro’s hallucinations. For example:

Confusion between two job postings the model was asked to analyze
Attribution of one character’s traits to another during a creative-writing session
Invention of variables in an exercise aimed at producing outputs based on combinations of four variables

“Gemini 3 Pro is very useful… when it listens to you,” one user sums up on the subject of instruction following. That user isn’t alone in noting that the model can be arbitrary at times.

Others mention a certain laziness, which is especially detrimental to creative writing. It reflects, in a way, Google’s promises: a model that is “concise” and “direct,” “without clichés or flattery”…

The benchmark effect

Beyond the performance metrics Google quotes, Gemini 3 Pro stands out on the LMArena benchmark. It has topped the charts in several evaluations. In the latest scoring:

Text: 1492 points (versus 1482 for Grok 4.1 Thinking and 1466 for Claude Opus 4.5)
Vision: 1324 points (versus 1249 for Gemini 2.5 Pro and 1237 for GPT-4o)
Image generation: 1242 points (versus 1161 for Tencent Hunyuan Image 3.0 and 1158 for Gemini 2.5 Flash)
Image editing: 1371 points (versus 1330 for Gemini 2.5 Flash and 1311 for Seedream 4 by ByteDance)

Gemini 3 Pro’s performance is also notable on another benchmark: ARC-AGI-2. This one focuses on knowledge that is “innate” to humans or learned very early in life. It therefore excludes tasks involving languages, which are cultural acquisitions. The aim is to illustrate generalization capabilities. It involves, among other things, symbolic interpretation (understanding the meaning of visual symbols) and compositional reasoning (simultaneous application of several interdependent rules).

Gemini 3 Pro Deep Think reaches a score of 45.1% on ARC-AGI-2, at a cost of $77.16 per task. The gap with Claude Opus 4.5 Thinking is notable, although the latter is much cheaper: 37.6% and $2.40 per task at 64k; 30.6% and $1.29 per task at 32k; 22.8% and $0.79 per task at 16k. Next come GPT-5 Pro (18.3% and $7.14 per task) and Grok 4 Thinking (16% and $2.17 per task).
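One way to read these figures together is to divide the cost per task by the score, giving dollars per percentage point. The sketch below does this on the numbers quoted above. The metric is our own illustrative choice, not one published by the benchmark, and the function names are hypothetical.

```python
# (model, ARC-AGI-2 score in %, cost per task in $), as quoted in the article.
RESULTS = [
    ("Gemini 3 Pro Deep Think", 45.1, 77.16),
    ("Claude Opus 4.5 Thinking (64k)", 37.6, 2.40),
    ("Claude Opus 4.5 Thinking (32k)", 30.6, 1.29),
    ("Claude Opus 4.5 Thinking (16k)", 22.8, 0.79),
    ("GPT-5 Pro", 18.3, 7.14),
    ("Grok 4 Thinking", 16.0, 2.17),
]


def dollars_per_point(score_pct: float, cost_per_task: float) -> float:
    """Cost per task divided by score: lower means more score per dollar."""
    return cost_per_task / score_pct


def ranked_by_efficiency(results):
    """Return the results sorted from cheapest to most expensive per point."""
    return sorted(results, key=lambda r: dollars_per_point(r[1], r[2]))


if __name__ == "__main__":
    for model, score, cost in ranked_by_efficiency(RESULTS):
        print(f"{model}: {score}% at ${cost:.2f}/task, "
              f"${dollars_per_point(score, cost):.3f} per point")
```

By this measure the ranking is the reverse of the leaderboard at the top: Gemini 3 Pro Deep Think scores highest but costs roughly 32 times more per task than Claude Opus 4.5 Thinking at 64k.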

Congrats to Google on Gemini 3! Looks like a great model.

— Sam Altman (@sama) November 18, 2025

* Gemini 3 Pro is also available in the latest version of Android Studio (including for free use), in the Firebase AI Logic SDKs (Blaze plan; the reasoning level cannot yet be adjusted), and in the Gemini CLI (Ultra subscription and Gemini API keys; coming to Gemini Code Assist Enterprise; waitlist for other users), as well as in various third-party services (Cursor, GitHub, JetBrains, Manus, Replit…).