Google Defends Chrome's Agent-Based Security Model

For the time being, please block all AI-powered browsers to minimize exposure to risk.

A Gartner document published last week makes this recommendation to CISOs.

Google may not have stayed unresponsive to it. A few days later, a security blog post appeared, devoted to agentic browsing in Chrome — being tested since September.

The American group highlights its “hybrid” defense approach that blends deterministic and probabilistic layers. It is accompanied by a link to another post, dated June, focused on prompt injection in Gemini (both in the app and within Google Workspace).

This post already referred to the layered defense approach. Among the techniques listed:

Training Gemini with adversarial data to improve its resilience
Compiling a vulnerability dataset to train classifiers capable of detecting malicious instructions
Adding prompts to remind Gemini to focus on the requested tasks and to ignore potential antagonistic instructions
Detection and masking of suspicious URLs based on Safe Browsing technology
Requesting user confirmation for certain actions and providing information when an attack is blocked

Paraphrase, spotlighting… Strategies for ignoring problematic content

In its post about agentic browsing in Chrome, Google also refers to its “security principles for agents.” Synthesized in a document published in the spring, they are detailed further in a white paper on Gemini 2.5 security, published in parallel. Among the defensive strategies presented there, beyond the ones mentioned above, there is paraphrasing, which involves having incoming data rewritten by a separate AI to invalidate problematic instructions.
There is also spotlighting. This prompt engineering technique involves inserting markers in the input data to allow the model to distinguish blocks of unreliable tokens. Microsoft dedicated an article to it last year. It outlines three cumulative variants called delimitation, datamarking, and encoding. The first marks the location of an input in the prompt by attaching one or more special tokens. The second goes further by weaving a special token across the entire text. The third makes the delimitation even more evident by transforming the text with an algorithm like base64 or ROT13.

A flow-extraction technique from DeepMind

In Chrome’s agentic security model, the spotlighting helps power what Google calls the user alignment critic. The aim: ensure the model prefers following user and system instructions over those derived from external content. The architecture adopted here draws on the dual-LLM pattern (in addition to the main model, a planner, a secondary model steps in whenever working with unreliable content). It also feeds on a technique that DeepMind has named CaMeL. Its principle, broadly speaking: explicitly extract the control and data flows from reliable requests, so that unreliable data retrieved by the LLM can never affect the program’s run-time flow.

The secondary model implemented in Chrome checks each action after planning and determines whether it serves the declared objective. It only receives metadata, which prevents it from being poisoned by external content. If an action is rejected, the primary model receives feedback to reformulate its plan. If rejections pile up, control can be handed over to the user.

Static defenses like spotlighting have their limits in the face of adaptive attacks. In this context, it is necessary to harden the primary model by leveraging its reasoning capabilities. This touches on the other component of the “hybrid” approach. In this domain, besides the elements already discussed, one can develop models tasked with analyzing agents’ action plans and predicting the risk of undesirable outcomes.

Agent-adapted same-origin rules

Some fundamental building blocks of Chrome’s security model are carried over into agentic systems. Site isolation is part of it (pages tied to different sites are always kept in separate processes, each running in its own sandbox). The same-origin policy rules apply as well. They limit how documents and scripts from a given origin can interact with resources from another origin. For example, by blocking JavaScript from accessing a document inside an iframe or by preventing cross-origin binary data access from an image. Adapted to agents, they allow access only to data whose origin has a connection to the task at hand or that the user has explicitly shared.

For each task, a portilling function decides which origins are relevant. They are then split into two sets, tracked for each session. On one side, read-only origins (Gemini can consume their content). On the other, read-write origins (Gemini can perform actions, such as clicking and typing). If an iframe’s origin is not on the list of relevant items, the model does not see its content. This also applies to content retrieved from tool calls.

As with the user alignment critic, the portilling functions are not exposed to external content.
Finding the right balance on the first try is difficult, Google admits. In this sense, the mechanism currently implemented only follows the read-write set.

The Chrome bug bounty program clarified for agentic use

When navigating to certain sensitive sites (controlled via a list), the agent asks the user for confirmation. The same applies to signing into an account from the Google Password Manager. More broadly, whenever the model deems a task to be sensitive, it can request permission or hand control to the user.

Google used this to update Chrome’s bug bounty guidelines. It clarifies the agentic vulnerabilities that can qualify for a reward.

The highest reward (20,000 USD) applies to attacks that modify the state of accounts or data. For example, an indirect prompt injection enabling a payment or account deletion without user confirmation. This amount is awarded only in cases of strong impact, reproducibility across many sites, success on at least half of attempts, and no close linkage to the user prompt.

The maximum reward is set at 10,000 USD for attacks that could lead to exfiltration of sensitive data. And at 3,000 USD for those that would bypass agentic security elements.

Google Defends Chrome’s Agent-Based Security Model

Paraphrase, spotlighting… Strategies for ignoring problematic content

A flow-extraction technique from DeepMind

Agent-adapted same-origin rules

The Chrome bug bounty program clarified for agentic use