Google introduces Gemini 2.5 Computer Use model to automate web and mobile interfaces

Google has announced the Gemini 2.5 Computer Use model, a specialized AI model built on Gemini 2.5 Pro that can interact directly with user interfaces. Instead of relying only on structured APIs, the model lets agents perform tasks inside web browsers and apps the way humans do: by clicking, typing, scrolling, and filling out forms. Developers can try it now in preview through the Gemini API in Google AI Studio and on Vertex AI.

The model is designed for tasks that still require manual interaction, like form submissions, dropdowns, or navigating behind logins. It works in a loop where the model receives a screenshot of the environment and a history of actions, then decides on the next step such as clicking a button. After each step, the model gets an updated screenshot and continues until the task is done, interrupted, or flagged by safety controls. It is tuned for browsers but also shows promise for mobile use, though desktop-level control is not a focus yet.
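For readers who want to picture that loop, here is a minimal Python sketch. It is not Google's client code: `Action`, `AgentState`, `run_agent_loop`, and the callables passed in are all hypothetical names, and the step that asks the model for its next move is left as a plain function the caller supplies.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One UI step proposed by the model (hypothetical shape)."""
    kind: str          # e.g. "click", "type", "scroll", or "done"
    target: str = ""   # e.g. a button label or form field name
    text: str = ""     # text to type, if any


@dataclass
class AgentState:
    """What the loop tracks between steps."""
    history: List[Action] = field(default_factory=list)
    status: str = "running"  # ends as "done", "blocked", or "step_limit"


def run_agent_loop(
    task: str,
    take_screenshot: Callable[[], bytes],
    propose_action: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    is_allowed: Callable[[Action], bool] = lambda action: True,
    max_steps: int = 20,
) -> AgentState:
    """Screenshot -> proposed action -> execute, repeated until a stop condition."""
    state = AgentState()
    for _ in range(max_steps):
        # The model sees the current screen plus everything done so far.
        screenshot = take_screenshot()
        action = propose_action(task, screenshot, state.history)

        if action.kind == "done":
            state.status = "done"
            return state
        if not is_allowed(action):
            # Stand-in for an action being flagged by safety controls.
            state.status = "blocked"
            return state

        execute(action)
        state.history.append(action)

    # Assumption: a step cap keeps an unattended agent from running forever.
    state.status = "step_limit"
    return state
```

In practice `propose_action` would wrap a call to the model and `execute` would drive a browser; both are stubbed out here so the control flow stays visible.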

Safety has been a major concern for Google. The system has built-in protections to stop misuse, unexpected behavior, or malicious instructions from slipping through. Each action can be checked by an external safety service, and developers can require user confirmation for risky actions like purchases. Google is urging developers to test carefully before deploying agents that run unattended.
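The per-action check and the confirmation step can be sketched roughly as follows. This is an illustration only: `Verdict`, `check_action`, `gate_action`, and the keyword list are invented for this example, and a real safety service would be far more sophisticated than a keyword match.

```python
import re
from enum import Enum
from typing import Callable


class Verdict(Enum):
    ALLOW = "allow"
    REQUIRE_CONFIRMATION = "require_confirmation"
    BLOCK = "block"


# Illustrative only: words that suggest a risky, hard-to-undo action.
RISKY_WORDS = {"buy", "purchase", "pay", "checkout", "delete"}


def check_action(description: str) -> Verdict:
    """Hypothetical stand-in for an external safety service."""
    text = description.lower()
    if "ignore previous instructions" in text:
        # Crude example of catching an injected instruction.
        return Verdict.BLOCK
    if RISKY_WORDS & set(re.findall(r"[a-z]+", text)):
        return Verdict.REQUIRE_CONFIRMATION
    return Verdict.ALLOW


def gate_action(
    description: str,
    run: Callable[[], None],
    confirm: Callable[[str], bool],
) -> bool:
    """Run the action only if it passes the check and, when required, the user agrees."""
    verdict = check_action(description)
    if verdict is Verdict.BLOCK:
        return False
    if verdict is Verdict.REQUIRE_CONFIRMATION and not confirm(description):
        return False
    run()
    return True
```

Here `confirm` could be a prompt shown to the user; an agent running unattended would have to decide in advance what happens when nobody is there to answer it.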

Early testers are already putting the model to work. Google has deployed it internally for projects like Project Mariner and Firebase Testing Agent. Outside companies are using it for automation, assistants, and UI testing. Some report faster and more accurate results compared to competitors, with one noting an 18 percent performance gain on complex evaluations. The model is available now in public preview, with demos on Browserbase and full documentation for developers.

Written by

Brian Fagioli

Brian Fagioli is a technology journalist and founder of NERDS.xyz. A former BetaNews writer, he has spent over a decade covering Linux, hardware, software, cybersecurity, and AI with a no-nonsense approach for real nerds.