
No More Endless Forms: Gemini's Latest Model Can Use Your Computer; Scroll And Type For You

Think of an AI that can control your mouse and keyboard and scroll through your PC. Google Gemini 2.5 Computer Use promises just that.

The new Gemini model can control your personal computer. (Photo: Unsplash)

Google may have just cracked the next step of AI evolution by announcing the Gemini 2.5 'Computer Use' model, which promises to take the application of artificial intelligence to a whole new level.

The Gemini 2.5 Computer Use model is not designed to reply to prompts or to turn your poorly taken selfies into low-light portrait images. Rather, it is designed to control your personal computer and complete basic front-end tasks.

While chatbots like ChatGPT, Gemini and Grok have evolved greatly to handle many backend tasks, the Gemini 2.5 Computer Use model promises to control your mouse and keyboard in order to complete some front-end tasks.

The current generation of AI tools is great for editing or generating pictures, writing letters and resumes, or even doing calculations. For more specific roles, these tools rely on Application Programming Interfaces (APIs).

However, they still cannot carry out more complex tasks that require identifying visual patterns on a screen, something the current generation of models still does not fully understand.

Gemini 2.5 Computer Use aims to close this gap by actually performing visible actions on the screen. Think of it as a personal assistant that can use your mouse and keyboard. It can fill out your forms, navigate login screens, scroll through your browser and essentially understand everything that is being shown on the screen.

Potential use-cases can go beyond just these. Think of an AI that can submit your documents, book an appointment and solve real-world workflow problems.

Google has said the model delivers higher efficiency along with lower latency. It has also shown a lot of promise on mobile devices.

For now, the model is available in preview via the API in Google AI Studio and Vertex AI.
