google-gemini/computer-use-preview

computer-use-preview

This repository provides a preview implementation of computer use capabilities with Google Gemini, allowing AI models to interact with computer interfaces through automated actions.

Builder

Google Gemini

google-gemini • big-tech

Stars

2,887

Using upstream star count

Forks

371

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

May 6, 2025

Project creation date

README Summary

This repository provides a preview implementation of computer use capabilities with Google Gemini, allowing AI models to interact with computer interfaces through automated actions. It includes tools and examples for enabling AI agents to perform tasks like clicking, typing, and navigating desktop applications. The implementation demonstrates how to integrate Gemini's multimodal capabilities with computer automation frameworks.
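The summary describes a loop in which a model proposes UI actions (clicking, typing, navigating) and an automation layer carries them out. A minimal sketch of that dispatch loop follows. It does not use the repository's code or the Gemini API: `Action`, `FakeComputer`, `run_agent`, and `make_scripted_model` are all hypothetical names invented for illustration, the "model" is a scripted stand-in, and the action log stands in for the screenshots a real multimodal model would receive.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Action:
    """One UI action proposed by the model (hypothetical schema, not the repo's)."""
    name: str
    args: Dict[str, object] = field(default_factory=dict)


class FakeComputer:
    """Stand-in for a real automation backend; it only records requested actions."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x}, {y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")

    def navigate(self, url: str) -> None:
        self.log.append(f"navigate({url})")


def run_agent(
    propose: Callable[[List[str]], List[Action]],
    computer: FakeComputer,
    max_turns: int = 5,
) -> List[str]:
    """Ask the model for actions each turn and dispatch them until it returns none."""
    handlers: Dict[str, Callable[..., None]] = {
        "click": computer.click,
        "type_text": computer.type_text,
        "navigate": computer.navigate,
    }
    for _ in range(max_turns):
        # A real agent would pass a fresh screenshot here; the log is a placeholder.
        actions = propose(computer.log)
        if not actions:
            break
        for action in actions:
            handler = handlers.get(action.name)
            if handler is None:
                raise ValueError(f"unsupported action: {action.name}")
            handler(**action.args)
    return computer.log


def make_scripted_model(script: List[List[Action]]) -> Callable[[List[str]], List[Action]]:
    """Build a fake model that replays a fixed list of turns, then returns no actions."""
    turns = iter(script)

    def propose(history: List[str]) -> List[Action]:
        return next(turns, [])

    return propose


if __name__ == "__main__":
    model = make_scripted_model([
        [Action("navigate", {"url": "https://example.com"})],
        [Action("click", {"x": 120, "y": 240}), Action("type_text", {"text": "hello"})],
    ])
    for entry in run_agent(model, FakeComputer()):
        print(entry)
```

Keeping the action names in a handler table means an unknown action fails loudly instead of being silently ignored, and `max_turns` bounds the loop if the model never stops proposing actions.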

AI Dev Skills

Unmapped

Computer Vision, Multimodal AI, GUI Automation, Vision-Language Models, Robotic Process Automation, Agent-based Systems

Tags

Computer Vision, Multimodal AI, GUI Automation, Vision-Language Models, Robotic Process Automation, Agent-based Systems, AI Automation, Desktop Applications, Text, Multimodal, Multimodal Reasoning, GUI Testing, Desktop Automation, Computer Interface Control, Image, Agentic AI, Self-hosted, Process Automation, Python

Taxonomy

Recent Activity

Updated 1 month ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality
low
Maturity
research

Categories

Multimodal AI (Primary), Other AI / ML, Foundation Models, AI Agents, Computer Vision, Robotics, Dev Tools & Automation

PM Skills

Developer Platform

Languages

Python 100.0%

Timeline

Project created
May 6, 2025
Forked
Mar 13, 2026
Your last push
1 month ago
Upstream last push
7 days ago
Tracked since
Feb 18, 2026
