microsoft/OmniParser
OmniParser
A simple screen parsing tool towards pure vision based GUI agent
Builder

Microsoft
microsoft • big-tech
Stars
24,596
Using upstream star count
Forks
2,157
Using upstream fork count
Open Issues
0
Activity Score
0/100
0 commits in 30d
Created
Sep 20, 2024
Project creation date
README Summary
OmniParser is a screen parsing tool designed to enable pure vision-based GUI automation agents. It processes screenshots to identify and extract interactive elements, text, and UI components without requiring access to underlying code or accessibility APIs. The tool serves as a foundation for building automated agents that can interact with graphical user interfaces using only visual information.
AI Dev Skills
Unmapped
Tags
Taxonomy
Deployment Context
Modalities
Skill Areas
Recent Activity
Updated 7 months ago
7 Days
0
30 Days
0
90 Days
0
Quality
- Quality
- medium
- Maturity
- research
Categories
PM Skills
Languages
Timeline
- Project created
- Sep 20, 2024
- Forked
- Mar 13, 2026
- Your last push
- 7 months ago
- Upstream last push
- 7 months ago
- Tracked since
- Sep 12, 2025