An MCP server for automatic on-screen GUI operations, enabling AI models to control GUI elements via MCP.
An MCP server that executes commands like keyboard input and mouse movement, facilitating desktop automation via MCP.
An MCP server paired with a browser extension that allows LLMs to control Firefox browsers.
MCP servers for automating and controlling the user's browser using existing browser profiles, ideal for tasks requiring logged-in sessions and avoiding bot detection. Includes implementations by browsermcp and modelcontext.
An MCP server that enables desktop automation, including mouse control, keyboard input, screenshots, OCR, and window management, allowing LLMs to interact with graphical user interfaces.
A general-purpose MCP Server that can execute any command with run_command and run_script tools, demonstrating the flexible server capabilities under the MCP standard.
MCP server to control HarmonyOS-next devices with AI, supporting device control and UI automation.
Category: Code Execution Automation MCP Servers
Tags: automation, gui, ai-integration, mcp, open-source
omniparser-autogui-mcp is an open-source MCP server that automatically operates on-screen GUI elements. It uses OmniParser to analyze the screen and lets AI models control GUI elements programmatically through the MCP interface. Operation has been confirmed mainly on Windows.
OMNI_PARSER_BACKEND_LOAD: Specify for compatibility with certain clients.
TARGET_WINDOW_NAME: Target specific window for GUI operations.
OMNI_PARSER_SERVER: Offload OmniParser processing to another device.
SSE_HOST, SSE_PORT: Enable communication via SSE instead of stdio.
SOM_MODEL_PATH, CAPTION_MODEL_NAME, CAPTION_MODEL_PATH, OMNI_PARSER_DEVICE, BOX_TRESHOLD: Advanced OmniParser configuration.
git clone --recursive https://github.com/NON906/omniparser-autogui-mcp.git
uv for syncing dependencies.
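The settings listed above are environment variables, so one way to prepare them is to build an environment mapping before launching the server process. The following is a minimal sketch, not the project's own tooling: the helper name build_server_env, the example window title, the host, and the port are all hypothetical, and the rule that SSE_HOST and SSE_PORT are set as a pair is an assumption based on the two being listed together.

```python
import os


def build_server_env(target_window=None, sse_host=None, sse_port=None, base_env=None):
    """Return a copy of the environment with optional server settings applied.

    Hypothetical helper for illustration; only the variable names come from
    the listing above, the values are placeholders.
    """
    env = dict(os.environ if base_env is None else base_env)

    if target_window is not None:
        # Restrict GUI operations to one window.
        env["TARGET_WINDOW_NAME"] = target_window

    # Assumption: SSE_HOST and SSE_PORT are only meaningful when set together.
    if (sse_host is None) != (sse_port is None):
        raise ValueError("set SSE_HOST and SSE_PORT together")
    if sse_host is not None:
        env["SSE_HOST"] = sse_host
        env["SSE_PORT"] = str(sse_port)  # environment values must be strings

    return env


# Example with placeholder values and an empty base environment.
env = build_server_env(
    target_window="Untitled - Notepad",
    sse_host="127.0.0.1",
    sse_port=8000,
    base_env={},
)
```

The resulting mapping can be passed as the env argument of subprocess.Popen, or copied into the env section of an MCP client configuration; the actual launch command is not given in this listing and is left out here.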