Open Source App Automation for Claude

Test Any App Like Magic

The ghost hand that automates your apps on macOS, Windows, and Linux. Click buttons, type text, navigate UI elements, and capture screenshots with natural language through Claude Code or Claude Desktop.

 brew install --cask geisterhand-io/tap/geisterhand


    Terminal 
 $ geisterhand run Calculator
 {"port":49152,"pid":12345,"app":"Calculator","host":"127.0.0.1"}

 You: Click 7, then +, then 3, then =. What's the result?

 Claude: The result is 10. I used the accessibility API to find and press buttons by their labels, then captured a screenshot to verify. 

How It Works

Get started in three simple steps. No complex setup, no learning curve.

Install Geisterhand

One command on macOS or Linux, one download on Windows. Sets up everything you need.

brew install --cask geisterhand-io/tap/geisterhand

Grant Permissions

Launch the app once and grant Accessibility and Screen Recording permissions when prompted.

Run

Start automating any app. The server launches scoped to your target app and outputs connection details as JSON.

geisterhand run YourApp

Built for Developers

Everything you need to automate app testing, nothing you don't.

Zero Configuration

No test scripts to write. No selectors to maintain. Just describe what you want to test in plain English and Geisterhand figures out the rest.

Semantic UI Control

Find and interact with UI elements by role, title, or label using native accessibility APIs. More reliable than coordinate-based clicking.

Lightning Screenshots

Capture full-resolution screenshots of any display using native screen capture APIs. Perfect for visual regression testing and documentation.

Claude Code & Desktop

Works with both Claude Code (CLI) and Claude Desktop via MCP. One-click integration from the menu bar app.

HTTP API & CLI

Full REST API on localhost:7676 plus a command-line tool. Integrate with any language, framework, or automation pipeline.

Native Performance

Built with Swift (macOS), .NET (Windows), and Rust (Linux) for maximum performance and minimal resource usage. No Electron, no Node.js, no overhead.

Background Automation

Send input to apps running in the background using PID-targeted events. No need to bring apps to the foreground — perfect for non-intrusive testing.

Menu Control

Discover and trigger application menu items programmatically. Navigate menus without keyboard shortcuts, even in background apps.

See It In Action

Watch Geisterhand automate a complete testing workflow in under 2 minutes.

Demo Coming Soon

2:00

Quick Demo

Real App

Real Desktop Testing

No Edits

Raw Recording

Quick Start

Get up and running in under a minute.

Install Geisterhand

Terminal

 brew install --cask geisterhand-io/tap/geisterhand

This installs the menu bar app and the geisterhand CLI. Or download the DMG directly.

Grant Permissions

Launch the app once to trigger the permission prompts, then grant both permissions in System Settings:

Accessibility

For keyboard and mouse control

Screen Recording

For screenshot capture

Start automating with `geisterhand run`

Launch a scoped automation server for any app:

Terminal

 $ geisterhand run Calculator
 {"port":49152,"pid":12345,"app":"Calculator","host":"127.0.0.1"}

The server auto-selects a free port, scopes all API requests to the target app, and exits when the app quits. Pass an app name, path, or identifier:

 geisterhand run Safari
 geisterhand run /Applications/Xcode.app
 geisterhand run com.apple.TextEdit
 geisterhand run Calculator --port 7676 # pin a specific port

Then send API requests to the host and port from the JSON output.
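If a script launches the server, it needs to turn that JSON line into a base URL. A minimal Python sketch, assuming the line has the same fields as the sample output above; the helper name `base_url_from_run_output` is hypothetical and not part of Geisterhand:

```python
import json


def base_url_from_run_output(line: str) -> str:
    """Build the server's base URL from the JSON line printed by `geisterhand run`."""
    info = json.loads(line)
    return f"http://{info['host']}:{info['port']}"


# Sample line copied from the output shown above.
sample = '{"port":49152,"pid":12345,"app":"Calculator","host":"127.0.0.1"}'
print(base_url_from_run_output(sample))
```

Reading the port from the output rather than hard-coding it matters because the server picks a free port on each run unless `--port` is given.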

Try it out

With the server running, send requests directly:

 # See what's on screen (the URL is quoted so the shell does not treat ? as a glob)
 curl "http://127.0.0.1:49152/accessibility/tree?format=compact"

 # Click a button (use "push button" on Linux instead of "AXButton")
 curl -X POST http://127.0.0.1:49152/click/element \
  -H "Content-Type: application/json" \
  -d '{"title": "7", "role": "AXButton"}'

 # Take a screenshot
 curl http://127.0.0.1:49152/screenshot --output screen.png
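The same click can be issued from code. Below is a minimal Python sketch that uses only the standard library and mirrors the `/click/element` curl call above. The helper name `click_element_request` is hypothetical, and the request is built but not sent, so the snippet runs without a server:

```python
import json
import urllib.request


def click_element_request(base_url: str, title: str, role: str) -> urllib.request.Request:
    """Prepare a POST /click/element request with a JSON body."""
    body = json.dumps({"title": title, "role": role}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/click/element",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = click_element_request("http://127.0.0.1:49152", "7", "AXButton")
print(req.get_method(), req.full_url)

# To send it, a server started with `geisterhand run` must be listening:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

The response format is not described here, so the commented-out lines only print the raw body.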
 

Or let an LLM drive it. Add the testing guide to your project's CLAUDE.md and ask Claude to test your app:


"Open Calculator, click 7, then +, then 3, then =. What's the result?"

Optional: MCP Integration for Claude

For a deeper integration, add Geisterhand as an MCP server so Claude can call the API natively:

Terminal

 claude mcp add-json geisterhand \
  '{"type":"stdio","command":"npx","args":["geisterhand-mcp"]}' \
  --scope user

Requires Node.js 18+. Restart Claude after adding.

API Reference

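Start with geisterhand run YourApp and send requests to the host and port from the JSON output. POST bodies are JSON with snake_case field names (click_count, timeout_ms); GET query parameters use camelCase (maxDepth, titleContains), as listed in the table below.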

Endpoints

Method	Endpoint	Description	Parameters
GET	/status	System info, permissions, frontmost app, screen size	—
GET	/health	Health check	—
GET	/screenshot	Capture screen as base64 PNG	app?, windowId?, format?, display?
POST	/click	Click at screen coordinates	x, y, button?, click_count?, modifiers?
POST	/click/element	Click element by semantic properties	title?, role?, label?, pid?, use_accessibility_action?
POST	/type	Type text at cursor position	text, delay_ms?, pid?, path?, role?, title?
POST	/key	Press key with modifiers	key, modifiers?, pid?, path?
POST	/scroll	Scroll at position	x?, y?, delta_x?, delta_y?, pid?, path?
POST	/wait	Wait for UI condition	title?, role?, condition, timeout_ms?, poll_interval_ms?
GET	/menu	Get app menu structure	app
POST	/menu	Trigger menu item	app, path, background?
GET	/accessibility/tree	Get UI element hierarchy	pid?, maxDepth?, format?, includeActions?
GET	/accessibility/elements	Find elements by criteria	role?, title?, titleContains?, labelContains?, valueContains?, pid?, maxResults?
GET	/accessibility/focused	Get focused element	pid?
POST	/accessibility/action	Perform action on element	path, action, value?

Click element by title

 POST /click/element
 {
  "title": "Submit",
  "role": "AXButton",
  "pid": 12345
 }

Keyboard shortcut (Cmd+S)

 POST /key
 {
  "key": "s",
  "modifiers": ["cmd"]
 }

Find buttons by label

 GET /accessibility/elements?role=AXButton&labelContains=Submit

Returns elements with path, frame, and actions
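When building this query in code, it is safer to let a library escape the parameters, because role names can contain spaces (for example "push button" on Linux). A small Python sketch with a hypothetical helper `find_elements_url`; the parameter names come from the endpoint table:

```python
from urllib.parse import urlencode


def find_elements_url(base_url: str, **criteria: str) -> str:
    """Build a GET /accessibility/elements URL with escaped query parameters."""
    return f"{base_url}/accessibility/elements?{urlencode(criteria)}"


url = find_elements_url("http://127.0.0.1:49152", role="AXButton", labelContains="Submit")
print(url)

# A role containing a space is escaped automatically.
linux_url = find_elements_url("http://127.0.0.1:49152", role="push button")
print(linux_url)
```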

Press a button by path

 POST /accessibility/action
 {
  "path": {
   "pid": 12345,
   "path": [0, 0, 2, 5]
  },
  "action": "press"
 }
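The nested "path" object is easy to get wrong by hand, so a small builder can help. This Python sketch uses a hypothetical `action_body` helper and assumes the optional value field is a string, which this page does not confirm:

```python
import json
from typing import List, Optional


def action_body(pid: int, path: List[int], action: str, value: Optional[str] = None) -> str:
    """Serialize a POST /accessibility/action body for the element at `path`."""
    body = {"path": {"pid": pid, "path": path}, "action": action}
    if value is not None:
        # `value` is optional in the endpoint table; a string is assumed here.
        body["value"] = value
    return json.dumps(body)


press = action_body(12345, [0, 0, 2, 5], "press")
set_value = action_body(12345, [0, 0, 2, 5], "setValue", value="hello")
print(press)
print(set_value)
```

The element path would normally come from an /accessibility/elements or /accessibility/tree response rather than being written by hand.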

Special Keys

return, tab, space, escape, delete, up, down, left, right, home, end, pageup, pagedown, f1-f12

Modifiers

cmd, ctrl, alt, shift, fn, super
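A client can check key names and modifiers against these two lists before sending a /key request. This Python sketch is a client-side convenience, not Geisterhand's own validation. It assumes any single character is a valid key (as in the Cmd+S example) and that longer names must appear in the Special Keys list; the helper `key_body` is hypothetical:

```python
import json
from typing import Sequence

# Names copied from the Special Keys and Modifiers lists above.
SPECIAL_KEYS = {
    "return", "tab", "space", "escape", "delete", "up", "down", "left",
    "right", "home", "end", "pageup", "pagedown",
} | {f"f{n}" for n in range(1, 13)}
MODIFIERS = {"cmd", "ctrl", "alt", "shift", "fn", "super"}


def key_body(key: str, modifiers: Sequence[str] = ()) -> str:
    """Serialize a POST /key body, rejecting names not listed above."""
    unknown = [m for m in modifiers if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifiers: {unknown}")
    # Assumption: single characters are plain keys; longer names must be special keys.
    if len(key) != 1 and key not in SPECIAL_KEYS:
        raise ValueError(f"unknown key: {key!r}")
    body = {"key": key}
    if modifiers:
        body["modifiers"] = list(modifiers)
    return json.dumps(body)


save_shortcut = key_body("s", ["cmd"])
print(save_shortcut)
```

Failing early on a typo such as "command" instead of "cmd" is usually easier to debug than an error returned by the server.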

Common Roles

AXButton, AXTextField, AXTextArea, AXCheckBox, AXPopUpButton, AXMenuItem, AXStaticText, AXLink

Actions

press, setValue, focus, confirm, cancel, increment, decrement, showMenu, pick

Recommended Workflow

Launch

geisterhand run YourApp to start the scoped server

Inspect

GET /accessibility/tree?format=compact to see the UI

Interact

Click elements, type text, press keys, trigger menus

Verify

GET /screenshot or read element values to confirm

Wait

POST /wait for UI changes instead of sleep

Prefer /click/element over coordinate-based /click. Semantic selectors are more reliable than screen positions.

Linux note: AT-SPI2 uses different role names — push button, text, label instead of macOS AXButton, AXTextField, AXStaticText.
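For test code that targets both platforms, the role names can be translated in one place. A minimal Python sketch covering only the three pairs named in the note above; the helper `role_for` and the platform strings "macos" and "linux" are hypothetical, and other roles are rejected rather than guessed:

```python
# macOS role -> Linux (AT-SPI2) role, limited to the pairs listed in the note above.
MACOS_TO_LINUX_ROLE = {
    "AXButton": "push button",
    "AXTextField": "text",
    "AXStaticText": "label",
}


def role_for(platform: str, macos_role: str) -> str:
    """Return the role name to send for `platform`, given a macOS role name."""
    if platform == "macos":
        return macos_role
    if platform == "linux":
        if macos_role not in MACOS_TO_LINUX_ROLE:
            raise KeyError(f"no known Linux equivalent for {macos_role!r}")
        return MACOS_TO_LINUX_ROLE[macos_role]
    raise ValueError(f"unsupported platform: {platform!r}")


print(role_for("linux", "AXButton"))
print(role_for("macos", "AXButton"))
```

Windows role names are not listed on this page, so the sketch leaves that platform out.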

macOS Testing Guide

Full reference with recipes, best practices, and a CLAUDE.md snippet for your project

macOS API Documentation

Complete endpoint reference with curl examples

Windows Documentation

Setup guide and API reference for Geisterhand on Windows

Linux Documentation

Setup guide and API reference for Geisterhand on Linux

Use Cases

From quick smoke tests to comprehensive automation, Geisterhand adapts to your workflow.

QA Automation

Automate repetitive QA tasks without writing test scripts. Claude uses the Accessibility API to find and interact with UI elements by their semantic properties.

Test form validation · Verify button states · Fill and submit forms

Regression Testing

Catch UI regressions before your users do. Geisterhand captures high-resolution screenshots for visual comparison across builds.

Compare screenshots · Detect layout shifts · Verify theming

Workflow Automation

Automate multi-step workflows across applications. Open apps via Spotlight, navigate menus, fill forms, and save files to specific locations.

Cross-app workflows · Data entry tasks · Batch operations

Accessibility Audits

Explore your app's UI tree and verify accessibility labels. Find elements by role, check focus order, and test keyboard navigation.

Verify AX labels · Test keyboard nav · Check focus order

Ready to automate your app testing?

