Quick start

The fastest way to use deskctl is to follow the same four-step loop: observe, wait, act, verify.

1. Install and diagnose

npm install -g deskctl
deskctl doctor

Run deskctl doctor first. It checks X11 connectivity, basic enumeration, screenshot viability, and socket health before you start driving the desktop.
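If deskctl is driven from a script, the doctor step can gate everything that follows. The Python sketch below is one way to do that. It assumes that deskctl doctor exits with status 0 when all checks pass, which this page does not state, and preflight is a hypothetical helper name.

```python
import shutil
import subprocess
from typing import Callable, Optional, Sequence


def _exit_status(argv: Sequence[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(list(argv)).returncode


def preflight(
    which: Callable[[str], Optional[str]] = shutil.which,
    run: Callable[[Sequence[str]], int] = _exit_status,
) -> bool:
    """Return True only if deskctl is on PATH and `deskctl doctor` exits with 0."""
    if which("deskctl") is None:
        return False
    # Assumption: a zero exit status means every doctor check passed.
    return run(["deskctl", "doctor"]) == 0
```

The which and run parameters exist only so the check can be exercised without a desktop session; in normal use, call preflight() with no arguments.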

2. Observe the desktop

deskctl snapshot --annotate
deskctl list-windows
deskctl get active-window
deskctl get monitors

Use snapshot when you want a screenshot artifact plus window refs. Use list-windows when you only need the current window tree without writing a screenshot.

3. Pick selectors that stay readable

Prefer explicit selectors when you need deterministic targeting:

ref=w1
id=win1
title=Chromium
class=chromium
focused

Legacy refs such as @w1 still work after snapshot or list-windows. Bare strings like chromium are fuzzy matches and now fail when they match more than one window.
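The selector forms above can be told apart mechanically, which helps when a script wants to reject fuzzy selectors before calling deskctl. The Python sketch below is illustrative only: classify_selector is a hypothetical helper, and it mirrors the forms listed in this section rather than deskctl's actual parsing rules.

```python
from typing import Tuple

# Keys taken from the explicit selector forms listed in this section.
EXPLICIT_KEYS = {"ref", "id", "title", "class"}


def classify_selector(selector: str) -> Tuple[str, str]:
    """Return (kind, value) for a selector string.

    Hypothetical helper; it does not reproduce deskctl's real parser.
    """
    if selector == "focused":
        return ("focused", "")
    if selector.startswith("@") and len(selector) > 1:
        # Legacy ref form such as @w1
        return ("legacy_ref", selector[1:])
    key, sep, value = selector.partition("=")
    if sep and key in EXPLICIT_KEYS and value:
        return (key, value)
    # Anything else is treated as a bare string, i.e. a fuzzy match.
    return ("fuzzy", selector)


for example in ["ref=w1", "id=win1", "title=Chromium", "class=chromium",
                "focused", "@w1", "chromium"]:
    print(example, "->", classify_selector(example))
```

A script that needs deterministic targeting could refuse any selector whose kind comes back as fuzzy.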

4. Wait, act, verify

The core loop is:

# observe
deskctl snapshot --annotate

# wait
deskctl wait window --selector 'title=Chromium' --timeout 10

# act
deskctl focus 'title=Chromium'
deskctl hotkey ctrl l
deskctl type "https://example.com"
deskctl press enter

# verify
deskctl wait focus --selector 'title=Chromium' --timeout 5
deskctl snapshot

The wait commands return the matched window payload on success, so they compose cleanly into the next action.
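The same loop can be scripted. The Python sketch below replays the commands from this step in order and stops at the first failure. It assumes that deskctl exits with a non-zero status when a step fails (for example, when a wait times out), which this page does not state. core_loop and run_checked are hypothetical names.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], object]


def run_checked(argv: Sequence[str]) -> None:
    """Run one command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(list(argv), check=True)


def core_loop(selector: str, url: str, run: Runner = run_checked) -> List[List[str]]:
    """Observe, wait, act, verify, using the commands shown in this step."""
    steps = [
        # observe
        ["deskctl", "snapshot", "--annotate"],
        # wait
        ["deskctl", "wait", "window", "--selector", selector, "--timeout", "10"],
        # act
        ["deskctl", "focus", selector],
        ["deskctl", "hotkey", "ctrl", "l"],
        ["deskctl", "type", url],
        ["deskctl", "press", "enter"],
        # verify
        ["deskctl", "wait", "focus", "--selector", selector, "--timeout", "5"],
        ["deskctl", "snapshot"],
    ]
    for argv in steps:
        run(argv)
    return steps


# Dry run: print each command instead of executing it.
core_loop("title=Chromium", "https://example.com", run=print)
```

Dropping the run=print argument executes the commands for real, which requires deskctl to be installed and a desktop session to be available.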

5. Use --json when parsing matters

Every command supports --json and uses the same top-level envelope:

{
  "success": true,
  "data": {
    "screenshot": "/tmp/deskctl-1234567890.png",
    "windows": [
      {
        "ref_id": "w1",
        "window_id": "win1",
        "title": "Chromium",
        "app_name": "chromium",
        "x": 0,
        "y": 0,
        "width": 1920,
        "height": 1080,
        "focused": true,
        "minimized": false
      }
    ]
  }
}

Use window_id for stable targeting inside a live daemon session. The plain-text output is intentionally compact and meant for reading; JSON is the parsing contract.
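As a sketch of consuming that envelope, the Python below parses the sample shown above and pulls out the window_id of the focused window. parse_envelope and focused_window_id are hypothetical helpers. The shape of a failure envelope is not shown on this page, so the code reports it verbatim instead of assuming any error fields.

```python
import json
from typing import Any, Dict, Optional

# The sample envelope from this section, reproduced as a string.
SAMPLE = """
{
  "success": true,
  "data": {
    "screenshot": "/tmp/deskctl-1234567890.png",
    "windows": [
      {
        "ref_id": "w1",
        "window_id": "win1",
        "title": "Chromium",
        "app_name": "chromium",
        "x": 0, "y": 0, "width": 1920, "height": 1080,
        "focused": true,
        "minimized": false
      }
    ]
  }
}
"""


def parse_envelope(text: str) -> Dict[str, Any]:
    """Return the data payload, or raise if success is not true."""
    envelope = json.loads(text)
    if envelope.get("success") is not True:
        # Failure shape is not documented here, so report the envelope as-is.
        raise RuntimeError(f"deskctl reported failure: {envelope!r}")
    return envelope["data"]


def focused_window_id(data: Dict[str, Any]) -> Optional[str]:
    """Return the window_id of the first focused window, if any."""
    for window in data.get("windows", []):
        if window.get("focused"):
            return window["window_id"]
    return None


data = parse_envelope(SAMPLE)
print(focused_window_id(data))  # prints win1 for the sample above
```

In a real script the text would come from the standard output of a deskctl command run with --json. The returned value can then presumably be used in the id= selector form shown in step 3.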

The full stable-vs-best-effort contract lives on the runtime contract page.