Quick start
The fastest way to use deskctl is to follow the same four-step loop: observe, wait, act, verify.
1. Install and diagnose
npm install -g deskctl
deskctl doctor
Run deskctl doctor first. It checks X11 connectivity, basic enumeration,
screenshot viability, and socket health before you start driving the desktop.
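As a sketch, the diagnostic step can gate a script so that automation never starts against a broken session. This assumes deskctl doctor exits non-zero when a check fails, which this page does not state. preflight is a hypothetical helper name.

```shell
# Sketch: stop early if the diagnostic step fails.
# ASSUMPTION: `deskctl doctor` exits non-zero when a check fails;
# the quick start does not document its exit codes.
# `preflight` is a hypothetical helper name, not part of deskctl.
preflight() {
  # Confirm the CLI is reachable before asking it to diagnose anything.
  if ! command -v deskctl >/dev/null 2>&1; then
    echo "deskctl not found on PATH; run: npm install -g deskctl" >&2
    return 127
  fi
  # Run the checks; treat any failure as a reason to stop.
  if ! deskctl doctor; then
    echo "deskctl doctor reported a problem; fix it before automating" >&2
    return 1
  fi
}

# Usage: preflight || exit 1
```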
2. Observe the desktop
deskctl snapshot --annotate
deskctl list-windows
deskctl get active-window
deskctl get monitors
Use snapshot when you want a screenshot artifact plus window refs. Use
list-windows when you only need the current window tree without writing a
screenshot.
3. Pick selectors that stay readable
Prefer explicit selectors when you need deterministic targeting:
ref=w1
id=win1
title=Chromium
class=chromium
focused
Legacy refs such as @w1 still work after snapshot or list-windows. Bare
strings like chromium are fuzzy matches and now fail when they match more
than one window.
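A script that wants deterministic targeting could reject bare strings before they ever reach deskctl. The check below is a minimal sketch: it recognizes only the explicit forms listed above, which may not be the complete set, and is_explicit_selector is a hypothetical helper name.

```shell
# Sketch: accept only the explicit selector forms listed in this quick start.
# ASSUMPTION: the list (ref=, id=, title=, class=, focused) may be incomplete;
# `is_explicit_selector` is a hypothetical helper, not part of deskctl.
is_explicit_selector() {
  case "$1" in
    # key=value forms need a non-empty value; `focused` stands alone.
    ref=?*|id=?*|title=?*|class=?*|focused) return 0 ;;
    # Bare strings (fuzzy) and legacy @refs fall through here.
    *) return 1 ;;
  esac
}

# Usage:
#   is_explicit_selector "$sel" || { echo "ambiguous selector: $sel" >&2; exit 1; }
```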
4. Wait, act, verify
The core loop is:
# observe
deskctl snapshot --annotate
# wait
deskctl wait window --selector 'title=Chromium' --timeout 10
# act
deskctl focus 'title=Chromium'
deskctl hotkey ctrl l
deskctl type "https://example.com"
deskctl press enter
# verify
deskctl wait focus --selector 'title=Chromium' --timeout 5
deskctl snapshot
The wait commands return the matched window payload on success, so they compose cleanly into the next action.
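The loop above can be wrapped so that a failed step stops the sequence instead of typing into the wrong window. This is a sketch that assumes each deskctl command exits non-zero on failure or timeout, which this page does not confirm. open_url is a hypothetical helper name, and the timeouts are the ones used above.

```shell
# Sketch: the wait/act/verify steps as one fail-fast function.
# ASSUMPTION: every deskctl command exits non-zero on failure or timeout.
# `open_url` is a hypothetical helper; $1 is a selector, $2 is the URL to type.
open_url() {
  # wait: make sure the target window exists before acting
  deskctl wait window --selector "$1" --timeout 10 || return 1
  # act: focus the window, jump to the address bar, type, submit
  deskctl focus "$1" || return 1
  deskctl hotkey ctrl l || return 1
  deskctl type "$2" || return 1
  deskctl press enter || return 1
  # verify: the same window should still hold focus afterwards
  deskctl wait focus --selector "$1" --timeout 5 || return 1
}

# Usage: open_url 'title=Chromium' 'https://example.com'
```

Returning at the first failure keeps keystrokes from landing in whichever window happens to be focused when a wait times out.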
5. Use --json when parsing matters
Every command supports --json and uses the same top-level envelope:
{
  "success": true,
  "data": {
    "screenshot": "/tmp/deskctl-1234567890.png",
    "windows": [
      {
        "ref_id": "w1",
        "window_id": "win1",
        "title": "Chromium",
        "app_name": "chromium",
        "x": 0,
        "y": 0,
        "width": 1920,
        "height": 1080,
        "focused": true,
        "minimized": false
      }
    ]
  }
}
Use window_id for stable targeting inside a live daemon session. The
plain-text output is intentionally compact and is not meant to be parsed;
JSON is the parsing contract. The full stable-vs-best-effort contract lives
on the runtime contract page.
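One way to turn the envelope into a selector for the next command is to pull window_id from the focused window. The sketch below assumes jq is installed, that the envelope has the shape shown above, and that the id= selector accepts a window_id value. The id=win1 and "window_id": "win1" examples suggest this last point, but this page does not say so explicitly. focused_window_selector is a hypothetical helper name.

```shell
# Sketch: read a --json envelope on stdin, print an id= selector for the
# focused window, and fail if there is none.
# ASSUMPTIONS: jq is installed; the envelope matches the sample in this
# quick start; `id=` selectors take the window_id value.
# `focused_window_selector` is a hypothetical helper, not part of deskctl.
focused_window_selector() {
  # -r prints the raw string; -e makes jq exit non-zero when nothing is
  # produced (success is false, or no window has focus).
  jq -er 'first(
    select(.success)
    | .data.windows[]
    | select(.focused)
    | "id=\(.window_id)"
  )'
}

# Usage:
#   sel=$(deskctl snapshot --json | focused_window_selector) && deskctl focus "$sel"
```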