---
description: Advanced desktop automation with mouse, keyboard, and screen control
---
# Desktop Control Skill
**The most advanced desktop automation skill for OpenClaw.** Provides pixel-perfect mouse control, lightning-fast keyboard input, screen capture, window management, and clipboard operations.
## 🎯 Features
### Mouse Control
- ✅ **Absolute positioning** - Move to exact coordinates
- ✅ **Relative movement** - Move from current position
- ✅ **Smooth movement** - Natural, human-like mouse paths
- ✅ **Click types** - Left, right, middle, double, triple clicks
- ✅ **Drag & drop** - Drag from point A to point B
- ✅ **Scroll** - Vertical and horizontal scrolling
- ✅ **Position tracking** - Get current mouse coordinates
### Keyboard Control
- ✅ **Text typing** - Fast, accurate text input
- ✅ **Hotkeys** - Execute keyboard shortcuts (Ctrl+C, Win+R, etc.)
- ✅ **Special keys** - Enter, Tab, Escape, Arrow keys, F-keys
- ✅ **Key combinations** - Multi-key press combinations
- ✅ **Hold & release** - Manual key state control
- ✅ **Typing speed** - Configurable WPM (instant to human-like)
### Screen Operations
- ✅ **Screenshot** - Capture entire screen or regions
- ✅ **Image recognition** - Find elements on screen (via OpenCV)
- ✅ **Color detection** - Get pixel colors at coordinates
- ✅ **Multi-monitor** - Support for multiple displays
### Window Management
- ✅ **Window list** - Get all open windows
- ✅ **Activate window** - Bring window to front
- ✅ **Window info** - Get position, size, title
- ✅ **Minimize/Maximize** - Control window states
### Safety Features
- ✅ **Failsafe** - Move mouse to corner to abort
- ✅ **Pause control** - Emergency stop mechanism
- ✅ **Approval mode** - Require confirmation for actions
- ✅ **Bounds checking** - Prevent out-of-screen operations
- ✅ **Logging** - Track all automation actions
---
## 🚀 Quick Start
### Installation
First, install required dependencies:
```bash
pip install pyautogui pillow opencv-python pygetwindow
```
### Basic Usage
```python
from skills.desktop_control import DesktopController
# Initialize controller
dc = DesktopController(failsafe=True)
# Mouse operations
dc.move_mouse(500, 300) # Move to coordinates
dc.click() # Left click at current position
dc.click(100, 200, button="right") # Right click at position
# Keyboard operations
dc.type_text("Hello from OpenClaw!")
dc.hotkey("ctrl", "c") # Copy
dc.press("enter")
# Screen operations
screenshot = dc.screenshot()
position = dc.get_mouse_position()
```
---
## 📋 Complete API Reference
### Mouse Functions
#### `move_mouse(x, y, duration=0, smooth=True)`
Move mouse to absolute screen coordinates.
**Parameters:**
- `x` (int): X coordinate (pixels from left)
- `y` (int): Y coordinate (pixels from top)
- `duration` (float): Movement time in seconds (0 = instant, 0.5 = smooth)
- `smooth` (bool): Use bezier curve for natural movement
**Example:**
```python
# Instant movement
dc.move_mouse(1000, 500)
# Smooth 1-second movement
dc.move_mouse(1000, 500, duration=1.0)
```
#### `move_relative(x_offset, y_offset, duration=0)`
Move mouse relative to current position.
**Parameters:**
- `x_offset` (int): Pixels to move horizontally (positive = right)
- `y_offset` (int): Pixels to move vertically (positive = down)
- `duration` (float): Movement time in seconds
**Example:**
```python
# Move 100px right, 50px down
dc.move_relative(100, 50, duration=0.3)
```
#### `click(x=None, y=None, button='left', clicks=1, interval=0.1)`
Perform mouse click.
**Parameters:**
- `x, y` (int, optional): Coordinates to click (None = current position)
- `button` (str): 'left', 'right', 'middle'
- `clicks` (int): Number of clicks (1 = single, 2 = double)
- `interval` (float): Delay between multiple clicks
**Example:**
```python
# Simple left click
dc.click()
# Double-click at specific position
dc.click(500, 300, clicks=2)
# Right-click
dc.click(button='right')
```
#### `drag(start_x, start_y, end_x, end_y, duration=0.5, button='left')`
Drag and drop operation.
**Parameters:**
- `start_x, start_y` (int): Starting coordinates
- `end_x, end_y` (int): Ending coordinates
- `duration` (float): Drag duration
- `button` (str): Mouse button to use
**Example:**
```python
# Drag file from desktop to folder
dc.drag(100, 100, 500, 500, duration=1.0)
```
#### `scroll(clicks, direction='vertical', x=None, y=None)`
Scroll mouse wheel.
**Parameters:**
- `clicks` (int): Scroll amount (positive = up/left, negative = down/right)
- `direction` (str): 'vertical' or 'horizontal'
- `x, y` (int, optional): Position to scroll at
**Example:**
```python
# Scroll down 5 clicks
dc.scroll(-5)
# Scroll up 10 clicks
dc.scroll(10)
# Horizontal scroll
dc.scroll(5, direction='horizontal')
```
#### `get_mouse_position()`
Get current mouse coordinates.
**Returns:** `(x, y)` tuple
**Example:**
```python
x, y = dc.get_mouse_position()
print(f"Mouse is at: {x}, {y}")
```
---
### Keyboard Functions
#### `type_text(text, interval=0, wpm=None)`
Type text with configurable speed.
**Parameters:**
- `text` (str): Text to type
- `interval` (float): Delay between keystrokes (0 = instant)
- `wpm` (int, optional): Words per minute (overrides interval)
**Example:**
```python
# Instant typing
dc.type_text("Hello World")
# Human-like typing at 60 WPM
dc.type_text("Hello World", wpm=60)
# Slow typing with 0.1s between keys
dc.type_text("Hello World", interval=0.1)
```
#### `press(key, presses=1, interval=0.1)`
Press and release a key.
**Parameters:**
- `key` (str): Key name (see Key Names section)
- `presses` (int): Number of times to press
- `interval` (float): Delay between presses
**Example:**
```python
# Press Enter
dc.press('enter')
# Press Space 3 times
dc.press('space', presses=3)
# Press Down arrow
dc.press('down')
```
#### `hotkey(*keys, interval=0.05)`
Execute keyboard shortcut.
**Parameters:**
- `*keys` (str): Keys to press together
- `interval` (float): Delay between key presses
**Example:**
```python
# Copy (Ctrl+C)
dc.hotkey('ctrl', 'c')
# Paste (Ctrl+V)
dc.hotkey('ctrl', 'v')
# Open Run dialog (Win+R)
dc.hotkey('win', 'r')
# Save (Ctrl+S)
dc.hotkey('ctrl', 's')
# Select All (Ctrl+A)
dc.hotkey('ctrl', 'a')
```
#### `key_down(key)` / `key_up(key)`
Manually control key state.
**Example:**
```python
# Hold Shift
dc.key_down('shift')
dc.type_text("hello") # Types "HELLO"
dc.key_up('shift')
# Hold Ctrl and click (for multi-select)
dc.key_down('ctrl')
dc.click(100, 100)
dc.click(200, 100)
dc.key_up('ctrl')
```
---
### Screen Functions
#### `screenshot(region=None, filename=None)`
Capture screen or region.
**Parameters:**
- `region` (tuple, optional): (left, top, width, height) for partial capture
- `filename` (str, optional): Path to save image
**Returns:** PIL Image object
**Example:**
```python
# Full screen
img = dc.screenshot()
# Save to file
dc.screenshot(filename="screenshot.png")
# Capture specific region
img = dc.screenshot(region=(100, 100, 500, 300))
```
#### `get_pixel_color(x, y)`
Get color of pixel at coordinates.
**Returns:** RGB tuple `(r, g, b)`
**Example:**
```python
r, g, b = dc.get_pixel_color(500, 300)
print(f"Color at (500, 300): RGB({r}, {g}, {b})")
```
#### `find_on_screen(image_path, confidence=0.8)`
Find image on screen (requires OpenCV).
**Parameters:**
- `image_path` (str): Path to template image
- `confidence` (float): Match threshold (0-1)
**Returns:** `(x, y, width, height)` or None
**Example:**
```python
# Find button on screen
location = dc.find_on_screen("button.png")
if location:
x, y, w, h = location
# Click center of found image
dc.click(x + w//2, y + h//2)
```
#### `get_screen_size()`
Get screen resolution.
**Returns:** `(width, height)` tuple
**Example:**
```python
width, height = dc.get_screen_size()
print(f"Screen: {width}x{height}")
```
---
### Window Functions
#### `get_all_windows()`
List all open windows.
**Returns:** List of window titles
**Example:**
```python
windows = dc.get_all_windows()
for title in windows:
print(f"Window: {title}")
```
#### `activate_window(title_substring)`
Bring window to front by title.
**Parameters:**
- `title_substring` (str): Part of window title to match
**Example:**
```python
# Activate Chrome
dc.activate_window("Chrome")
# Activate VS Code
dc.activate_window("Visual Studio Code")
```
#### `get_active_window()`
Get currently focused window.
**Returns:** Window title (str)
**Example:**
```python
active = dc.get_active_window()
print(f"Active window: {active}")
```
---
### Clipboard Functions
#### `copy_to_clipboard(text)`
Copy text to clipboard.
**Example:**
```python
dc.copy_to_clipboard("Hello from OpenClaw!")
```
#### `get_from_clipboard()`
Get text from clipboard.
**Returns:** str
**Example:**
```python
text = dc.get_from_clipboard()
print(f"Clipboard: {text}")
```
---
## ⌨️ Key Names Reference
### Alphabet Keys
`'a'` through `'z'`
### Number Keys
`'0'` through `'9'`
### Function Keys
`'f1'` through `'f24'`
### Special Keys
- `'enter'` / `'return'`
- `'esc'` / `'escape'`
- `'space'` / `'spacebar'`
- `'tab'`
- `'backspace'`
- `'delete'` / `'del'`
- `'insert'`
- `'home'`
- `'end'`
- `'pageup'` / `'pgup'`
- `'pagedown'` / `'pgdn'`
### Arrow Keys
- `'up'` / `'down'` / `'left'` / `'right'`
### Modifier Keys
- `'ctrl'` / `'control'`
- `'shift'`
- `'alt'`
- `'win'` / `'winleft'` / `'winright'`
- `'cmd'` / `'command'` (Mac)
### Lock Keys
- `'capslock'`
- `'numlock'`
- `'scrolllock'`
### Punctuation
- `'.'` / `','` / `'?'` / `'!'` / `';'` / `':'`
- `'['` /
... (truncated)AI advertising agents that automates ad campaigns across Google Ads, Meta Ads, LinkedIn Ads, and TikTok Ads. Creates campaigns, reads live performance data, researches keywords with real CPC data, optimizes budgets, and manages ads through natural language via the Adspirer MCP server. 103 tools across 4 ad platforms.
Self-orchestrating multi-agent development workflows.
Complete guide for creating and deploying browser automation functions
Comprehensive guide for building AI workflows, agents