ClawMakers Team

Understanding OpenClaw Canvas: UI Automation for Mobile & Desktop

The OpenClaw Canvas automates graphical user interfaces on both mobile and desktop devices. Unlike full browser automation, Canvas works directly on the visual layer of a paired device, so it can interact with native apps, system interfaces, and custom GUI applications as they appear on screen.

What is the Canvas?

The Canvas is a GUI automation surface that renders the screen of a paired node (mobile or desktop) within the OpenClaw ecosystem. It allows you to programmatically interact with any visible element, making it ideal for tasks that require visual feedback or access to non-web applications.

Key capabilities include:

  • Rendering the UI of a paired device
  • Taking screenshots of the current state
  • Evaluating JavaScript in the context of the Canvas
  • Pushing A2UI elements to create hybrid interfaces
  • Automating interactions based on visual elements

Core Commands

canvas present

Displays the Canvas interface for a paired node. This is the first Canvas call in any automation workflow, made once the node is paired and connected.

{
  "action": "present",
  "node": "iPhone12"
}
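To generate requests like the one above from a script, a small helper can serialize them. This is a minimal sketch: `canvas_request` is a hypothetical name, and it only emits the fields shown in this article (`action` and `node`). How the resulting JSON is delivered to OpenClaw is not covered here.

```python
import json


def canvas_request(action: str, **fields: str) -> str:
    """Serialize a Canvas request as JSON.

    Hypothetical helper: it knows nothing about OpenClaw itself and
    simply merges the action name with whatever fields are passed in.
    """
    payload = {"action": action, **fields}
    return json.dumps(payload, indent=2)


# Reproduces the "present" request shown above.
print(canvas_request("present", node="iPhone12"))
```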

canvas snapshot

Captures the current state of the Canvas and returns an accessibility-tree-like representation that can be used to identify interactive elements by role, label, or position.

{
  "action": "snapshot",
  "node": "MacBookPro"
}
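Once a snapshot comes back, picking out elements by role or label is a tree search. The sketch below assumes each node is a dictionary with `role`, `label`, and `children` keys. That shape is an assumption made for illustration, and the real snapshot output may use different names.

```python
from typing import Any, Dict, Iterator, Optional


def find_elements(
    node: Dict[str, Any],
    role: Optional[str] = None,
    label: Optional[str] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield nodes matching the given role and/or label, depth-first."""
    role_ok = role is None or node.get("role") == role
    label_ok = label is None or node.get("label") == label
    if role_ok and label_ok:
        yield node
    for child in node.get("children", []):
        yield from find_elements(child, role, label)


# Hypothetical snapshot; the key names are assumptions, not OpenClaw's schema.
snapshot = {
    "role": "window", "label": "Settings", "children": [
        {"role": "button", "label": "Wi-Fi", "children": []},
        {"role": "button", "label": "Bluetooth", "children": []},
        {"role": "group", "label": "General", "children": [
            {"role": "button", "label": "About", "children": []},
        ]},
    ],
}

print([n["label"] for n in find_elements(snapshot, role="button")])
# prints ['Wi-Fi', 'Bluetooth', 'About']
```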

canvas eval

Executes JavaScript code within the Canvas environment, allowing for dynamic manipulation of the displayed content or execution of complex logic.

{
  "action": "eval",
  "javaScript": "document.getElementById('button').click();"
}
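When the script is assembled from variables, quoting becomes the main risk: an element id containing a quote would break a hand-written string. One way to avoid that, sketched here with a hypothetical helper name, is to let `json.dumps` produce the JavaScript string literal. The request fields are the ones shown above.

```python
import json


def click_by_id_request(element_id: str) -> str:
    """Build an eval request that clicks the element with the given id."""
    # json.dumps returns a double-quoted, escaped literal that is also a
    # valid JavaScript string, so quotes or backslashes in element_id
    # cannot terminate the string early.
    script = f"document.getElementById({json.dumps(element_id)}).click();"
    return json.dumps({"action": "eval", "javaScript": script}, indent=2)


print(click_by_id_request("button"))
```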

canvas a2ui_push

Pushes A2UI (Agent-to-UI) elements to the Canvas, enabling the creation of custom interfaces that blend agent logic with user interaction.

Common Use Cases

Mobile App Automation

Automate native mobile applications that don't have APIs or web interfaces. This is particularly useful for apps that require visual verification or have complex gesture-based interactions.

Desktop UI Testing

Perform automated testing of desktop applications by simulating user interactions and validating visual outputs.

Hybrid Interfaces

Create interfaces that combine the power of AI agents with traditional UI elements, allowing users to interact with agents through familiar visual controls.

Accessibility Automation

Automate tasks for users with accessibility needs by creating custom navigation flows that work with screen readers and other assistive technologies.

Security Considerations

Canvas operations require explicit permission from the user, as they provide direct access to the device's display. Always ensure that Canvas automation is performed in a secure context and that sensitive information is not exposed through screenshots or logs.
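One practical way to keep sensitive values out of logs is to redact snapshot data before writing it anywhere. The sketch below walks any nested structure of dictionaries, lists, and strings. The patterns are illustrative only (email addresses and 13 to 16 digit numbers), and the snapshot shape in the example is an assumption rather than OpenClaw's actual output.

```python
import re
from typing import Any

# Illustrative patterns only; adapt them to what is sensitive in your setup.
SENSITIVE = re.compile(r"\b\d{13,16}\b|[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(value: Any) -> Any:
    """Return a copy of value with sensitive substrings masked."""
    if isinstance(value, str):
        return SENSITIVE.sub("[REDACTED]", value)
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, dict):
        return {key: redact(item) for key, item in value.items()}
    return value  # numbers, booleans, None pass through unchanged


# Hypothetical snapshot fragment.
field = {"role": "textfield", "label": "Email", "value": "ana@example.com"}
print(redact(field))
```

Text-based redaction does not cover screenshots, which need to be restricted or masked separately.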

Getting Started

  1. Pair your device with OpenClaw
  2. Use nodes status to verify the connection
  3. Call canvas present to initialize the Canvas
  4. Use canvas snapshot to analyze the current UI state
  5. Perform actions using canvas act, passing element references taken from the snapshot
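Steps 3 to 5 can be sketched as a single function. Everything about transport is left to the caller: `send` is a hypothetical callable that delivers one request and returns the response. Because this article does not show the fields of a `canvas act` request, the caller also supplies a function that builds that request from the snapshot. Pairing and `nodes status` (steps 1 and 2) are assumed to have been done already.

```python
from typing import Any, Callable, Dict, List

Request = Dict[str, Any]


def run_canvas_flow(
    send: Callable[[Request], Any],
    node: str,
    choose_action: Callable[[Any], Request],
) -> Any:
    """Present the Canvas, snapshot it, then send the chosen action."""
    send({"action": "present", "node": node})
    snapshot = send({"action": "snapshot", "node": node})
    # The act request is built by the caller; its fields are not
    # documented in this article, so none are assumed here.
    return send(choose_action(snapshot))


# Dry run with a fake transport that only records what would be sent.
sent: List[Request] = []


def fake_send(request: Request) -> Dict[str, Any]:
    sent.append(request)
    return {"ok": True}


run_canvas_flow(fake_send, "iPhone12", lambda snap: {"action": "act"})
print([r["action"] for r in sent])  # prints ['present', 'snapshot', 'act']
```

Injecting `send` keeps the ordering logic testable without a paired device.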

The Canvas extends automation beyond what browser-based tools can reach, covering native mobile and desktop interfaces and broadening the range of tasks OpenClaw agents can handle.
