The Ultimate Guide to OpenClaw's Web Fetch Tool
The web_fetch tool is one of OpenClaw's two lightweight web tools (alongside web_search). It allows you to fetch any public HTTP or HTTPS URL and extract its readable content, converting HTML into clean markdown or plain text. This makes it ideal for quickly pulling documentation, blog posts, or articles into your workflow without the overhead of full browser automation.
How web_fetch Works
web_fetch performs a simple HTTP GET request to the specified URL. Once the raw HTML is retrieved, it uses content extraction engines to identify and return only the main, readable portion of the page—stripping away navigation menus, ads, footers, and other non-essential elements.
By default, web_fetch uses Mozilla's Readability library for extraction. If that fails, and if Firecrawl is configured, it will fall back to Firecrawl's advanced scraping engine, which can handle JavaScript-heavy sites and offers additional options like caching.
The output is returned as structured markdown or plain text, making it perfect for ingestion into prompts, summaries, or documentation workflows.
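The fetch-then-extract flow can be sketched in a few lines. This is a minimal illustration of the idea, not OpenClaw's actual implementation: a plain GET, followed by a naive extraction pass that drops navigation, script, and other non-content elements (real Readability-style extraction is considerably smarter).

```python
# Minimal sketch of the fetch-and-extract flow (illustrative only,
# not OpenClaw's internals): GET the page, then strip non-content tags.
import urllib.request
from html.parser import HTMLParser

class ContentExtractor(HTMLParser):
    """Collects text while skipping script, style, and nav-like elements."""
    SKIP = {"script", "style", "nav", "header", "footer", "aside"}

    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting depth inside skipped elements
        self.chunks = []  # readable text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0 and data.strip():
            self.chunks.append(data.strip())

def fetch_readable(url: str, timeout: int = 30) -> str:
    """Fetch a URL and return an approximation of its readable text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    parser = ContentExtractor()
    parser.feed(html)
    return "\n".join(parser.chunks)
```

The key design point is the same one web_fetch relies on: most boilerplate lives in predictable structural elements, so stripping those recovers the main content for the majority of static pages.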
Basic Usage
Using web_fetch is straightforward:
```json
{
  "url": "https://example.com/article"
}
```
By default, it returns markdown. You can optionally specify the extraction mode and set a character limit:
```json
{
  "url": "https://docs.openclaw.ai/intro",
  "extractMode": "text",
  "maxChars": 10000
}
```
When to Use web_fetch
Use web_fetch when you need to:
- Retrieve documentation or help articles
- Extract content from blogs or news sites
- Pull in static product pages or API references
- Avoid the complexity and slowdown of browser automation
web_fetch is not suitable for:
- Sites that require login or authentication
- Pages that rely heavily on JavaScript to load content
- Interacting with dynamic UI elements (buttons, forms, etc.)
For those cases, use the Browser tool instead.
Configuration
web_fetch is enabled by default. You can customize its behavior in your OpenClaw config:
```json
{
  "tools": {
    "web": {
      "fetch": {
        "enabled": true,
        "maxChars": 50000,
        "timeoutSeconds": 30,
        "cacheTtlMinutes": 15,
        "readability": true,
        "firecrawl": {
          "enabled": true,
          "apiKey": "your-firecrawl-key-here",
          "onlyMainContent": true
        }
      }
    }
  }
}
```
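Since every key in this block has a default, user config only needs to name the values it overrides. A sketch of that merge, assuming the defaults shown above (the merge helper is illustrative, not OpenClaw's loader):

```python
# Merge user-supplied fetch config over the documented defaults
# (illustrative sketch; OpenClaw's actual loader may differ).
DEFAULTS = {
    "enabled": True,
    "maxChars": 50000,
    "timeoutSeconds": 30,
    "cacheTtlMinutes": 15,
    "readability": True,
}

def merge_fetch_config(user: dict) -> dict:
    """Return the effective config: defaults overridden by user values."""
    cfg = dict(DEFAULTS)
    cfg.update(user)
    return cfg
```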
Key settings include:
- maxChars: Limits the size of the returned content
- timeoutSeconds: How long to wait before giving up on a slow site
- cacheTtlMinutes: How long to cache results (to avoid repeated fetches)
- firecrawl.apiKey: Enables advanced scraping with bot circumvention
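The cacheTtlMinutes setting implies a simple time-to-live cache keyed by URL. A minimal sketch of that idea (the cache shape is an assumption for illustration, not OpenClaw's internals):

```python
# Minimal TTL cache keyed by URL, as cacheTtlMinutes implies
# (illustrative sketch, not OpenClaw's implementation).
import time

class FetchCache:
    def __init__(self, ttl_minutes: int = 15):
        self.ttl = ttl_minutes * 60
        self.entries = {}  # url -> (fetch timestamp, content)

    def get(self, url: str):
        """Return cached content, or None if absent or expired."""
        entry = self.entries.get(url)
        if entry is None:
            return None
        stamp, content = entry
        if time.time() - stamp > self.ttl:
            del self.entries[url]  # expired: evict and refetch
            return None
        return content

    def put(self, url: str, content: str):
        self.entries[url] = (time.time(), content)
```

A short TTL keeps repeated fetches of the same page cheap while still picking up changes within minutes.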
Best Practices
- Always check the URL first—ensure it's publicly accessible and doesn't require login.
- Use caching wisely—frequently accessed pages should be cached to improve performance.
- Set reasonable character limits—avoid fetching extremely long pages unless necessary.
- Fall back to browser automation: if web_fetch returns incomplete or empty content, the page likely requires JavaScript.
Troubleshooting
If web_fetch returns an error or empty content:
- Verify the URL is correct and publicly accessible
- Check if the site blocks non-browser user agents
- Try the same URL in a browser to confirm it loads
- If JavaScript is required, switch to the Browser tool
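The user-agent check can be automated: a site that serves browsers but rejects generic clients will typically return an error status (often 403) to the default agent while succeeding for a browser-like one. This sketch separates the network call from the decision logic; the header values are assumptions for illustration:

```python
# Diagnose user-agent blocking by comparing status codes for a generic
# vs. a browser-like User-Agent (illustrative sketch; header values assumed).
import urllib.request
import urllib.error

def status_with_agent(url: str, user_agent: str) -> int:
    """Return the HTTP status code seen when fetching with a given User-Agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code

def is_agent_blocked(plain_status: int, browser_status: int) -> bool:
    """True if a browser-like User-Agent succeeds where a generic one fails."""
    return plain_status >= 400 and browser_status < 400
```

If the check comes back positive, the Firecrawl fallback (or the Browser tool) is the practical workaround.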
Conclusion
web_fetch is a fast, efficient way to bring web content into your OpenClaw workflows. By understanding its strengths and limitations, you can use it to streamline research, documentation, and content aggregation—keeping your automations lean and reliable.
For dynamic or login-protected sites, remember to use the full Browser tool instead.