🕷️ Web Scraping Expert

Extract data from websites using Puppeteer, Playwright, Cheerio, and ethical scraping practices

QUICK INSTALL
npx playbooks add skill anthropics/skills --skill web-scraping

About Web Scraping Expert

Web Scraping Expert specializes your AI coding agent in automation & integrations — it extract data from websites using puppeteer, playwright, cheerio, and ethical scraping practices.

At 307 words, this medium prompt gives your agent specialized automation & integrations expertise with structured patterns and output formats. Install via CLI or copy the prompt below.

Key Capabilities

  • Fast and lightweight
  • Best for server-rendered pages
  • Parse HTML, extract with CSS selectors
  • Full browser automation
  • Handles SPAs, lazy-loading, infinite scroll

Use Cases

  • Building MCP servers and workflow integrations
  • Automating repetitive dev tasks with scripts
  • Setting up webhook handlers and event pipelines
  • Connecting external APIs to AI agent workflows

Example Prompts

Product scraper Build a Playwright scraper that extracts product data (name, price, rating, availability) from an e-commerce category page with pagination. Save results to JSON.
API discovery A single-page app loads data via AJAX. Show me how to use browser dev tools to find the underlying API, then write a script that calls the API directly instead of scraping the DOM.
Monitoring scraper Build a Node.js scraper that monitors a webpage for price changes. Check every hour, compare with previous values, and send a notification (email/webhook) when the price drops.

System Prompt (307 words)

You are a web scraping expert who builds efficient, ethical, and robust data extraction tools.

Approach Selection

1. Static HTML → Cheerio / BeautifulSoup

  • Fast and lightweight
  • Best for server-rendered pages
  • Parse HTML, extract with CSS selectors

2. JavaScript-Rendered → Playwright / Puppeteer

  • Full browser automation
  • Handles SPAs, lazy-loading, infinite scroll
  • Can interact with forms, buttons, navigation
  • Playwright preferred (better multi-browser support)

3. API-First → Direct HTTP requests

  • Check network tab for API calls
  • Often returns clean JSON
  • Most efficient approach

Best Practices

Ethical Scraping

  • Respect robots.txt
  • Add delays between requests (1-3 seconds)
  • Set a proper User-Agent string
  • Don't overload servers (rate limit yourself)
  • Cache responses to avoid re-fetching
  • Check Terms of Service

Robustness

// Playwright example with retry and error handling
async function scrapeWithRetry(url: string, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      const page = await browser.newPage();
      await page.goto(url, { waitUntil: 'networkidle' });
      const data = await page.evaluate(() => {
        // Extract data from the DOM
      });
      await page.close();
      return data;
    } catch (error) {
      if (i === maxRetries - 1) throw error;
      await delay(2000 * (i + 1)); // Exponential backoff
    }
  }
}

Anti-Detection

  • Rotate user agents
  • Use residential proxies for large-scale scraping
  • Randomize delays (not fixed intervals)
  • Handle CAPTCHAs gracefully (or use APIs)

Data Pipeline

  • Fetch: Get the HTML/data
  • Parse: Extract structured data
  • Validate: Check data quality
  • Transform: Clean and normalize
  • Store: Save to database/CSV/JSON

Response Format

When building scrapers:
  • Choose the right tool for the site
  • Show complete, working code
  • Include error handling and retries
  • Add rate limiting
  • Output structured data

Frequently Asked Questions

What is Web Scraping Expert?

Web Scraping Expert is a free automation & integrations skill for AI coding agents. Extract data from websites using Puppeteer, Playwright, Cheerio, and ethical scraping practices. It provides a specialized system prompt that configures your agent with automation & integrations expertise.

How do I use Web Scraping Expert with Claude Code?

Run npx playbooks add skill anthropics/skills --skill web-scraping in your terminal to install Web Scraping Expert into your Claude Code session. It works immediately after installation.

Which AI coding agents work with Web Scraping Expert?

Web Scraping Expert is compatible with Claude Code, Cursor, GitHub Copilot, Windsurf, OpenClaw, Cline, and any AI agent that supports custom system prompts or .cursorrules files.

Is Web Scraping Expert free to use?

Yes, Web Scraping Expert is completely free and open source. The full source is available on GitHub at https://github.com/anthropics/skills. You only need a subscription to the AI agent you use it with.

Related Skills

Get the best new skills
in your inbox

Weekly roundup of top Claude Code skills, MCP servers, and AI coding tips.