Playwright CLI Setup Guide

Playwright CLI is a Claude Code skill that lets agents automate browser interactions — navigate pages, take screenshots, fill forms, and extract content. It's useful for capturing visual context, verifying web UI changes, and testing browser-based features during agent workflows.

Why Use It

AI agents work with code, but sometimes they need to see or interact with what the code produces. Common use cases with Thrum:

Installation

Playwright CLI runs as a Claude Code skill. There are two ways to set it up:

Option 1: Playwright CLI Skill (Recommended)

If you have the playwright-cli binary installed, you can use it via a Claude Code skill:

  1. Install the skill:

    playwright-cli install --skills

  2. Allow the skill in your project's .claude/settings.local.json:

    {
      "permissions": {
        "allow": ["Skill(playwright-cli)"]
      }
    }

  3. The skill gives agents access to Bash(playwright-cli:*) commands.
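
Before handing the skill to an agent, it can help to confirm that both halves of the setup above are in place. The following is a minimal sketch, not part of playwright-cli itself: the helper names binary_is_installed and skill_is_allowed are hypothetical, and the check is a plain text search for the permission string rather than a real JSON parse.

```shell
# Hypothetical helpers for sanity-checking the Option 1 setup.

# binary_is_installed [NAME]: succeeds if NAME (default: playwright-cli)
# can be found on PATH.
binary_is_installed() {
  command -v "${1:-playwright-cli}" >/dev/null 2>&1
}

# skill_is_allowed [FILE]: succeeds if FILE (default: the project's
# .claude/settings.local.json) exists and mentions the skill permission.
# This is a fixed-string search, so it does not validate the JSON structure.
skill_is_allowed() {
  settings="${1:-.claude/settings.local.json}"
  [ -f "$settings" ] && grep -qF 'Skill(playwright-cli)' "$settings"
}
```

Run from the project root, `binary_is_installed && skill_is_allowed && echo "ready"` prints "ready" only when the binary is on PATH and the permission entry is present.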

Option 2: Playwright MCP Plugin (Alternative)

The official Playwright MCP plugin from Microsoft provides browser tools directly to Claude Code. Note: this approach sends screenshots as base64 inline images, which consumes significantly more tokens than the CLI skill. The CLI skill is preferred for agent workflows.

  1. Install the plugin in Claude Code:

    claude plugin add playwright

  2. The plugin automatically registers MCP tools like browser_navigate, browser_click, browser_snapshot, and browser_take_screenshot.

  3. No additional configuration needed — the plugin runs npx @playwright/mcp@latest as the MCP server.

Core Commands

The MCP plugin and the CLI skill offer similar capabilities. The examples below use the CLI skill's commands:

Navigation

# Open a URL
playwright-cli open http://localhost:3000

# Navigate to a page
playwright-cli goto http://localhost:3000/dashboard

# Go back
playwright-cli back

Screenshots

# Screenshot the current viewport
playwright-cli screenshot

# Screenshot a specific element
playwright-cli screenshot --selector ".dashboard-header"

# Full-page screenshot
playwright-cli screenshot --full-page

# Save to a specific file
playwright-cli screenshot --filename dashboard.png

Page Inspection

# Get an accessibility snapshot (structured page content)
playwright-cli snapshot

# Evaluate JavaScript on the page
playwright-cli eval "document.title"
playwright-cli eval "document.querySelectorAll('.task-card').length"

Interaction

# Click an element (by reference from snapshot)
playwright-cli click e3

# Fill a text field
playwright-cli fill e5 "search query"

# Type text (triggers key handlers)
playwright-cli type "hello world"

# Press a key
playwright-cli press Enter

DevTools

# View console messages
playwright-cli console

# View network requests
playwright-cli network

# Start/stop tracing
playwright-cli tracing-start
playwright-cli tracing-stop --output trace.zip
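
When a trace should cover exactly one interaction, the start and stop commands can be wrapped around it. This is a sketch under a few assumptions: with_trace is a hypothetical helper name, the PLAYWRIGHT_CLI variable is an illustrative override rather than something playwright-cli reads, and the tracing commands are used exactly as shown above.

```shell
# with_trace OUTPUT COMMAND...: record a trace around COMMAND.
# Starts tracing, runs COMMAND, then always stops tracing and writes OUTPUT.
# Returns COMMAND's exit status (or 1 if tracing could not be started).
with_trace() {
  output="$1"
  shift
  # PLAYWRIGHT_CLI is an assumed override for illustration only.
  cli="${PLAYWRIGHT_CLI:-playwright-cli}"

  "$cli" tracing-start || return 1

  status=0
  "$@" || status=$?

  "$cli" tracing-stop --output "$output"
  return "$status"
}
```

For example, `with_trace trace.zip playwright-cli click e3` would record a trace of a single click and save it to trace.zip, even if the click itself fails.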

Usage with Thrum Agents

A common pattern for agents working on web UI tasks:

# 1. Agent claims a UI task
bd update <id> --status=in_progress
thrum send "Starting UI task <id>" --to @coordinator

# 2. Open the app and capture the "before" state
playwright-cli open http://localhost:65018
playwright-cli screenshot --filename before.png

# 3. Make code changes...

# 4. Reload and capture the "after" state
playwright-cli goto http://localhost:65018
playwright-cli screenshot --filename after.png

# 5. Verify the change in the page structure
playwright-cli snapshot    # check accessibility tree

# 6. Close and report
playwright-cli close
bd close <id>
thrum send "Completed <id> — UI verified with screenshots" --to @coordinator
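
Steps 2 and 4 above repeat the same load-and-screenshot pair, so an agent script might fold them into one function. The sketch below is only an illustration: pw and capture_state are hypothetical names, PLAYWRIGHT_CLI is an assumed override variable, and the function relies on a browser already being open.

```shell
# pw ARGS...: run the CLI, defaulting to the playwright-cli command.
# PLAYWRIGHT_CLI is an assumed override for illustration only.
pw() {
  "${PLAYWRIGHT_CLI:-playwright-cli}" "$@"
}

# capture_state URL LABEL: load URL in the already-open browser and save
# a viewport screenshot as LABEL.png. Skips the screenshot if loading fails.
capture_state() {
  url="$1"
  label="$2"
  pw goto "$url" && pw screenshot --filename "${label}.png"
}

# Possible use within the workflow above:
#   pw open http://localhost:65018
#   capture_state http://localhost:65018 before
#   ... make code changes ...
#   capture_state http://localhost:65018 after
#   pw close
```

Using one function for both captures keeps the "before" and "after" images comparable, since each is taken right after the same navigation step.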

Multiple Browser Sessions

Use named sessions to manage multiple browser windows:

# Open two sessions
playwright-cli -s=app open http://localhost:3000
playwright-cli -s=docs open http://localhost:8080

# Interact with a specific session
playwright-cli -s=app screenshot
playwright-cli -s=docs click e5

# Close a specific session
playwright-cli -s=app close
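
With several sessions open, the same command can be applied to each one in a loop. This is a rough sketch that reuses the -s=NAME form shown above; screenshot_sessions is a hypothetical helper name, and PLAYWRIGHT_CLI is an assumed override variable rather than something the CLI itself reads.

```shell
# screenshot_sessions NAME...: take a viewport screenshot in each named
# session, stopping at the first session whose screenshot fails.
screenshot_sessions() {
  for name in "$@"; do
    # PLAYWRIGHT_CLI is an assumed override for illustration only.
    "${PLAYWRIGHT_CLI:-playwright-cli}" -s="$name" screenshot || return 1
  done
}
```

Calling `screenshot_sessions app docs` would run the screenshot command once in the app session and once in the docs session.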

Tips

Further Reading