

# Playwright CLI: Browser Automation for Coding Agents

## Principle

When an AI agent needs to look at a webpage (take a snapshot, grab selectors, capture a screenshot), it shouldn't have to load thousands of tokens of DOM trees and tool schemas into its context window just to do that. Playwright CLI gives the agent a lightweight way to talk to a browser through simple shell commands, keeping the context window free for reasoning and code generation.

## Rationale

Playwright MCP is powerful, but it's heavy. Every interaction loads full accessibility trees and tool definitions into the LLM context. That's fine for complex, stateful flows where you need rich introspection. But for the common case ("open this page, tell me what's on it, take a screenshot") it's overkill.

Playwright CLI solves this by returning concise **element references** (`e15`, `e21`) instead of full DOM dumps. The result: ~93% fewer tokens per interaction, which means the agent can run longer sessions, reason more deeply, and still have context left for your actual code.

**The trade-off is simple:**

- **CLI** = fast, lightweight, stateless: great for quick looks at pages
- **MCP** = rich, stateful, full-featured: great for complex multi-step automation

TEA uses both where each shines (see `tea_browser_automation: "auto"`).

## Prerequisites

```bash
npm install -g @playwright/cli@latest   # Install globally (Node.js 18+)
playwright-cli install --skills         # Register as an agent skill
```

The global npm install is one-time. Run `playwright-cli install --skills` from your project root to register skills in `.claude/skills/` (works with Claude Code, GitHub Copilot, and other coding agents). Agents without skills support can use the CLI directly via `playwright-cli --help`. TEA documents this during installation but does not run it for you.

## How It Works

The agent interacts with the browser through shell commands. Each command is a single, focused action:

```bash
# 1. Open a page
playwright-cli -s=tea-explore open https://app.com/login

# 2. Take a snapshot: returns element references, not DOM trees
playwright-cli -s=tea-explore snapshot
# Output: [{ref: "e15", role: "textbox", name: "Email"},
#          {ref: "e21", role: "textbox", name: "Password"},
#          {ref: "e33", role: "button", name: "Sign In"}]

# 3. Interact using those references
playwright-cli -s=tea-explore fill e15 "user@example.com"
playwright-cli -s=tea-explore fill e21 "password123"
playwright-cli -s=tea-explore click e33

# 4. Capture evidence
playwright-cli -s=tea-explore screenshot --filename=login-flow.png

# 5. Clean up
playwright-cli -s=tea-explore close
```

The `-s=tea-explore` flag scopes everything to a named session, preventing state leakage between workflows.

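
An agent harness that drives these commands from code may find it easier to build argument lists than to concatenate shell strings. The following is a minimal sketch under that assumption; `buildCliArgs` is a hypothetical helper, not something `playwright-cli` ships, and the session-name check is an illustrative guard rather than a documented rule.

```typescript
// Hypothetical helper for illustration; it is not part of playwright-cli.
// It builds the argument list for one command scoped to a named session.
function buildCliArgs(session: string, command: string, ...args: string[]): string[] {
  // Assumption: treat empty names or names containing whitespace as mistakes.
  if (session.length === 0 || /\s/.test(session)) {
    throw new Error(`Invalid session name: "${session}"`);
  }
  // The -s=<name> form mirrors the shell examples above.
  return [`-s=${session}`, command, ...args];
}

// The login flow above, expressed as argument lists.
const exploreSession = 'tea-explore';
const loginFlow: string[][] = [
  buildCliArgs(exploreSession, 'open', 'https://app.com/login'),
  buildCliArgs(exploreSession, 'snapshot'),
  buildCliArgs(exploreSession, 'fill', 'e15', 'user@example.com'),
  buildCliArgs(exploreSession, 'click', 'e33'),
  buildCliArgs(exploreSession, 'close'),
];

// Print each command as it would appear on the command line.
for (const cliArgs of loginFlow) {
  console.log(['playwright-cli', ...cliArgs].join(' '));
}
```

Passing each list to something like Node's `execFile('playwright-cli', args)` avoids shell quoting problems when a fill value contains spaces or quotes.
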
## What TEA Uses It For

**Selector verification:** Before generating test code, TEA can snapshot a page to see the actual labels, roles, and names of elements. Instead of guessing that a button says "Login", it knows it says "Sign In":

```
snapshot ref {role: "button", name: "Sign In"}
→ generates: page.getByRole('button', { name: 'Sign In' })
```

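
The mapping from a snapshot entry to a locator can be sketched as a small pure function. This is an illustration only: the `SnapshotRef` shape is assumed from the example output in this document, and `toLocatorCode` is a hypothetical name, not how TEA is actually implemented.

```typescript
// Assumed shape of one snapshot entry, based on the example output above.
interface SnapshotRef {
  ref: string;
  role: string;
  name: string;
}

// Wrap text in single quotes, escaping backslashes and single quotes.
function quote(text: string): string {
  const escaped = text.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
  return `'${escaped}'`;
}

// Hypothetical generator: turn a snapshot entry into Playwright locator source code.
function toLocatorCode(entry: SnapshotRef): string {
  return `page.getByRole(${quote(entry.role)}, { name: ${quote(entry.name)} })`;
}

const signInButton: SnapshotRef = { ref: 'e33', role: 'button', name: 'Sign In' };
console.log(toLocatorCode(signInButton));
// prints: page.getByRole('button', { name: 'Sign In' })
```

The element reference (`e33`) is deliberately left out of the generated locator: references are only meaningful inside the CLI session, while role and accessible name carry over to test code.
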
**Page discovery:** During `test-design` exploratory mode, TEA snapshots pages to understand what's actually there, rather than relying only on documentation.

**Evidence collection:** During `test-review`, TEA can capture screenshots, traces, and network logs as evidence without the overhead of a full MCP session.

**Agent-side test debugging:** For existing failing Playwright tests, TEA should prefer Playwright's agent-facing debug loop over ad hoc manual reproduction. Run `npx playwright test --debug=cli` to step through the test in CLI mode (no GUI Inspector; designed for coding agents), then use `npx playwright trace ...` to inspect the resulting trace artifact from the command line. The `--debug=cli` flag (Playwright 1.59+) lets agents attach, step through execution, and inspect page state without ever opening a browser window.

## How CLI Relates to Playwright Utils and API Testing

CLI and playwright-utils are **complementary tools that work at different layers**:

|              | Playwright CLI                               | Playwright Utils                                 |
| ------------ | -------------------------------------------- | ------------------------------------------------ |
| **When**     | During test _generation_ (the agent uses it) | During test _execution_ (your test code uses it) |
| **What**     | Shell commands to observe your app           | Fixtures and helpers imported in test files      |
| **Examples** | `snapshot`, `screenshot`, `network`          | `apiRequest`, `auth-session`, `network-recorder` |

They work together naturally. The agent uses CLI to _understand_ your app, then generates test code that _imports_ playwright-utils:

```bash
# Agent uses CLI to observe network traffic on the dashboard page
playwright-cli -s=tea-discover open https://app.com/dashboard
playwright-cli -s=tea-discover network
# Output: GET /api/users → 200, POST /api/audit → 201, GET /api/settings → 200
playwright-cli -s=tea-discover close
```

```typescript
// Agent generates API tests using what it discovered, with playwright-utils
import { expect } from '@playwright/test';
import { test } from '@seontechnologies/playwright-utils/api-request/fixtures';

// Placeholder response type for this example; replace with your real model
type User = { id: string };

test('GET /api/users returns user list', async ({ apiRequest }) => {
  const { status, body } = await apiRequest<User[]>({
    method: 'GET',
    path: '/api/users',
  });

  expect(status).toBe(200);
  expect(body.length).toBeGreaterThan(0);
});
```

**For pure API testing** (no UI involved), `playwright-cli` browser commands (snapshot, screenshot, click) don't apply because there's no page. But **trace analysis is highly valuable**. Playwright captures full network traces for API tests (requests, responses, headers, timing), and the trace CLI lets the agent inspect them programmatically:

```bash
# API test fails in CI → open the trace artifact
npx playwright trace open test-results/api-users/trace.zip

# What HTTP call failed?
npx playwright trace requests --failed
# Output: #3 POST /api/users → 422 12ms

# Full request/response details (headers, body, timing)
npx playwright trace request 3

# What assertion failed and why?
npx playwright trace errors

# Done
npx playwright trace close
```

This gives the agent the full HTTP conversation (wrong payload, expired auth token, schema mismatch, upstream 5xx) without a human opening UI mode. The agent generates API tests directly from documentation, specs, or code analysis using `apiRequest` and `recurse` from playwright-utils, and uses trace analysis to diagnose failures.

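
To act on that output, an agent first has to turn a failed-request line into structured data. The sketch below assumes the line format matches the illustrative output shown above (`#3 POST /api/users → 422 12ms`); the real CLI output may differ, and both `parseFailedRequest` and the status-to-hint mapping in `triageHint` are hypothetical.

```typescript
interface FailedRequest {
  ordinal: number; // the "#3" used by `trace request <n>`
  method: string;
  path: string;
  httpStatus: number;
  durationMs: number;
}

// Assumption: one request per line, formatted like the example output above.
function parseFailedRequest(line: string): FailedRequest | null {
  const match = /^#(\d+)\s+([A-Z]+)\s+(\S+)\s+→\s+(\d{3})\s+(\d+)ms$/.exec(line.trim());
  if (match === null) {
    return null; // unknown format: let the caller fall back to the raw text
  }
  return {
    ordinal: Number(match[1]),
    method: match[2],
    path: match[3],
    httpStatus: Number(match[4]),
    durationMs: Number(match[5]),
  };
}

// Illustrative triage only; the mapping is an assumption, not a Playwright feature.
function triageHint(httpStatus: number): string {
  if (httpStatus === 401 || httpStatus === 403) return 'check auth token or permissions';
  if (httpStatus === 400 || httpStatus === 422) return 'check request payload against the schema';
  if (httpStatus >= 500) return 'upstream or server-side failure';
  return 'inspect the full request details';
}

const failedCall = parseFailedRequest('#3 POST /api/users → 422 12ms');
if (failedCall !== null) {
  console.log(`request ${failedCall.ordinal}: ${triageHint(failedCall.httpStatus)}`);
}
```

Returning `null` on an unrecognized line keeps the agent from acting on a half-parsed result if the output format changes.
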
**For E2E testing**, CLI shines at both ends: browser commands (snapshot, screenshot) during test generation, and trace analysis (actions, snapshots, requests) during debugging.

**Bottom line:** CLI helps the agent _write better tests_. Playwright-utils helps those tests _run reliably_. Trace analysis helps the agent _fix them when they break_.

## Session Isolation

Every CLI command targets a named session. This prevents workflows from interfering with each other:

```bash
# Workflow A uses one session
playwright-cli -s=tea-explore open https://app.com

# Workflow B uses a different session (can run in parallel)
playwright-cli -s=tea-verify open https://app.com/admin
```

For parallel safety (multiple agents on the same machine), append a unique suffix:

```bash
playwright-cli -s=tea-explore-<timestamp> open https://app.com
```

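
One way to produce such a suffix is sketched below. `uniqueSessionName` is a hypothetical helper; the random salt is an addition beyond the timestamp suggested above, included on the assumption that two agents could start within the same millisecond.

```typescript
// Hypothetical helper: derive a per-run session name from a base name.
// `now` and `salt` are parameters so the result can be made deterministic in tests.
function uniqueSessionName(
  base: string,
  now: number = Date.now(),
  salt: string = Math.random().toString(36).slice(2, 8) || '0',
): string {
  return `${base}-${now}-${salt}`;
}

// Each call yields a distinct session to pass via -s=<name>.
console.log(`-s=${uniqueSessionName('tea-explore')}`);
```
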
## Autonomous Trace Investigation (Playwright 1.59+)

For generated tests that already exist and are failing, Playwright 1.59 introduced CLI-native debugging and trace analysis designed specifically for AI agents. Instead of downloading traces and opening the GUI Trace Viewer, agents can now consume the entire trace context directly from the command line.

### Debug a Failing Test (CLI Mode)

```bash
# Start the test in CLI debug mode: no GUI Inspector, agent-friendly output
npx playwright test --debug=cli

# Attach to the debug session, then step through it
playwright-cli attach <session-id>
playwright-cli --session <session-id> step-over
```

With `--debug=cli`, the agent can:

- Step through test execution in real-time
- Inspect the page's HTML source at each step
- Review network calls and console logs at the moment of failure
- Capture before/after snapshots without opening a browser

### Investigate a Trace Artifact

```bash
# Open a trace from CI or local runs (this starts a session)
npx playwright trace open test-results/<run>/trace.zip

# List all actions as a numbered tree (# column = 1-based ordinal)
npx playwright trace actions
# Output:  #  Time     Action               Duration
#          1  0:00.00  navigate(...)        120ms
#          2  0:00.12  fill(#email, ...)    45ms
#          ...
#          9  0:01.50  expect(toBeVisible)  ✗ 30s

# Filter to failing assertions
npx playwright trace actions --grep="expect"

# Drill into action #9 (the ordinal from the list above)
npx playwright trace action 9

# See the page snapshot after that action (valid: before | input | after)
npx playwright trace snapshot 9 --name after

# Other useful subcommands
npx playwright trace errors                  # errors with stack traces
npx playwright trace requests --failed       # failed network requests
npx playwright trace console --errors-only   # console errors

# Close when done (removes extracted data)
npx playwright trace close
```

### Autonomous Diagnostic Loop

When TEA encounters a failing test in healing/review mode, the recommended investigation flow is:

1. **Run with `--debug=cli`** to step through the failure and identify the failing action
2. **Get a trace artifact**: configure `trace: 'retain-on-failure'` in `playwright.config.ts` (recommended), add `--trace=retain-on-failure` to the test run, or use an existing CI trace artifact. For `playwright-cli` sessions (not `--debug=cli`), use `tracing-start` / `tracing-stop` instead.
3. **Filter to assertions** (`trace actions --grep="expect"`) to find the failure point
4. **Inspect the snapshot** (`trace snapshot <n> --name after`) to see exact page state at failure
5. **Analyze network/console** to rule out backend issues or timing problems
6. **Propose a fix**: updated locator, added wait, or flagged flake for human review

This reduces Mean Time to Repair (MTTR) by giving the agent full failure context rather than just an error message.

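
The trace-inspection part of this loop (steps 3 to 5) can be laid out as an ordered command plan. The sketch below only assembles the subcommands already listed in this document; `traceInvestigationPlan` is a hypothetical helper, and how the commands are executed and how their output is read back is left to the agent harness.

```typescript
// Hypothetical helper: list the trace commands for one investigation, in order.
// `failingAction` is the 1-based ordinal from `trace actions`, once it is known.
function traceInvestigationPlan(tracePath: string, failingAction?: number): string[] {
  if (failingAction !== undefined && (!Number.isInteger(failingAction) || failingAction < 1)) {
    throw new Error(`Action ordinal must be a positive integer, got ${failingAction}`);
  }

  const plan: string[] = [
    `npx playwright trace open ${tracePath}`,
    'npx playwright trace actions --grep="expect"', // step 3: find the failure point
  ];

  if (failingAction !== undefined) {
    plan.push(
      `npx playwright trace action ${failingAction}`,
      `npx playwright trace snapshot ${failingAction} --name after`, // step 4: page state
    );
  }

  plan.push(
    'npx playwright trace requests --failed', // step 5: rule out backend issues
    'npx playwright trace console --errors-only',
    'npx playwright trace errors',
    'npx playwright trace close', // always release the extracted data
  );
  return plan;
}

// Placeholder path as used elsewhere in this document.
for (const command of traceInvestigationPlan('test-results/<run>/trace.zip', 9)) {
  console.log(command);
}
```

A real harness would typically run the first two commands, read the ordinal from their output, and only then request the action and snapshot. Paths containing spaces would also need quoting, which this sketch does not handle.
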
### When to Use Each Tool

- `playwright-cli` session commands remain the best lightweight tool for page exploration and selector verification.
- `npx playwright test --debug=cli` is better for stepping through an already-written failing test (agent-native, no GUI).
- `npx playwright trace ...` is better for understanding flakes and assertion failures from saved artifacts.

If your environment exposes the Playwright dashboard or bound-browser flow, it can help humans inspect what an agent is doing in the background, but TEA should treat that as optional observability rather than a hard dependency.

### Binding a Browser for Agent Inspection (`browser.bind()`)

Playwright 1.59 added `browser.bind()`, a programmatic API that makes a running browser instance available to `playwright-cli` and MCP clients. This is the bridge between "a test is running" and "an agent can see what the test sees."

```typescript
// In a test or fixture: bind the browser so playwright-cli can attach
const { endpoint } = await browser.bind('my-debug-session', {
  workspaceDir: process.cwd(),
});
// Now: playwright-cli attach my-debug-session
```

**When TEA uses this:**

- **Debugging a complex E2E failure:** A test fixture calls `browser.bind()` before the failing scenario, then TEA runs `playwright-cli attach` to inspect live page state, network, and console without re-running the test from scratch.
- **Bridging CLI and MCP:** A bound browser is accessible to both `playwright-cli` and `@playwright/mcp`. TEA's `auto` mode can start with lightweight CLI inspection and escalate to MCP if richer introspection is needed, all against the same browser instance.
- **CI artifact enhancement:** A CI helper can bind the browser during test runs, letting a post-failure agent attach and investigate before the process exits.

Call `await browser.unbind()` when done to release the session. The call is asynchronous and must be awaited.

## Command Quick Reference

| What you want to do       | Command                                          |
| ------------------------- | ------------------------------------------------ |
| Open a page               | `open <url>`                                     |
| See what's on the page    | `snapshot`                                       |
| Take a screenshot         | `screenshot [--filename=path]`                   |
| Click something           | `click <ref>`                                    |
| Type into a field         | `fill <ref> <text>`                              |
| Navigate                  | `goto <url>`, `go-back`, `reload`                |
| Mock a network request    | `route <pattern> --status=200 --body='...'`      |
| Start recording a trace   | `tracing-start`                                  |
| Stop and save the trace   | `tracing-stop`                                   |
| Save auth state for reuse | `state-save auth.json`                           |
| Load saved auth state     | `state-load auth.json`                           |
| See network requests      | `network`                                        |
| Manage tabs               | `tab-list`, `tab-new`, `tab-close`, `tab-select` |
| Close the session         | `close`                                          |

## When CLI vs MCP (Auto Mode Decision)

| Situation                             | Tool | Why                                |
| ------------------------------------- | ---- | ---------------------------------- |
| "What's on this page?"                | CLI  | One-shot snapshot, no state needed |
| "Verify this selector exists"         | CLI  | Single check, minimal tokens       |
| "Capture a screenshot for evidence"   | CLI  | Stateless capture                  |
| "Walk through a multi-step wizard"    | MCP  | State carries across steps         |
| "Debug why this test fails" (healing) | CLI  | `--debug=cli` + trace analysis     |
| "Record a drag-and-drop flow"         | MCP  | Complex interaction semantics      |

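
Read as a decision rule, the table above might be encoded roughly as follows. This is one interpretation of the six rows, not TEA's actual `auto` mode logic; the `TaskProfile` fields and the `chooseTool` function are hypothetical.

```typescript
type Tool = 'cli' | 'mcp';

// Assumed task attributes, inferred from the "Why" column of the table.
interface TaskProfile {
  debuggingExistingTest: boolean; // healing: --debug=cli + trace analysis
  stateAcrossSteps: boolean;      // e.g. a multi-step wizard
  complexInteraction: boolean;    // e.g. drag-and-drop
}

function chooseTool(task: TaskProfile): Tool {
  // Debugging a written test goes through the CLI debug and trace commands.
  if (task.debuggingExistingTest) return 'cli';
  // Stateful or semantically complex flows need MCP's richer introspection.
  if (task.stateAcrossSteps || task.complexInteraction) return 'mcp';
  // One-shot snapshots, selector checks, and screenshots stay on the CLI.
  return 'cli';
}

const wizardWalkthrough: TaskProfile = {
  debuggingExistingTest: false,
  stateAcrossSteps: true,
  complexInteraction: false,
};
console.log(chooseTool(wizardWalkthrough)); // prints: mcp
```
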
## Related Fragments

- `overview.md`: Playwright Utils installation and fixture patterns (the test code layer that CLI complements)
- `api-request.md`: Typed HTTP client for API tests (CLI discovers endpoints, apiRequest tests them)
- `api-testing-patterns.md`: Pure API test patterns (when CLI isn't needed)
- `auth-session.md`: Token management (CLI `state-save` informs auth-session usage)
- `selector-resilience.md`: Robust selector strategies (CLI verifies them against real DOM)
- `visual-debugging.md`: Trace viewer usage (CLI captures traces)