Browser Task Replay
Step through a bundled synthetic agent trace: click, wait, assertion, and the row where stated intent did not match what the agent actually did.
What this proves
A failing agent run is a logbook, not a verdict. The replay turns a synthetic trace into a row you can fix on Monday.
How it works
The agent and DevTools direction Chrome showed at I/O 2026 is interesting because it points toward browser work that can be inspected, replayed, and reviewed. Most failure reports show only the last screenshot. A trace shows the row where the agent went wrong.
This replay is the smallest possible version of that idea. Load one of three synthetic traces, step through the actions, and read the intent-versus-action diff inline.
What This Build Does
- Loads a bundled synthetic trace (search-flow, checkout-flow, settings-flow)
- Shows each step with selector, action, wait, and assertion
- Flags the rows where the agent's stated intent did not match what it did
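The three behaviours above can be sketched in a few lines. This is a minimal illustration, not the build's actual code: the `Step` field names, the split into `intended_action` and `actual_action`, and the sample `SEARCH_FLOW` rows are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One row of a synthetic trace (field names are hypothetical)."""
    selector: str          # element the agent targeted
    intended_action: str   # what the agent said it would do
    actual_action: str     # what the agent actually did
    wait_ms: int           # how long the agent waited before acting
    assertion: str         # check expected to hold after the step


# Hand-written fixture standing in for a bundled trace; no real site is involved.
SEARCH_FLOW = [
    Step("#search-input", "type", "type", 0, "input has value"),
    Step("#search-submit", "click", "click", 200, "results list visible"),
    Step("#result-1", "click", "hover", 500, "detail page visible"),
]


def flag_mismatches(trace):
    """Return the 1-based row numbers where stated intent differs from the action."""
    return [
        row
        for row, step in enumerate(trace, start=1)
        if step.intended_action != step.actual_action
    ]


def render(trace):
    """Format each step as one line, marking mismatched rows with '!'."""
    flagged = set(flag_mismatches(trace))
    lines = []
    for row, step in enumerate(trace, start=1):
        marker = "!" if row in flagged else " "
        lines.append(
            f"{marker} {row} {step.selector} "
            f"intent={step.intended_action} action={step.actual_action} "
            f"wait={step.wait_ms}ms assert={step.assertion}"
        )
    return lines
```

Calling `render(SEARCH_FLOW)` gives one line per step, and only the third line carries the `!` marker, because that is the one row where the stated intent (`click`) and the recorded action (`hover`) disagree. Comparing two plain strings is the simplest possible diff; a real build might compare richer structures.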
Safety Boundaries
- Synthetic traces only. Traces are hand-written sample fixtures. No real session recording, no cookies, no DOM dumps.
- No live replay. The stepper never opens a real browser or hits a real URL.
- No payment surface. The checkout-flow sample uses placeholder strings; nothing talks to a payments API.
What Would Come Next
- Add a per-step screenshot column once a synthetic capture pipeline lands
- Group failures by selector drift vs intent drift vs timing drift
- Export a one-row regression note that pastes into a bug tracker
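As a rough sketch of the grouping and export ideas in this list, assume each failure has already been labelled with one of the three drift kinds. How that labelling would actually be done is not specified here; the record shape, the `kind` values, and the note format below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical failure records; "kind" mirrors the three drift types named above.
FAILURES = [
    {"trace": "search-flow", "row": 3, "kind": "intent",
     "detail": "intended click, did hover"},
    {"trace": "checkout-flow", "row": 5, "kind": "selector",
     "detail": "#pay-btn not found"},
    {"trace": "settings-flow", "row": 2, "kind": "timing",
     "detail": "assertion ran before panel opened"},
    {"trace": "checkout-flow", "row": 7, "kind": "intent",
     "detail": "intended type, did click"},
]


def group_by_drift(failures):
    """Bucket failure records by their drift kind, preserving input order."""
    grouped = defaultdict(list)
    for failure in failures:
        grouped[failure["kind"]].append(failure)
    return dict(grouped)


def regression_note(failure):
    """Render one failure as a single line suitable for pasting into a bug tracker."""
    return (
        f"[{failure['kind']} drift] {failure['trace']} "
        f"row {failure['row']}: {failure['detail']}"
    )
```

For example, `regression_note(FAILURES[0])` produces `[intent drift] search-flow row 3: intended click, did hover`. Keeping the note to one line is a deliberate choice in this sketch so that it survives being pasted into any tracker's title or comment field.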