JavaScript required

The WcodeW viewer renders pixel diffs and slider interactions in the browser, so JS has to be on. The data itself is available without JS at wclone-export.csv, wclone-seeds.json, and wclone-diff-index.json.

Web → code → Web

We let an agent inspect a captured real website, write a reconstruction, then score the result against the original across visual fidelity, DOM/content structure, and interaction behavior. This site is the static viewer for the current realistic benchmark.


Task Browser

All realistic benchmark tasks, interaction scoring cases, submissions, and score breakdowns.


Compare

Slider view over every captured/scored interaction step, with reference vs agent screenshots.


Diff View

Step through the visual mismatch overlay for reconstructed website states.


Data Index

Static JSON index containing task metadata, interactions, submissions, and score summaries.


How it works

Website interaction — each task packages a captured real website plus replay states and private interaction/scoring cases under wclone-realistic-benchmark/.
Code reconstruction — the agent uses the available website evidence to write a new HTML reconstruction rather than copying the server artifact verbatim.
Scoring — the scorer compares reference and agent states across visual similarity, DOM/content fidelity, and explicit interaction outcomes. The viewer exposes the task, interaction cases, submissions, score breakdowns, and per-step visual comparisons.
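The three scoring axes above can be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual scorer: the function names, the `StepScore` fields, the per-channel tolerance, and the equal weighting are all assumptions made for the example, and screenshots are modeled as plain nested lists of RGB tuples to keep it dependency-free.

```python
from dataclasses import dataclass


def pixel_mismatch_ratio(reference, candidate, tolerance=0):
    """Return the fraction of pixels that differ between two screenshots.

    Both images are lists of rows, each row a list of (r, g, b) tuples.
    A pixel counts as mismatched when any channel differs by more than
    `tolerance` (a hypothetical knob, not taken from the benchmark).
    """
    if len(reference) != len(candidate) or any(
        len(ref_row) != len(cand_row)
        for ref_row, cand_row in zip(reference, candidate)
    ):
        raise ValueError("screenshots must have identical dimensions")

    total = 0
    mismatched = 0
    for ref_row, cand_row in zip(reference, candidate):
        for ref_px, cand_px in zip(ref_row, cand_row):
            total += 1
            if any(abs(a - b) > tolerance for a, b in zip(ref_px, cand_px)):
                mismatched += 1
    return mismatched / total if total else 0.0


@dataclass
class StepScore:
    """Hypothetical per-step breakdown; each component lies in [0, 1]."""
    visual: float
    dom: float
    interaction: float


def combined_score(step):
    """Unweighted mean of the three axes (an assumed aggregation rule)."""
    return (step.visual + step.dom + step.interaction) / 3


# Tiny 2x2 reference vs. agent screenshot for one interaction step.
reference = [[(0, 0, 0), (255, 255, 255)], [(10, 10, 10), (20, 20, 20)]]
candidate = [[(0, 0, 0), (250, 255, 255)], [(10, 10, 10), (90, 20, 20)]]

mismatch = pixel_mismatch_ratio(reference, candidate, tolerance=8)
step = StepScore(visual=1.0 - mismatch, dom=0.5, interaction=1.0)
print(f"mismatch={mismatch:.2f} combined={combined_score(step):.2f}")
```

Keeping the per-axis values in a breakdown, rather than only a single number, mirrors what the viewer shows: a reconstruction can look right while failing an interaction case, and the components make that visible.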


Best & worst realistic reconstructions

Checkpoint diff grid: current wclone sample × 2 viewports × 3 steps.
