Web → code → Web
We seal a public webpage, ask an LLM agent to clone it from a static spec (no live URL access), and compare what comes back, pixel-for-pixel, to the original. This site is the side-by-side viewer for that loop.
Best & worst clones
All 60 cells: 10 bundles × 2 viewports × 3 steps · click any cell to compare
< 20%
20 – 45%
≥ 45%
Gallery
Every (bundle × agent) pair as a sortable hover-scrub thumbnail with diff%.
Compare
Full slider: iframe / screenshot / diff / code modes. Step through interactions.
Matrix
One bundle, all 6 viewport × step combinations side by side at once.
v2 viewer
Every sealed bundle with seal status + reference clone scores.
How it works
- Web: we capture a real page with Playwright at desktop and mobile viewports and three scroll positions. The DOM, accessibility tree, screenshots, and every network response get pinned to disk under wclone/<category>/<id>/sealed/.
- Code: an LLM agent reads only the public spec (agent_input/) and writes a single self-contained index.html that should look identical when rendered.
- Web: we render the agent's HTML under the same capture setup and score it: 50% visual SSIM, 30% DOM similarity, 5% interaction success, 15% LLM-as-judge. The viewer here surfaces those numbers and lets you scrub the slider yourself to see where the clone drifts.
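
The weighting in the last step can be sketched as a plain weighted sum. This is a minimal illustration, not the project's scoring code: it assumes each component is already normalized to the range 0 to 1 and that the four parts are combined linearly, and the names `WEIGHTS` and `overall_score` are hypothetical.

```python
# Weights taken from the description above (50 / 30 / 5 / 15); they sum to 1.
# The component names are illustrative assumptions, not the project's own keys.
WEIGHTS = {
    "visual_ssim": 0.50,
    "dom_similarity": 0.30,
    "interaction_success": 0.05,
    "llm_judge": 0.15,
}


def overall_score(components: dict[str, float]) -> float:
    """Combine per-component scores (each assumed to lie in [0, 1]) linearly."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        value = components[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


if __name__ == "__main__":
    example = {
        "visual_ssim": 0.8,
        "dom_similarity": 0.6,
        "interaction_success": 1.0,
        "llm_judge": 0.5,
    }
    # 0.5*0.8 + 0.3*0.6 + 0.05*1.0 + 0.15*0.5, roughly 0.705
    print(round(overall_score(example), 3))
```

Because interaction success carries only 5% of the weight under this reading, a clone that looks right but ignores interactions would lose at most 0.05 of the overall score.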