Visual Regression compares your live page against a previously approved screenshot (the baseline) and reports any differences. A score of 100 means nothing changed. Lower scores mean differences were detected, ranging from trivial rendering noise to major regressions.

You can run regression tests on demand, or set up a monitor that runs them automatically on a schedule and emails you when something changes.

One-off regression test

Open qaproof.io/app/tests (or, in WP admin, QAProof → Tests) → Visual Regression.
Enter the page URL.
Click Run Test. On the first run for this URL, the system captures a baseline. On every subsequent run, it compares against that baseline.

For automated long-term monitoring, use a Monitor instead — it does the same thing on a schedule.

Setting up a monitor

Open qaproof.io/app/monitors (or, in WP admin, QAProof → Monitors) → Add Monitor.
Enter the page URL, schedule (daily / weekly / monthly), notification email, and optional alert threshold (default: notify if score drops below 90).
Click Create. The first run captures the baseline. Subsequent scheduled runs are regression tests against it.

You can run a monitor on demand at any time with the Run button without waiting for the schedule.

What the score means

100 — no detectable differences. The page is identical to the baseline.
90–99 — trivial noise (sub-pixel rendering, anti-aliasing). Usually nothing actionable.
75–89 — minor confirmed changes (a word added, one button color tweaked).
50–74 — multiple changes or one significant layout/style regression.
25–49 — major regression affecting key sections.
<25 — page fundamentally broken or blank.
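
The bands above can be expressed as a small lookup, which may be handy if you post-process scores in your own tooling. This is a minimal sketch: the function name score_band and the returned strings are illustrative assumptions, not part of the product.

```python
def score_band(score: int) -> str:
    """Map a 0-100 regression score to the band described in this section.

    Hypothetical helper for illustration; the labels are paraphrased
    from the list above and are not official product strings.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score == 100:
        return "no detectable differences"
    if score >= 90:
        return "trivial noise"
    if score >= 75:
        return "minor confirmed changes"
    if score >= 50:
        return "multiple changes or one significant regression"
    if score >= 25:
        return "major regression"
    return "page fundamentally broken or blank"
```

For example, score_band(82) returns "minor confirmed changes".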

How the diff works

Three independent signals determine whether the page changed:

DOM text diff — every visible string on the baseline vs current. Catches text-only edits the screenshot diff might miss.
CSS color diff — extracted color tokens (--primary, --accent, button bg, link color, etc.) compared numerically. Catches palette shifts.
Pixel color diff — tile-based RGB comparison of the screenshots. Catches everything else.
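
To make the idea of a tile-based RGB comparison concrete, here is a minimal sketch. It assumes both screenshots are already decoded into rows of (r, g, b) tuples with identical dimensions. The tile size, the tolerance, and the function names are illustrative assumptions, not the values or code the product uses.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of (r, g, b) pixels


def tile_mean(img: Image, x0: int, y0: int, size: int) -> Tuple[float, ...]:
    """Average each RGB channel over the tile whose top-left corner is (x0, y0)."""
    pixels = [p for row in img[y0:y0 + size] for p in row[x0:x0 + size]]
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def changed_tiles(baseline: Image, current: Image,
                  size: int = 8, tolerance: float = 10.0) -> List[Tuple[int, int]]:
    """Return (column, row) indices of tiles whose mean color moved by more
    than `tolerance` on any channel. Both defaults are assumed values."""
    if len(baseline) != len(current) or any(
        len(a) != len(b) for a, b in zip(baseline, current)
    ):
        raise ValueError("images must have identical dimensions")
    height = len(baseline)
    width = len(baseline[0]) if height else 0
    changed = []
    for y0 in range(0, height, size):
        for x0 in range(0, width, size):
            a = tile_mean(baseline, x0, y0, size)
            b = tile_mean(current, x0, y0, size)
            if max(abs(ca - cb) for ca, cb in zip(a, b)) > tolerance:
                changed.append((x0 // size, y0 // size))
    return changed
```

Averaging per tile rather than comparing pixel by pixel is one common way to keep sub-pixel rendering noise from registering as a change, at the cost of missing very small edits inside a large tile.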

If all three signals report no change, the system returns a score of 100 without invoking AI, which is fast and carries no false-positive risk. If any signal fires, AI analysis runs to interpret the change and produce a human-readable summary.
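
The short-circuit described above can be sketched as follows. The Signals and Result types, the evaluate function, and the analyze callback are hypothetical names introduced for illustration; the real system's internals are not documented here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Signals:
    dom_text_changed: bool
    css_colors_changed: bool
    pixels_changed: bool


@dataclass
class Result:
    score: int
    summary: str
    ai_invoked: bool


def evaluate(signals: Signals, analyze: Callable[[Signals], Result]) -> Result:
    """Return 100 immediately when no signal fired; otherwise defer to
    `analyze`, a stand-in for the AI analysis step (assumed interface)."""
    any_change = (
        signals.dom_text_changed
        or signals.css_colors_changed
        or signals.pixels_changed
    )
    if not any_change:
        return Result(score=100, summary="No changes detected.", ai_invoked=False)
    return analyze(signals)
```

The point of this shape is that the expensive, interpretive step only ever sees captures where at least one deterministic check has already found a difference.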

Approving a real change

When you intentionally redesign a page, the next monitor run will report a regression. That's expected — the system has no way to know which changes you meant. Open the result, verify the changes are correct, and click Approve. This captures a fresh baseline so future runs use the new state as the reference.

"Capture appears unstable" results

Sometimes you'll see a regression result with score 100 and a summary like "Capture appears unstable — page height differs by 60% between baseline and current" or "… 4 image(s) failed to load during the current capture".

This is a deliberate guardrail. The system detected that the current capture is itself unreliable (for example, a JavaScript carousel rendered incorrectly or the image CDN rate-limited the capture) and refused to compare. Reporting a regression in that state would be misleading because the underlying site didn't change. Re-run the test in 1–2 minutes; it almost always succeeds on retry.
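
A guardrail of this kind might look like the sketch below. The two checks mirror the example summaries quoted above, but the Capture type, the function name, and both thresholds are assumptions made for illustration; the actual limits are not documented here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Capture:
    page_height: int    # rendered page height in pixels
    failed_images: int  # images that failed to load during the capture


MAX_HEIGHT_DELTA = 0.5  # assumed threshold: 50% height change
MAX_FAILED_IMAGES = 0   # assumed threshold: any failed image is suspicious


def instability_reason(baseline: Capture, current: Capture) -> Optional[str]:
    """Return a reason string if the current capture looks unreliable,
    or None if it is safe to compare against the baseline."""
    if baseline.page_height > 0:
        delta = abs(current.page_height - baseline.page_height) / baseline.page_height
        if delta > MAX_HEIGHT_DELTA:
            return f"page height differs by {delta:.0%} between baseline and current"
    if current.failed_images > MAX_FAILED_IMAGES:
        return f"{current.failed_images} image(s) failed to load during the current capture"
    return None
```

When a reason is returned, the comparison is skipped rather than scored, which is why retrying a minute or two later is usually enough.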

Email alerts

When a monitor result drops below your alert threshold, the plugin emails the address you configured. The email includes:

Score and label (Good / Needs Work / Poor).
Score breakdown (4 categories).
The top issue found.
Download Full Report button — single click downloads a PDF with all issues, recommendations, and screenshots.

You can disable monitor alerts globally in Profile → Email Notifications, or per-monitor in the Monitor edit dialog.
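
Put together, the alert decision depends on the score, the threshold, and the two on/off settings. A minimal sketch, assuming an alert fires only when the score is strictly below the threshold and both settings are enabled (the function and parameter names are hypothetical):

```python
def should_email(score: int, threshold: int = 90,
                 global_enabled: bool = True,
                 monitor_enabled: bool = True) -> bool:
    """Decide whether a monitor result triggers an email alert.

    The default threshold of 90 comes from the monitor setup step; the
    strict less-than comparison is an assumption based on the wording
    "drops below".
    """
    return global_enabled and monitor_enabled and score < threshold
```

Under these assumptions a score of exactly 90 does not trigger an alert with the default threshold, while 89 does.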

Result history

Each monitor keeps the last 50 results. Older results auto-purge. Open a monitor in QAProof → Monitors to see the timeline — green dots are passing runs, amber/red are failures, and you can drill into any past result.
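
The retention behavior is equivalent to a fixed-size buffer: once 50 results are stored, each new result pushes out the oldest. A minimal sketch of that idea using Python's collections.deque (the MonitorHistory class is a hypothetical illustration, not the product's storage layer):

```python
from collections import deque
from typing import List

HISTORY_LIMIT = 50  # from the text: each monitor keeps the last 50 results


class MonitorHistory:
    """Keeps only the most recent results; older ones drop off automatically."""

    def __init__(self, limit: int = HISTORY_LIMIT) -> None:
        self._results = deque(maxlen=limit)

    def record(self, score: int) -> None:
        # When the deque is full, appending discards the oldest entry.
        self._results.append(score)

    def timeline(self) -> List[int]:
        """Return stored scores, oldest first."""
        return list(self._results)
```

If you need results older than the retention window, download the report for a run before it ages out.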

Tips for stable monitors

Pick stable pages. Pages with frequently rotating content (live stock prices, social media feeds) will report constant "regressions" because the content really does change. Use dynamic-content masking (on by default) to hide timers and rotators, or pick less volatile pages.
Use a weekly schedule for low-traffic sites. WP-Cron only fires when someone visits the site, so scheduled runs on low-traffic WordPress installs can be delayed or skipped. Either set up a server-level cron job that requests wp-cron.php, or use a weekly schedule, which is more forgiving.
Recapture baselines after deploys. If you ship visual changes, hit Approve on the next regression result to refresh the baseline. Otherwise every future run will keep flagging the same intended change.