Crawl engine

Crawl your whole site in the cloud

Spider from a seed URL or feed a sitemap, render JavaScript with headless Chromium, and stream every result in real time, up to 500,000 URLs per crawl on Agency. No desktop install, no local memory ceiling.

Start crawling free All features

CrawlX

acme-outfitters.com · spider mode

live

Crawl

Live results

Pages 84k

Issues 1,204

Render diff

Engine

JS rendering

Schedules

crawling · 84,213 / 100k

URLs crawled

+ live

In queue

discovering

Avg response

0.46s

healthy

Issues found

38 critical

URL	Status	Issue	Load
/products/trail-runner-gtx	200	-	0.24s
/collections/new-arrivals	200	-	0.31s
/account/orders (SPA)	200	JS-rendered	0.71s
/blog/how-to-lace-boots	200	Thin content	0.88s
/products/sku-2019-retired	404	Broken link	1.10s
/products/canonical-dupe	200	Bad canonical	0.42s

Two ways in, nothing missed

Spider the whole site, or crawl just your sitemap

In spider mode, CrawlX starts at a seed URL and follows links recursively across the site — surfacing orphan pages that no sitemap ever lists. In sitemap mode, it crawls exactly the URLs you publish. Either way it respects robots.txt and crawl-delay, so you stay a good citizen of your own infrastructure.

Spider from a seed · Sitemap & sitemap-index · Finds orphan pages · Respects robots.txt · Honors crawl-delay

Crawl configuration

● ready

Spider mode

Recursive from a seed URL

Sitemap mode

/sitemap_index.xml · 12,408 URLs

/robots.txt

Disallow: /cart → respected

Disallow: /search → respected

Crawl-delay: 1s → honored
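As a rough illustration of what respecting robots.txt and crawl-delay involves, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not CrawlX's implementation; the user agent name `ExampleCrawler` is hypothetical, and the rules simply mirror the panel above.

```python
from urllib.robotparser import RobotFileParser

# Rules mirroring the configuration panel above (crawl-delay is given in seconds).
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart
Disallow: /search
Crawl-delay: 1
"""

AGENT = "ExampleCrawler"  # hypothetical user agent, for illustration only
SITE = "https://acme-outfitters.com"

# Parse the rules from a string instead of fetching them over the network.
rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())


def allowed(path: str) -> bool:
    """Return True if the rules permit AGENT to fetch this path."""
    return rules.can_fetch(AGENT, SITE + path)


if __name__ == "__main__":
    for path in ("/products/trail-runner-gtx", "/cart", "/search?q=boots"):
        print(path, "allowed" if allowed(path) else "disallowed")
    print("delay between requests:", rules.crawl_delay(AGENT), "s")
```

A polite crawler would check `allowed()` before each request and pause for the crawl-delay between requests to the same host.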

JavaScript rendering & render diff

See what your users see — not just what ships in the HTML

CrawlX renders every page in headless Chromium, then compares the raw HTML you serve against the JS-rendered DOM. The render diff flags content, links, and canonicals that appear only after rendering, the gaps that can cost SPAs their rankings.

Headless Chromium · Catches SPA content · Raw vs rendered DOM · Late canonicals · Hidden internal links

Render diff · /products/trail-runner-gtx

5 changes

Raw HTML

<title>Acme Outfitters</title>

links: 4

words: 0

JS-rendered DOM

<title>Trail Runner GTX…</title>

<h1>Trail Runner GTX</h1>

links: 22

words: 642

−<title>Acme Outfitters

+<title>Trail Runner GTX — Acme Outfitters

+ canonical: https://acme-outfitters.com/products/trail-runner-gtx
+ 18 internal links: related products, breadcrumb, reviews
+ product copy: 642 words injected by the storefront SPA
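To make the idea of a render diff concrete, the sketch below summarizes two HTML strings (title, canonical, link count, word count) with Python's standard `html.parser` and reports the fields that differ. It assumes the rendered DOM has already been captured by a headless browser; the class and function names are hypothetical and the sample markup is shortened for illustration.

```python
from html.parser import HTMLParser


class PageSummary(HTMLParser):
    """Collects the title, canonical URL, link count, and visible word count."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.canonical = None
        self.links = 0
        self.words = 0
        self._current = None  # set while inside <title>, <script>, or <style>

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag in ("title", "script", "style"):
            self._current = tag
        elif tag == "a" and attr.get("href"):
            self.links += 1
        elif tag == "link" and attr.get("rel") == "canonical":
            self.canonical = attr.get("href")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current is None:  # skip script and style contents
            self.words += len(data.split())


def summarize(html: str) -> dict:
    parser = PageSummary()
    parser.feed(html)
    return {"title": parser.title, "canonical": parser.canonical,
            "links": parser.links, "words": parser.words}


def render_diff(raw_html: str, rendered_html: str) -> dict:
    """Return {field: (raw_value, rendered_value)} for every field that differs."""
    raw, rendered = summarize(raw_html), summarize(rendered_html)
    return {k: (raw[k], rendered[k]) for k in raw if raw[k] != rendered[k]}


RAW = ('<html><head><title>Acme Outfitters</title></head>'
       '<body><div id="app"></div><script>boot()</script></body></html>')
RENDERED = ('<html><head><title>Trail Runner GTX</title>'
            '<link rel="canonical" '
            'href="https://acme-outfitters.com/products/trail-runner-gtx"></head>'
            '<body><h1>Trail Runner GTX</h1>'
            '<a href="/collections/new-arrivals">New arrivals</a></body></html>')

diff = render_diff(RAW, RENDERED)

if __name__ == "__main__":
    for field, (before, after) in diff.items():
        print(f"{field}: {before!r} -> {after!r}")
```

Any field that shows up only on the rendered side is content a crawler reading raw HTML alone would not see.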

Scheduled crawl

active

Next run

Mon, Jun 8 · 06:00 UTC

Email alert on completion to the team

Cadence

Daily · Weekly · Monthly

Last 3 runs

Jun 1 · 12,408 URLs · 1,204 issues

May 25 · 12,390 URLs · 1,261 issues

May 18 · 12,377 URLs · 1,318 issues

Scheduled crawls

Set it once, watch the trend line

Schedule a crawl to run daily, weekly, or monthly and CrawlX handles the rest in the cloud — no machine left running overnight. Every run lands an email alert on completion, so you catch regressions the morning they happen instead of the week you go looking.

Daily, weekly, monthly · Email on completion · Run history · Cloud, always on · Regression tracking
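Regression tracking across scheduled runs comes down to comparing each run with the one before it. The sketch below does that for the three runs shown in the run history above; the `Run` type and function names are hypothetical, and a real report would likely break the numbers down by issue type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    date: str
    urls: int
    issues: int


# The three runs from the run history above, oldest first.
HISTORY = [
    Run("May 18", 12_377, 1_318),
    Run("May 25", 12_390, 1_261),
    Run("Jun 1", 12_408, 1_204),
]


def issue_deltas(runs: list[Run]) -> list[tuple[str, int]]:
    """Change in issue count for each run relative to the previous run."""
    return [(cur.date, cur.issues - prev.issues)
            for prev, cur in zip(runs, runs[1:])]


def regressions(runs: list[Run]) -> list[str]:
    """Dates of runs where the issue count went up."""
    return [date for date, delta in issue_deltas(runs) if delta > 0]


if __name__ == "__main__":
    for date, delta in issue_deltas(HISTORY):
        print(f"{date}: {delta:+d} issues")
    print("regressions:", regressions(HISTORY) or "none")
```

For this history both deltas are negative, so no run would be flagged as a regression.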

The crawl engine

Built to crawl anything

Six capabilities that let one cloud crawl stand in for a rack of desktop machines.

Spider mode

Recursively follows links from a seed URL to map your whole site — including orphan pages no sitemap ever lists.

Sitemap mode

Point CrawlX at an XML sitemap (or a sitemap index) and crawl exactly the URLs you publish, nothing more.

JavaScript rendering

A headless Chromium engine executes your JS and crawls the rendered DOM, so SPA and client-rendered content is never missed.

Scheduled crawls

Run daily, weekly, or monthly on autopilot and get an email the moment a crawl finishes — track regressions over time.

Real-time results

Watch URLs stream in one by one as they're crawled, with status, issues, and load time — no waiting for a batch to finish.

Crawl at scale

Up to 500,000 URLs per crawl on Agency. It runs in the cloud, so there's no desktop install and no local memory ceiling.
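The difference between spider mode and sitemap mode can be sketched with a breadth-first traversal over a small, made-up link graph: follow links from a seed, then compare what was reached against what the sitemap lists. The graph, the sitemap set, and the function name are all hypothetical; a real crawler would fetch pages and extract links rather than read them from a dictionary.

```python
from collections import deque

# Hypothetical internal link graph: page -> pages it links to.
LINKS = {
    "/": ["/collections/new-arrivals", "/blog/how-to-lace-boots"],
    "/collections/new-arrivals": ["/products/trail-runner-gtx"],
    "/blog/how-to-lace-boots": ["/products/sku-2019-retired"],
    "/products/trail-runner-gtx": ["/"],
    "/products/sku-2019-retired": [],
}

# Hypothetical set of URLs published in the sitemap.
SITEMAP = {
    "/",
    "/collections/new-arrivals",
    "/products/trail-runner-gtx",
    "/account/orders",
}


def spider(seed: str, links: dict[str, list[str]]) -> set[str]:
    """Breadth-first traversal: every page reachable by following links from seed."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:  # visit each page once, even in link cycles
                seen.add(target)
                queue.append(target)
    return seen


found = spider("/", LINKS)
not_in_sitemap = found - SITEMAP   # reached by links, never listed in the sitemap
never_reached = SITEMAP - found    # listed in the sitemap, no link path from the seed

if __name__ == "__main__":
    print("spidered only:", sorted(not_in_sitemap))
    print("sitemap only:", sorted(never_reached))
```

Running both modes and comparing the two sets is one way to see which pages each mode would miss on its own.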

Honest comparison

Cloud crawling vs the desktop tools

Screaming Frog and Sitebulb are excellent — here's where running in the cloud wins, and where they still lead.

Capability	CrawlX	Screaming Frog	Sitebulb
Runs in the cloud (no install)	Cloud	Desktop	Desktop
Local memory ceiling on big sites	None	RAM-bound	RAM-bound
JavaScript rendering (headless Chromium)	Yes	Yes	Yes
Real-time streaming results	Yes	Partial	Partial
Render diff (raw vs rendered)	Yes	Manual	Manual
Scheduled crawls + email alerts	Built-in	Scheduling add-on	Yes
Visual crawl maps	On roadmap	Yes	Mature

Keep exploring

Explore more features

The crawl is step one. Here's what CrawlX does with everything it finds.

Point it at a site.
Watch it crawl.

No install, no memory ceiling. Your first cloud crawl runs free — 500 URLs, no credit card.

Start crawling free See pricing

Explore more features

Impact triage & 65+ checks

AI: fixes, content & schema

Technical-SEO toolkit

Integrations & API

Reports & collaboration

All features
