Check whether your site is discoverable to ChatGPT, Claude, Perplexity, and Google
Paste your URL. The demo runs real checks against your homepage, robots.txt, sitemap.xml, and llms.txt, then hands you copy-ready prompts to verify whether each LLM channel can actually answer about your site.
- Time: 60 seconds for the auto report; 2 to 3 minutes if you also run the manual channel checks
- Cost: $0
- Stack: Next.js route handler, `fetch`, robots.txt parser, JSON-LD detector
You’re stuck with
You shipped a site or changed its content. You have no idea whether AI search surfaces can crawl it, cite it, or send traffic.
You end up with
A readiness report covering automatic crawlability signals plus an assisted verification pass for each LLM channel. If something is broken or blocked, you get a specific next action.
The recipe
What the demo above actually does
Type a URL and the API hits your origin in parallel. Every result you see comes from a real network request, not a precomputed table.
- Homepage: real `GET` against the URL. Status code, redirect chain, byte size of the returned HTML.
- robots.txt: real fetch from `/robots.txt`. Parsed for User-agent groups, Disallow rules, and Sitemap directives. Each major bot or product token is resolved against named rules or the `User-agent: *` fallback, so you can see exactly which are allowed, blocked, partially blocked, or unspecified.
- sitemap.xml: real fetch from the Sitemap line in robots.txt if present, otherwise from `/sitemap.xml`. Counts URL entries or detects a sitemap index.
- llms.txt: real fetch from `/llms.txt`. Reports size and non-empty lines.
- HTML signals: the homepage HTML is parsed for `<title>`, `<meta name="description">`, Open Graph tags, JSON-LD blocks (with `@type` extraction), and `<link rel="canonical">`.
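The robots.txt resolution step above can be sketched as a small pure function. This is a simplified sketch, not the demo's actual parser: it assumes consecutive `User-agent` lines share the rule block that follows them, and it only tracks `Disallow` rules.

```typescript
type BotStatus = "allowed" | "blocked" | "partial" | "unspecified";

// Parse robots.txt into { agent token -> disallow paths } groups.
// Simplified: consecutive User-agent lines share the following rule block.
function parseRobots(text: string): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  let agents: string[] = []; // agents collecting the current rule block
  let sawRule = false;       // has the current block received a rule yet?
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.replace(/#.*/, "").trim();
    const idx = line.indexOf(":");
    if (idx < 0) continue;
    const key = line.slice(0, idx).trim().toLowerCase();
    const value = line.slice(idx + 1).trim();
    if (key === "user-agent") {
      if (sawRule) { agents = []; sawRule = false; } // a new group starts
      const token = value.toLowerCase();
      agents.push(token);
      if (!groups.has(token)) groups.set(token, []);
    } else if (key === "disallow" || key === "allow") {
      sawRule = true;
      if (key === "disallow") {
        for (const a of agents) groups.get(a)!.push(value);
      }
    }
  }
  return groups;
}

// Resolve one bot token against its named group or the "*" fallback.
function botStatus(groups: Map<string, string[]>, token: string): BotStatus {
  const rules = groups.get(token.toLowerCase()) ?? groups.get("*");
  if (rules === undefined) return "unspecified"; // no group matches at all
  if (rules.some((r) => r === "/")) return "blocked";  // Disallow: /
  if (rules.some((r) => r !== "")) return "partial";   // some paths blocked
  return "allowed"; // only empty Disallow lines (or no Disallow at all)
}
```

With `User-agent: GPTBot` / `Disallow: /` in the file, `botStatus(groups, "GPTBot")` comes back `"blocked"`, while a bot with no named group falls through to whatever the `*` group says.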
Everything above is automatic. The readiness score is computed from those signals. It is not an LLM index score.
Why ChatGPT, Claude, and Perplexity are manual
There is no public API where you can ask "does ChatGPT know about my site?" and get a real answer. So the demo refuses to fake it. For each LLM channel, you get:
- a one-click open link to the channel
- a copy-ready prompt that asks the model what it already knows about your domain (and to say so explicitly when it does not)
- a status pulled from the real bot policies in your robots.txt
If your robots.txt blocks OAI-SearchBot, that is a concrete OpenAI search-surfacing warning before you even open ChatGPT. If it does not, you open the channel and verify the answer yourself.
Perplexity is special: it does live retrieval, so the open link is a prefilled query and you check whether your site appears as a cited source.
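A channel check like the ones described above boils down to a link plus a prompt. The sketch below is hypothetical, not the demo's code: the prompt wording and the `perplexity.ai/search?q=` query-string format are assumptions.

```typescript
type Channel = "chatgpt" | "claude" | "perplexity";

// Hypothetical helper: build the open link and copy-ready prompt for a channel.
// The Perplexity query-string format is an assumption about its public URL.
function channelCheck(domain: string, channel: Channel) {
  const prompt =
    `What do you already know about the site ${domain}? ` +
    `Answer only from what you already know, and say so explicitly ` +
    `if you know nothing about it.`;
  const openUrl = {
    chatgpt: "https://chatgpt.com/",
    claude: "https://claude.ai/",
    // Perplexity does live retrieval, so the link is a prefilled query.
    perplexity:
      `https://www.perplexity.ai/search?q=${encodeURIComponent(`What is ${domain}?`)}`,
  }[channel];
  return { openUrl, prompt };
}
```

The point of the explicit "say so if you know nothing" clause is to stop the model from improvising an answer from the domain name alone.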
How to read the report
- Score and grade. A weighted combination of the seven automatic checks plus a bonus for not blocking any major AI bot.
- Channel matrix. Five rows: Google Search, ChatGPT, Claude, Perplexity, and Google AI product use. Each row tells you what the demo could verify automatically and what you need to verify yourself.
- Signals. The seven raw checks with the exact evidence strings.
- Gaps and next actions. If anything is missing or blocked, the demo lists the smallest specific change to make.
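The score-and-grade row can be made concrete with a sketch. The weights and grade cutoffs below are illustrative assumptions, not the demo's real numbers; only the shape matches: seven weighted checks plus a bonus for not blocking any major AI bot.

```typescript
// One boolean per automatic check, plus the no-AI-bot-blocked bonus signal.
type Signals = {
  homepageOk: boolean; robotsTxt: boolean; sitemap: boolean; llmsTxt: boolean;
  title: boolean; metaDescription: boolean; jsonLd: boolean;
  noAiBotBlocked: boolean;
};

// Illustrative weights summing to 90, plus a 10-point bonus.
function readinessScore(s: Signals): { score: number; grade: string } {
  const weights: [keyof Signals, number][] = [
    ["homepageOk", 20], ["robotsTxt", 15], ["sitemap", 15], ["llmsTxt", 10],
    ["title", 10], ["metaDescription", 10], ["jsonLd", 10],
  ];
  let score = weights.reduce((sum, [key, w]) => sum + (s[key] ? w : 0), 0);
  if (s.noAiBotBlocked) score += 10; // bonus: no major AI bot is blocked
  const grade =
    score >= 90 ? "A" : score >= 75 ? "B" : score >= 60 ? "C" :
    score >= 40 ? "D" : "F";
  return { score, grade };
}
```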
Try it on a few sites for calibration
- Your own site, to see the baseline.
- A large media site (`nytimes.com`, `bbc.com`) to see what mature configs look like, including which AI bots they explicitly block.
- A small new site, to see where the gaps usually are (no `llms.txt`, no JSON-LD, no AI-bot policy).
What this workflow does not do
- It does not run paid Google API queries. The Google Search row points at the public `site:` results and trusts your eyes.
- It does not call ChatGPT, Claude, or Perplexity APIs. The corpus question is unanswerable through their APIs anyway, since the relevant capability is "what does the chat surface say without being given the URL."
- It does not store your URL. The route validates against SSRF (no localhost, no private IPs), runs the checks, and returns the report.
- It does not treat `Google-Extended` as a Google Search ranking or inclusion signal. Google Search visibility is still about Googlebot and normal Search systems.
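The SSRF validation mentioned above can be sketched as a URL guard. This is a minimal sketch of the idea, not the route's actual code, and it only inspects the URL string: a production guard should also resolve DNS and check the resulting IPs, since a public hostname can point at a private address.

```typescript
// Reject URLs that would make the server fetch itself or a private network.
// Sketch only: checks the hostname literally, does not resolve DNS.
function isSafeTarget(rawUrl: string): boolean {
  let url: URL;
  try { url = new URL(rawUrl); } catch { return false; }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  if (host === "0.0.0.0") return false;
  if (host.startsWith("[")) return false; // reject IPv6 literals (e.g. [::1])
  // IPv4 literal in a loopback, private, or link-local range?
  const m = host.match(/^(\d+)\.(\d+)\.(\d+)\.(\d+)$/);
  if (m) {
    const a = Number(m[1]), b = Number(m[2]);
    if (a === 127 || a === 10) return false;              // 127/8, 10/8
    if (a === 172 && b >= 16 && b <= 31) return false;    // 172.16/12
    if (a === 192 && b === 168) return false;             // 192.168/16
    if (a === 169 && b === 254) return false;             // 169.254/16
  }
  return true;
}
```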