The 60-Second Test: Is Your Site Machine-Readable for AI?

Before you spend a euro on answer-engine optimization, run one test. It takes about a minute and it answers the only question that comes before all the others: when an AI crawler asks for your page, does it receive your content — or an empty shell? If it's the shell, every later tactic (schema, answer-first copy, llms.txt) is wasted, because the bots feeding ChatGPT and Claude don't run JavaScript and will never assemble the page a human's browser does. Here's how to check.

The 60-second test

You want to see the raw HTML your server sends, before any JavaScript runs — the same thing a crawler sees. Three ways, easiest first:

View source (no tools). Open your page, right-click, choose View Page Source (not "Inspect" — Inspect shows the rendered DOM, which lies to you here). Press Ctrl-F and search for a full sentence from your body copy. Is it there?
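curl (one command). Run curl -sL https://yoursite.com/your-page | grep -oF "a distinctive sentence from your page". The -L flag follows redirects and -F makes grep treat the sentence as literal text rather than a pattern. If it prints the sentence, the text is in the response. If it prints nothing, try a shorter phrase before concluding it's missing: grep matches line by line, so a sentence that wraps across lines or contains inline tags in the source won't match as a whole.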
Disable JavaScript. In your browser's settings turn JavaScript off, then reload the page. What's left on screen is roughly what a non-rendering crawler gets. If the page comes up blank or skeletal, your content depends on JavaScript and the crawler isn't getting it.

Use a real sentence of prose, not a heading or a nav label — those sometimes ship in the shell even when the article body doesn't.
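If you'd rather script the check than eyeball it, here is a minimal Python sketch of the same idea, using only the standard library. The function names and the default User-Agent string are made up for illustration; they are not real crawler identifiers, and the regex-based tag stripping is a rough approximation rather than a full HTML parser.

```python
import re
import urllib.request
from html import unescape


def fetch_raw_html(url: str, user_agent: str = "raw-html-check/0.1") -> str:
    """Return the response body as text. No JavaScript is executed."""
    # The User-Agent value is a placeholder, not a real crawler string.
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def _normalize(text: str) -> str:
    """Collapse runs of whitespace and lowercase, so line wraps don't matter."""
    return " ".join(text.split()).lower()


def sentence_in_html(html: str, sentence: str) -> bool:
    """Check whether `sentence` appears in the markup's readable text."""
    # Drop script and style bodies so a sentence that only exists inside
    # a JS bundle or JSON payload is not counted as readable copy.
    without_code = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", html)
    # Remove the remaining tags and decode entities, so inline markup
    # such as <em> does not break the match.
    text = unescape(re.sub(r"(?s)<[^>]+>", "", without_code))
    return _normalize(sentence) in _normalize(text)
```

Against a live page, sentence_in_html(fetch_raw_html("https://yoursite.com/your-page"), "a distinctive sentence from your page") should return True when the copy is in the raw response and False when it only appears after JavaScript runs.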

Reading the result

Two outcomes:

Your paragraph text is in the raw HTML. Good. Your page is server-rendered (SSR) or statically generated (SSG), and a crawler can read it. Move on to structure and content quality below.
You see an empty <div id="root">, a wall of script tags, and none of your copy. Your site is client-rendered. To a renderer-less crawler the page is blank — invisible to the exact systems sending AI traffic. This is the rendering wall, and the fix is server-rendering, which is a property of your CMS, not a patch you write per page.
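The two outcomes above can be roughly told apart in code by counting how much readable text the raw HTML carries outside its script and style tags. The sketch below is a heuristic, not a definitive detector: the 50-word threshold and the names are assumptions chosen for illustration, and a very short but genuinely server-rendered page would be misclassified.

```python
from html.parser import HTMLParser


class _TextCounter(HTMLParser):
    """Counts readable words and <script> tags in an HTML document."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.words = 0
        self.scripts = 0

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.scripts += 1
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Text inside script/style is code, not copy, so it is ignored.
        if not self.skip_depth:
            self.words += len(data.split())


def rendering_report(html: str, min_words: int = 50) -> dict:
    """Summarize the raw HTML; `min_words` is an arbitrary assumed threshold."""
    counter = _TextCounter()
    counter.feed(html)
    counter.close()
    return {
        "words": counter.words,
        "scripts": counter.scripts,
        "likely_client_rendered": counter.words < min_words,
    }
```

A typical single-page-app shell, with an empty root div, a short noscript notice, and a few script tags, comes back with a handful of words and likely_client_rendered set to True. Treat the flag as a prompt to look at the source yourself, not as a verdict.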

One nuance: classic Google may still rank a client-rendered page fine, because Googlebot renders JavaScript. That's what hides the problem — the page looks healthy in Search Console while being unreadable to GPTBot, ClaudeBot, and PerplexityBot.

Beyond the rendering test

Passing the test is the floor, not the ceiling. Once the crawler can read the page, what gets you cited is a second layer:

Structure. Real headings, lists, and schema markup so the facts are extractable, not buried in one undifferentiated block.
Answer-first copy. Lead with the direct answer to the question the page targets; engines lift the clean statement, not the wind-up.
Citable substance. The Princeton GEO study found that adding statistics, quotations, and citations lifted visibility in generative-engine responses by 30–40% on its benchmark. Vague pages don't get quoted.
Stability. Fixed URLs and consistent facts; citation engines churn 40–60% of their sources monthly and favour the stable target.

Notice that none of these is llms.txt, the file everyone wants to talk about. That file is not enough on its own.
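The structure item in the list above is the easiest of the four to spot-check from the raw HTML. Here is a small Python sketch that counts headings and lists and pulls the @type of any JSON-LD schema blocks. The names are illustrative, it only looks at top-level JSON-LD objects (it does not walk an @graph array), and it says nothing about whether the markup is any good, only whether it is present.

```python
import json
from html.parser import HTMLParser


class _StructureScanner(HTMLParser):
    """Collects heading, list, and JSON-LD counts from raw HTML."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.headings = 0
        self.lists = 0
        self.schema_types = []
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in {"h1", "h2", "h3", "h4", "h5", "h6"}:
            self.headings += 1
        elif tag in {"ul", "ol"}:
            self.lists += 1
        elif tag == "script":
            script_type = (dict(attrs).get("type") or "").lower()
            if script_type == "application/ld+json":
                self._in_jsonld = True
                self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                payload = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # malformed JSON-LD is skipped, not reported
            items = payload if isinstance(payload, list) else [payload]
            for item in items:
                if isinstance(item, dict) and "@type" in item:
                    self.schema_types.append(item["@type"])


def structure_report(html: str) -> dict:
    """Return counts of headings and lists plus any JSON-LD @type values."""
    scanner = _StructureScanner()
    scanner.feed(html)
    scanner.close()
    return {
        "headings": scanner.headings,
        "lists": scanner.lists,
        "schema_types": scanner.schema_types,
    }
```

Run it on the raw response, not on the rendered DOM. A page that reports zero headings, zero lists, and no schema types is the undifferentiated block described above, even if it passes the rendering test.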

The short version

Run the test before you optimize anything. View source, curl the URL, or disable JavaScript, then look for a real sentence of your body copy in the raw HTML. If it's there, you're readable; if it's missing, you're a blank page to every crawler that doesn't render JavaScript. The rendering test is the floor; structure, answer-first copy, citable facts, and stable URLs are how you turn readable into cited.

If you failed the test, read why AI crawlers can't read your site and why your CMS is the bottleneck. agntcms serves real server-rendered HTML by default and is open source on GitHub.

github.com/agntcms