How AI is changing SEO: lessons from a billion crawler requests


Most AI crawlers don't execute JavaScript. We tested the major ones (ChatGPT, Claude, and others), and the results were consistent: none of them render client-side content. If your Next.js site ships critical pages as JavaScript-dependent SPAs, those pages are inaccessible to the systems shaping how people discover information.

The dominant advice focuses on content structure: clearer answers, schema markup, FAQ blocks. That advice isn't wrong. It's just incomplete. The rise of AI-powered search interfaces means that technical delivery matters as much as content quality.

TL;DR: Major AI crawlers (including ChatGPT and Claude) don't execute JavaScript. If your Next.js site renders content client-side, it's mostly invisible in AI search regardless of how high-quality your content is. Server rendering plus structured, extractable content is the baseline for showing up in AI-generated answers.

Why search changed

AI-first interfaces like ChatGPT and Google's AI Overviews now answer questions before users click a link, moving beyond the traditional model of ranking in a list of links that send organic traffic to your site. The shift from "ten blue links" to answers served directly on search engine results pages (SERPs) has decoupled impressions from clicks, especially for informational queries.

Google Search itself now incorporates generative AI to produce summaries at the top of results, often satisfying the user's need without a single outbound click. This pattern extends across platforms: users issue search queries or prompts and receive synthesized answers rather than a curated list of links to visit.

The fundamentals still matter: high-quality content, good site architecture, backlinks, and technical performance. But how content is rendered now matters just as much, because it determines whether AI systems can access your content in the first place.

From ranking to citation

Traditional SEO competed for position. The new game is becoming the source the model extracts from. When ChatGPT answers a question about deployment strategies or caching patterns, the web pages it cites shape the answer. Sites that are rendered client-side aren't part of that conversation.

Search visibility now depends on whether AI systems can access, parse, and attribute your content. That requires two things working together:

  • Crawlable content so AI systems can fetch and read your pages

  • Extractable structure so models can interpret and cite the content

Most of the current discourse emphasizes the second requirement. But the first is where many teams unknowingly fail.

What LLMs actually reward

LLMs interpret meaning, which changes which content signals matter. These AI models are trained on massive text datasets, enabling them to understand semantic relationships between concepts rather than just matching keywords or counting term frequency.

LLMs favor content that explains things clearly, deeply, and with structure. When you query Claude about caching strategies, a single comprehensive guide outperforms five thin pages targeting the same keyword.

Traditional SEO vs. LLM optimization

Traditional SEO tactics like keyword research and link building still matter, but they're not enough. AI models prioritize semantic depth and clear structure over topical relevance and domain authority.

The problem: even perfectly structured content is invisible if AI crawlers can't access it. Most teams focus on content optimization while missing the technical rendering requirement that determines whether models see their content at all.

Why AI crawlers can't render JavaScript

We tested the major AI crawlers. None render JavaScript. We verified ChatGPT and Claude directly: neither executes JavaScript, so any content that loads via client-side JS is inaccessible to them.

Googlebot renders JavaScript (eventually), but AI crawlers operate differently. The assumption that "if Google can index it, AI search can too" is false.

What crawlers actually see

When an AI crawler fetches a client-rendered page, it receives the initial HTML response: typically a shell with a <div id="root"></div> and script tags. Without JavaScript execution, that's all it ever sees. Your carefully crafted content, loaded dynamically after hydration, never reaches the crawler.

Here's a typical client-rendered React component:

// Content below is inaccessible to AI crawlers
import { useState, useEffect } from 'react';

export default function ProductPage() {
  const [product, setProduct] = useState(null);

  useEffect(() => {
    fetch('/api/product')
      .then((res) => res.json())
      .then(setProduct);
  }, []);

  if (!product) return <div>Loading...</div>;
  return <article>{product.description}</article>;
}

The crawler sees Loading... and nothing else. Your product description (the content you want cited) is hidden behind a JavaScript execution barrier the crawler won't cross.

What this looks like in practice

When ChatGPT's crawler fetches a client-rendered page, here's the HTML it receives:

<!-- What the AI crawler sees -->
<!DOCTYPE html>
<html>
  <head><title>Product Page</title></head>
  <body>
    <div id="root">Loading...</div>
    <script src="/bundle.js"></script>
  </body>
</html>

Compare that to a server-rendered page:

<!-- What the AI crawler sees -->
<!DOCTYPE html>
<html>
  <head><title>Product Page</title></head>
  <body>
    <article>
      <h1>Edge Caching for Dynamic Content</h1>
      <p>ISR regenerates static pages after a specified interval
      without requiring a full rebuild...</p>
    </article>
  </body>
</html>

The difference determines whether your content can be surfaced in answer engines.
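You can approximate that difference with a crude text extractor that, like a non-rendering crawler, drops script tags and markup from the raw HTML. This is an illustrative sketch, not how any particular crawler is implemented:

```javascript
// Approximate what a non-rendering crawler can read: drop scripts and
// styles, strip the remaining tags, and collapse whitespace.
function visibleText(html) {
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, '')
    .replace(/<style[\s\S]*?<\/style>/gi, '')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}

const clientShell = `
  <html><head><title>Product Page</title></head>
  <body><div id="root">Loading...</div>
  <script src="/bundle.js"></script></body></html>`;

const serverRendered = `
  <html><head><title>Product Page</title></head>
  <body><article><h1>Edge Caching for Dynamic Content</h1>
  <p>ISR regenerates static pages after a specified interval.</p>
  </article></body></html>`;

console.log(visibleText(clientShell));    // → "Product Page Loading..."
console.log(visibleText(serverRendered)); // → includes the full article text
```

Run against the client shell, the extractor recovers only the title and a loading placeholder; run against the server-rendered page, it recovers the entire article.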

Server rendering for AI visibility

This constraint leads to a clear action: use Server-Side Rendering (SSR), Static Site Generation (SSG), or Incremental Static Regeneration (ISR) to expose static HTML. Because AI crawlers fetch but do not execute JavaScript, these rendering strategies become visibility prerequisites.

Server rendering also benefits user experience directly: faster first paints, reduced layout shift, and better accessibility for users on slow connections or older devices.

When to use each strategy

Each rendering approach below delivers static HTML on first request. The difference is when that HTML gets generated:

  • SSG (Static Site Generation): Best for content that rarely changes. Pages are built at deploy time. Use for documentation, marketing pages, and blog posts.

  • ISR (Incremental Static Regeneration): Best for content that updates periodically but doesn't need real-time accuracy. Pages are statically generated but revalidated on a schedule. Use for product catalogs, pricing pages, and content that changes daily or weekly.

  • SSR (Server-Side Rendering): Best for highly dynamic or personalized content. Pages render on each request. Use for user-specific dashboards, real-time data displays, and pages where staleness is unacceptable.

Next.js supports all three natively. The right rendering choice depends on your content's update frequency and accuracy requirements, not on AI crawler needs specifically. But making the wrong choice (defaulting to client-side rendering for critical content) removes you from AI-driven discovery entirely.

Implementation pattern

Here's how a product page looks when server-rendered in Next.js:

// Content below is accessible to AI crawlers
// (dynamic routes also need getStaticPaths to enumerate params)
export async function getStaticProps({ params }) {
  const product = await fetchProduct(params.id);
  return {
    props: { product },
    revalidate: 3600 // ISR: regenerate at most once per hour
  };
}

export default function ProductPage({ product }) {
  return (
    <article>
      <h1>{product.name}</h1>
      <p>{product.description}</p>
    </article>
  );
}

The crawler receives the full HTML with product content included. No JavaScript execution required.

Structure content for extraction

Server rendering makes content accessible to AI crawlers. The next requirement is making it extractable. Content needs structure so models can parse and cite it. AI systems parse your content to find answers, attribute claims, and construct responses, so how you structure that content directly affects whether it gets cited.

Formatting for LLM extraction

According to a recent post from AirOps, these patterns help models parse and cite your content:

Heading hierarchy and semantic clarity: Use one H1, then nest H2s, H3s, and H4s logically. Write headings as standalone questions that reflect real user queries. AirOps found that pages with clean heading hierarchy and aligned schema earned 2.8× higher AI citation rates than poorly structured pages.

Answer-first placement: Open every section with a sentence that directly answers the heading, then use the rest of the section to explain or support that answer. Answer engines prioritize content that makes answers immediately extractable rather than requiring them to hunt through paragraphs.

Short, focused paragraphs: Keep paragraphs to 2-4 sentences. This reduces ambiguity during extraction and makes it easier for answer engines to summarize accurately without partial reuse or misquotation. Dense or meandering paragraphs decrease citation reliability.

One idea per section, one idea per paragraph: Treat each section as a standalone answer block that can be lifted and reused without surrounding context. Clear visual separation between concepts improves extractability.

Consistent terminology: Pick one term per concept and stick to it. Don't rotate through synonyms. Consistency helps answer engines associate your page with a specific concept and improves entity recognition.

FAQ formatting: Format FAQ questions as H3s with answers immediately following. Each answer should stand alone without relying on surrounding context. FAQs align naturally with how users phrase questions in AI Search.

Lists and tables for structure, not decoration: Use lists and tables when structure improves clarity or comparison—especially when constraints, options, or attributes need to remain intact during extraction. Avoid decorative formatting.

Schema markup as a supporting signal: Pages using three or more relevant schema types showed ~13% higher likelihood of being cited in the AirOps research. Schema clarifies intent but can't compensate for vague language or unclear answers—answer engines evaluate visible content first.

E-E-A-T signals remain important, but AirOps' data shows that structural clarity is the primary determinant of whether content gets extracted and cited. Even strong expertise goes uncited if the surrounding structure makes intent hard to infer.
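To make the schema point concrete, here is a minimal FAQPage block using the schema.org vocabulary, built as a JavaScript object for illustration. The question and answer text are placeholders drawn from this article; serialize the object into a `<script type="application/ld+json">` tag in the page head:

```javascript
// Minimal FAQPage structured data (schema.org vocabulary).
// Each Question/Answer pair should mirror visible on-page content.
const faqSchema = {
  '@context': 'https://schema.org',
  '@type': 'FAQPage',
  mainEntity: [
    {
      '@type': 'Question',
      name: 'Do AI crawlers execute JavaScript?',
      acceptedAnswer: {
        '@type': 'Answer',
        text: 'Major AI crawlers fetch HTML but do not execute JavaScript, so client-rendered content is invisible to them.',
      },
    },
  ],
};

const jsonLd = JSON.stringify(faqSchema, null, 2);
console.log(jsonLd);
```

Keep the JSON-LD in sync with the visible FAQ section; as noted above, schema clarifies intent but cannot compensate for vague visible content.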

Two paths to AI visibility: training vs. real-time retrieval

AI discovery operates through two distinct mechanisms:

Path 1: Training data (historical, infrequent)

  • Content is crawled during training data collection

  • Selected content enters foundation model training datasets

  • Influences the model's base knowledge and capabilities

  • Does not produce citations or referrals

Path 2: Retrieval and citation (real-time, ongoing)

  • Content is crawled and indexed for real-time search

  • Answer engines retrieve relevant pages when generating responses

  • Retrieved content is cited as a source

  • Citations drive referrals to your site

Most AI-generated answers for complex prompts rely on real-time retrieval (Path 2). When ChatGPT, Perplexity, or Google's AI Overviews cite your content, they're pulling it from search indexes, not from what the model learned during training.

Your content moves through these stages for retrieval-based citations:

  1. Crawl: Answer engines and search systems can access your content

  2. Index: Content enters retrieval databases with structural signals intact

  3. Retrieve: Your page matches relevant queries and gets pulled into context

  4. Cite: The model references your content when generating answers

  5. Refer: Users click through citations to your site

Training data matters for foundational knowledge, but retrieval determines citations. Optimizing for AI discovery means ensuring your content is crawlable, structurally clear, and authoritative enough to be selected during real-time retrieval.

Optimizing Next.js for AI search

Combining server rendering with structured content is the baseline for showing up in AI-generated answers. These SEO strategies address the technical delivery layer that most content-only guidance overlooks. Here's the practical checklist:

Audit rendering strategy

The rendering strategy you choose determines whether AI systems can access your content at all. Here's how to audit and adjust:

  • Use getStaticProps for stable content (documentation, marketing pages)

  • Use getStaticProps with revalidate for periodically updated content (product pages, blog posts)

  • Use getServerSideProps for dynamic content that must be current on every request

Identify which pages render client-side and migrate critical content to SSR, SSG, or ISR.

Structure for extraction

Content structure affects whether models can parse and cite your claims. Review formatting on high-value pages:

  • Replace <div> wrappers with semantic elements where appropriate

  • Ensure heading hierarchy is clean (one <h1>, logical <h2>/<h3> structure)

  • Lead sections with direct answers before elaboration

  • Use lists for parallel items, tables for comparisons
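Part of this structural audit can be automated. As a rough sketch (operating on a pre-extracted list of heading levels, not a substitute for a real HTML parser), a check that flags multiple H1s or skipped heading levels might look like:

```javascript
// Flag common heading-hierarchy problems, given the page's heading
// levels in document order (e.g. [1, 2, 3, 2] for h1, h2, h3, h2).
function headingIssues(levels) {
  const issues = [];
  if (levels.filter((l) => l === 1).length !== 1) {
    issues.push('expected exactly one h1');
  }
  for (let i = 1; i < levels.length; i++) {
    if (levels[i] > levels[i - 1] + 1) {
      issues.push(`skipped level: h${levels[i - 1]} -> h${levels[i]}`);
    }
  }
  return issues;
}

console.log(headingIssues([1, 2, 3, 3, 2])); // → [] (clean hierarchy)
console.log(headingIssues([1, 1, 4]));       // → two issues: duplicate h1, h1 -> h4 skip
```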

Review bot policies

Bot policies determine which AI systems can access your content. Examine your robots.txt and CDN rules:

  • Decide which AI systems you want visibility in

  • Use specific user-agent rules rather than blanket blocks

  • Monitor server logs for AI crawler traffic to understand current access patterns
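For example, a robots.txt that allows answer-engine crawlers (which drive citations and referrals) while opting out of training-data collection might look like the sketch below. Crawler user-agent names change over time, so verify them against each vendor's current documentation before relying on this:

```
# Allow retrieval/citation crawlers (these drive referrals)
User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Opt out of training-data collection
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```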

Monitor for AI referrers

Measuring visibility in AI systems is still an evolving challenge. No reliable dashboard shows, across all major models, whether your content appears in answers or is embedded in training data. But you can track proxy metrics:

  • AI-specific referrers in analytics (chatgpt.com, claude.ai)

  • Mention tracking for brand and product names in AI-generated content

  • Citation appearances when querying AI systems directly about your domain
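The referrer check above can be as simple as matching hostnames in your analytics pipeline. A sketch, with an assumed and deliberately non-exhaustive hostname list:

```javascript
// Map a referrer URL to an AI source, or null if it is not one.
// The hostname list is illustrative, not exhaustive.
const AI_REFERRERS = ['chatgpt.com', 'claude.ai', 'perplexity.ai'];

function aiSource(referrer) {
  try {
    const host = new URL(referrer).hostname.replace(/^www\./, '');
    return AI_REFERRERS.find((h) => host === h || host.endsWith('.' + h)) ?? null;
  } catch {
    return null; // missing or malformed referrer
  }
}

console.log(aiSource('https://chatgpt.com/'));          // → "chatgpt.com"
console.log(aiSource('https://www.google.com/search')); // → null
```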

AI tools like Profound and Peec can supplement this monitoring by tracking brand mentions and citations in target prompts. While not perfect, these tools take a data-driven approach to AI search monitoring.

The two-layer model

Understanding how AI is changing SEO means recognizing both layers:

| Layer | Traditional SEO | AI/LLM adaptation |
| --- | --- | --- |
| Content | Quality, depth, authority | Extractable structure, answer-first formatting |
| Delivery | Crawlable URLs, fast performance | Server-rendered HTML, bot access policies |
| Signals | Backlinks, engagement, keywords | Semantic clarity, schema markup, citation eligibility |
| Discovery | Organic search via Google results | AI-powered answers via ChatGPT, Perplexity, and Claude |

The path forward for AI search engine optimization

The shift to AI-mediated discovery changes how content gets found, but the fundamentals remain consistent: make your content accessible and clearly structured. Here's what matters:

  • Server-render your content so AI systems can access it without executing JavaScript

  • Structure for extraction using clear headings, answer-first paragraphs, and semantic HTML

  • Set explicit bot policies that align with your visibility goals

  • Monitor AI referrers to understand which systems are citing your content

AI discovery isn't replacing traditional or technical SEO; it's adding a new retrieval layer. Sites that get cited are those that make their content machine-readable and extractable. The technical foundation you build now determines whether your content surfaces in AI-generated answers going forward.