AI agents (Claude Code, ChatGPT, Cursor, Copilot) are a primary consumer of developer documentation. They don't need navigation chrome, dark mode toggles, or animated code blocks. They need:
- Discoverable content: Where the docs are and what they cover
- Clean retrieval: Markdown, not a DOM tree
- Structured metadata: Version, last updated, and canonical URL
- Tool access: Search and fetch via protocol, not scraping
Most content platforms serve agents the same HTML page they serve humans. The agent then spends tokens stripping tags, guessing at content boundaries, and hoping the important information survives extraction. The result: hallucinated APIs, outdated code examples, and missed context.
Agents interact with content in three layers: they discover what exists, retrieve clean content, and call tools for precision queries. Some also index content at scale along the way.
Help agents find what exists before they fetch anything.
| Requirement | Implementation | Purpose |
|---|---|---|
| `/llms.txt` | Curated markdown index at the site root with section headings and links | Entry point for agents and humans pasting docs into an IDE |
| `sitemap.xml` | Standard sitemap with accurate `lastmod` dates | Freshness tracking for agents that monitor content changes |
| `sitemap.md` | Semantic sitemap served as markdown with section headings, categories, and page descriptions | High-level orientation for LLM-assisted navigation and contributor onboarding |
| `robots.txt` | Documented stance on agent access (which bots can crawl, which pages are off-limits) | Crawl control and access transparency |
| JSON-LD structured data | Title, description, canonical URL, and breadcrumbs on every HTML page | Lets agents that parse HTML understand page type and relationships without traversing the DOM |
Example JSON-LD for a docs page:
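A sketch of what this could look like, using schema.org's `TechArticle` and `BreadcrumbList` types in a `@graph`; the page, descriptions, and `example.com` URLs are placeholders, not a real site's markup:

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "TechArticle",
      "headline": "Functions",
      "description": "Run server-side code without managing infrastructure.",
      "url": "https://example.com/docs/functions",
      "dateModified": "2025-01-15"
    },
    {
      "@type": "BreadcrumbList",
      "itemListElement": [
        { "@type": "ListItem", "position": 1, "name": "Docs", "item": "https://example.com/docs" },
        { "@type": "ListItem", "position": 2, "name": "Functions", "item": "https://example.com/docs/functions" }
      ]
    }
  ]
}
```

This block goes in a `<script type="application/ld+json">` tag in the page `<head>`, so agents can read it without rendering the page.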
This is the highest-impact layer. When an agent fetches a docs page, it should get markdown, not HTML with embedded scripts.
| Mechanism | How it works |
|---|---|
| Content negotiation | Agents that send `Accept: text/markdown` receive markdown with `Content-Type: text/markdown; charset=utf-8` and a `Vary: Accept` header. Claude Code sends this header natively. |
| Agent auto-rewrite | Detected AI agents receive markdown automatically, even without an `Accept: text/markdown` header. Detection uses user-agent matching, RFC 9421 `Signature-Agent` headers, and a heuristic fallback for unknown agents. |
| `.md` endpoints | Append `.md` to any docs URL to get the markdown version directly. Works without custom headers. Useful for pasting docs into a chat or IDE. |
| Rich frontmatter | Every markdown response includes metadata for accurate citations. |
| HTML alternate link | Adding `<link rel="alternate" type="text/markdown">` in the HTML `<head>` tells agents a markdown version exists. |
Example request and response:
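An illustrative exchange (the path, domain, and frontmatter values are placeholders; the header and field names follow the table above):

```
GET /docs/functions HTTP/1.1
Accept: text/markdown

HTTP/1.1 200 OK
Content-Type: text/markdown; charset=utf-8
Vary: Accept

---
title: Functions
canonical_url: https://example.com/docs/functions
last_updated: 2025-01-15
---

# Functions

Run server-side code without managing infrastructure.
```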
Without `canonical_url` and `last_updated`, agents can't link back to the source or judge whether their information is stale.
The `Vary: Accept` header tells CDNs to cache HTML and markdown responses separately for the same URL. Without it, an agent's markdown response could be served to a browser, or vice versa.
Content negotiation and .md endpoints cover agents that explicitly request markdown. Many agents don't. They send a standard GET with no special headers. If your platform can detect that the request comes from an AI agent, serve markdown anyway.
Vercel uses a three-layer detection approach:
- User-agent matching: Check against a maintained list of known AI agent strings (Claude, ChatGPT, GPTBot, Cursor, Copilot, and others). This is the most reliable signal.
- Signature-Agent header: The RFC 9421 standard header, used by ChatGPT's agent. Validate against known AI service domains.
- Heuristic fallback: If the request is missing the `sec-fetch-mode` header (which real browsers always send) and the user-agent matches a bot-like pattern, treat it as an agent. This catches unknown agents at the cost of occasional false positives, which are low-harm: serving markdown to a non-AI bot has no negative effect.
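The three checks above can be sketched as a single predicate. This is a minimal illustration, not Vercel's implementation: the `isAIAgent` name, the pattern lists, and the known-domain list are all assumptions for the example.

```typescript
// Illustrative pattern and domain lists; a real deployment would maintain
// these centrally and keep them current.
const KNOWN_AGENT_PATTERNS = [/claude/i, /chatgpt/i, /gptbot/i, /cursor/i, /copilot/i];
const KNOWN_AI_DOMAINS = ["chatgpt.com", "openai.com"];

interface RequestHeaders {
  "user-agent"?: string;
  "signature-agent"?: string;
  "sec-fetch-mode"?: string;
}

function isAIAgent(headers: RequestHeaders): boolean {
  const ua = headers["user-agent"] ?? "";

  // Layer 1: known user-agent strings (the most reliable signal).
  if (KNOWN_AGENT_PATTERNS.some((p) => p.test(ua))) return true;

  // Layer 2: RFC 9421 Signature-Agent header, validated against known AI domains.
  const sigAgent = headers["signature-agent"];
  if (sigAgent && KNOWN_AI_DOMAINS.some((d) => sigAgent.includes(d))) return true;

  // Layer 3: heuristic fallback. Real browsers always send sec-fetch-mode,
  // so a missing header plus a bot-like user-agent suggests an unknown agent.
  if (!headers["sec-fetch-mode"] && /bot|crawler|spider|fetch/i.test(ua)) return true;

  return false;
}
```

Ordering matters: the cheap, high-precision checks run first, and the lossy heuristic only fires when the explicit signals are absent.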
This means agents get markdown on every valid docs request, regardless of how they make it.
When an agent requests a page that doesn't exist, an HTML 404 page is useless. The agent can't parse it, and the conversation stalls. Instead, return a markdown response with actionable content:
- Search your docs index for pages similar to the requested path.
- If a result scores above a high-confidence threshold (0.99+), redirect the agent to the correct page.
- Otherwise, return markdown-formatted suggestions listing the closest matches as links.
- Return a 200 status, not 404. Agents need content they can act on.
- Append a sitemap footer to every markdown response so agents can browse the full index if they hit a dead end.
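The steps above could be sketched as follows. The `notFoundResponse` helper, the `Match` shape, and the footer text are illustrative assumptions; only the 0.99 threshold and the 200-over-404 choice come from the text.

```typescript
// A search hit from the docs index, with a similarity score in [0, 1].
interface Match { path: string; title: string; score: number }

function notFoundResponse(
  requestedPath: string,
  matches: Match[], // assumed sorted best-first by the search index
): { status: number; location?: string; body?: string } {
  const best = matches[0];

  // High-confidence match: redirect the agent to the correct page.
  if (best && best.score >= 0.99) {
    return { status: 302, location: best.path };
  }

  // Otherwise: return 200 with markdown suggestions the agent can act on.
  const suggestions = matches.map((m) => `- [${m.title}](${m.path})`).join("\n");
  const body = [
    `# Page not found: ${requestedPath}`,
    "",
    "Did you mean one of these?",
    "",
    suggestions,
    "",
    "See /sitemap.md for the full index.",
  ].join("\n");
  return { status: 200, body };
}
```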
The most sophisticated agents don't scrape at all. They call tools.
| Tool | Description |
|---|---|
| MCP servers | Search docs, fetch specific pages, and list available content through a standard protocol. The Vercel MCP server covers vercel.com/docs, nextjs.org, and the AI SDK docs. |
| Search APIs | Return structured JSON results with canonical URLs, snippets, and freshness metadata. |
| AI Chat | Conversational interface on docs sites backed by doc-aware tools: `search_docs`, `get_doc_page`, and `list_docs`. Available to both humans and agents. |
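As a hedged illustration, a structured search result might look like this; the field names are assumptions chosen to match the metadata described above, not a documented API shape:

```json
{
  "results": [
    {
      "title": "Functions",
      "canonical_url": "https://example.com/docs/functions",
      "snippet": "Run server-side code without managing infrastructure.",
      "last_updated": "2025-01-15"
    }
  ]
}
```

The point is that the agent gets a citable URL and a freshness signal per result, instead of a rendered results page to scrape.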
For the full scoring rubric (0–100) and detailed verification steps, see the Agent-Readability Spec.
Discovery
- Serve `/llms.txt` with a curated H1 + H2 index of your content
- Publish `sitemap.xml` with accurate `lastmod` dates
- Serve `/sitemap.md` with a semantic, markdown-formatted sitemap describing docs sections and pages
- Document agent access policy in `robots.txt`
- Add JSON-LD structured data (title, description, canonical URL, and breadcrumbs) to every page

Retrieval
- Return markdown for `Accept: text/markdown` with a `Vary: Accept` header
- Generate `.md` endpoints for all content pages
- Include frontmatter metadata (`title`, `canonical_url`, `last_updated`) in every markdown response
- Add `<link rel="alternate" type="text/markdown">` to HTML pages
- Detect AI agents (user-agent matching, `Signature-Agent` header, heuristic fallback) and serve markdown automatically
- Verify by appending `.md` to any page URL and confirming you get clean markdown with frontmatter

Tool access
- Expose search via MCP server or search API
- Add a `SKILLS.md` or `AGENTS.md` file with install, config, and usage instructions for coding agents
To get started, run through the checklist above, then score your site against the Agent-Readability Spec.
- Agent-Readability Spec: Full checklist and 0–100 scoring rubric
- Vercel MCP server: Query Vercel, Next.js, and AI SDK docs via MCP
- llms.txt specification: The llms.txt standard