AI agents (Claude Code, ChatGPT, Cursor, Copilot) are a primary consumer of developer documentation. They don't need navigation chrome, dark mode toggles, or animated code blocks. They need:
- Discoverable content: Where the docs are and what they cover
- Clean retrieval: Markdown, not a DOM tree
- Structured metadata: Version, last updated, and canonical URL
- Tool access: Search and fetch via protocol, not scraping
Most content platforms serve agents the same HTML page they serve humans. The agent then spends tokens stripping tags, guessing at content boundaries, and hoping the important information survives extraction. The result: hallucinated APIs, outdated code examples, and missed context.
Agents interact with content in three layers: they discover what exists, retrieve clean content, and call tools for precision queries. Some also index content at scale along the way.
Help agents find what exists before they fetch anything.
| Requirement | Implementation | Purpose |
|---|---|---|
| `/llms.txt` | Curated markdown index at the site root with section headings and links | Entry point for agents and humans pasting docs into an IDE |
| `sitemap.xml` | Standard sitemap with accurate `lastmod` dates | Freshness tracking for agents that monitor content changes |
| `sitemap.md` | Semantic sitemap served as markdown with section headings, categories, and page descriptions | High-level orientation for LLM-assisted navigation and contributor onboarding |
| `robots.txt` | Documented stance on agent access (which bots can crawl, which pages are off-limits) | Crawl control and access transparency |
| JSON-LD structured data | Title, description, canonical URL, and breadcrumbs on every HTML page | Lets agents that parse HTML understand page type and relationships without traversing the DOM |
Example JSON-LD for a docs page:
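A sketch of what this could look like, using schema.org's `TechArticle` and `BreadcrumbList` types in a `@graph`; the page, descriptions, and `example.com` URLs are placeholders, not a real site's markup:

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "TechArticle",
      "headline": "Functions",
      "description": "Run server-side code without managing infrastructure.",
      "url": "https://example.com/docs/functions",
      "dateModified": "2025-01-15"
    },
    {
      "@type": "BreadcrumbList",
      "itemListElement": [
        { "@type": "ListItem", "position": 1, "name": "Docs", "item": "https://example.com/docs" },
        { "@type": "ListItem", "position": 2, "name": "Functions", "item": "https://example.com/docs/functions" }
      ]
    }
  ]
}
```

This block goes in a `<script type="application/ld+json">` tag in the page `<head>`, so agents can read it without rendering the page.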
This is the highest-impact layer. When an agent fetches a docs page, it should get markdown, not HTML with embedded scripts.
| Mechanism | How it works |
|---|---|
| Content negotiation | Agents that send `Accept: text/markdown` receive markdown with `Content-Type: text/markdown; charset=utf-8` and a `Vary: Accept` header. Claude Code sends this header natively. |
| Agent auto-rewrite | Detected AI agents receive markdown automatically, even without an `Accept: text/markdown` header. Detection uses user-agent matching, RFC 9421 `Signature-Agent` headers, and a heuristic fallback for unknown agents. |
| `.md` endpoints | Append `.md` to any docs URL to get the markdown version directly. Works without custom headers. Useful for pasting docs into a chat or IDE. |
| Rich frontmatter | Every markdown response includes metadata for accurate citations. |
| HTML alternate link | Adding `<link rel="alternate" type="text/markdown">` in the HTML `<head>` tells agents a markdown version exists. |
Example request and response:
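An illustrative exchange (the path, domain, and frontmatter values are placeholders; the header and field names follow the table above):

```
GET /docs/functions HTTP/1.1
Accept: text/markdown

HTTP/1.1 200 OK
Content-Type: text/markdown; charset=utf-8
Vary: Accept

---
title: Functions
canonical_url: https://example.com/docs/functions
last_updated: 2025-01-15
---

# Functions

Run server-side code without managing infrastructure.
```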
Without `canonical_url` and `last_updated`, agents can't link back to the source or judge whether their information is stale.
The `Vary: Accept` header tells CDNs to cache HTML and markdown responses separately for the same URL. Without it, an agent's markdown response could be served to a browser, or vice versa.
Content negotiation and .md endpoints cover agents that explicitly request markdown. Many agents don't. They send a standard GET with no special headers. If your platform can detect that the request comes from an AI agent, serve markdown anyway.
Vercel uses a three-layer detection approach:
- User-agent matching: Check against a maintained list of known AI agent strings (Claude, ChatGPT, GPTBot, Cursor, Copilot, and others). This is the most reliable signal.
- Signature-Agent header: The RFC 9421 standard header, used by ChatGPT's agent. Validate against known AI service domains.
- Heuristic fallback: If the request is missing the `sec-fetch-mode` header (which real browsers always send) and the user-agent matches a bot-like pattern, treat it as an agent. This catches unknown agents at the cost of occasional false positives, which are low-harm: serving markdown to a non-AI bot has no negative effect.
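The three checks above can be sketched as a single predicate. This is a minimal illustration, not Vercel's implementation: the `isAIAgent` name, the pattern lists, and the known-domain list are all assumptions for the example.

```typescript
// Illustrative pattern and domain lists; a real deployment would maintain
// these centrally and keep them current.
const KNOWN_AGENT_PATTERNS = [/claude/i, /chatgpt/i, /gptbot/i, /cursor/i, /copilot/i];
const KNOWN_AI_DOMAINS = ["chatgpt.com", "openai.com"];

interface RequestHeaders {
  "user-agent"?: string;
  "signature-agent"?: string;
  "sec-fetch-mode"?: string;
}

function isAIAgent(headers: RequestHeaders): boolean {
  const ua = headers["user-agent"] ?? "";

  // Layer 1: known user-agent strings (the most reliable signal).
  if (KNOWN_AGENT_PATTERNS.some((p) => p.test(ua))) return true;

  // Layer 2: RFC 9421 Signature-Agent header, validated against known AI domains.
  const sigAgent = headers["signature-agent"];
  if (sigAgent && KNOWN_AI_DOMAINS.some((d) => sigAgent.includes(d))) return true;

  // Layer 3: heuristic fallback. Real browsers always send sec-fetch-mode,
  // so a missing header plus a bot-like user-agent suggests an unknown agent.
  if (!headers["sec-fetch-mode"] && /bot|crawler|spider|fetch/i.test(ua)) return true;

  return false;
}
```

Ordering matters: the cheap, high-precision checks run first, and the lossy heuristic only fires when the explicit signals are absent.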
This means agents get markdown on every valid docs request, regardless of how they make it.
When an agent requests a page that doesn't exist, an HTML 404 page is useless. The agent can't parse it, and the conversation stalls. Instead, return a markdown response with actionable content:
- Search your docs index for pages similar to the requested path.
- If a result scores above a high-confidence threshold (0.99+), redirect the agent to the correct page.
- Otherwise, return markdown-formatted suggestions listing the closest matches as links.
- Return a 200 status, not 404. Agents need content they can act on.
- Append a sitemap footer to every markdown response so agents can browse the full index if they hit a dead end.
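The steps above could be sketched as follows. The `notFoundResponse` helper, the `Match` shape, and the footer text are illustrative assumptions; only the 0.99 threshold and the 200-over-404 choice come from the text.

```typescript
// A search hit from the docs index, with a similarity score in [0, 1].
interface Match { path: string; title: string; score: number }

function notFoundResponse(
  requestedPath: string,
  matches: Match[], // assumed sorted best-first by the search index
): { status: number; location?: string; body?: string } {
  const best = matches[0];

  // High-confidence match: redirect the agent to the correct page.
  if (best && best.score >= 0.99) {
    return { status: 302, location: best.path };
  }

  // Otherwise: return 200 with markdown suggestions the agent can act on.
  const suggestions = matches.map((m) => `- [${m.title}](${m.path})`).join("\n");
  const body = [
    `# Page not found: ${requestedPath}`,
    "",
    "Did you mean one of these?",
    "",
    suggestions,
    "",
    "See /sitemap.md for the full index.",
  ].join("\n");
  return { status: 200, body };
}
```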
The most sophisticated agents don't scrape at all. They call tools.
| Tool | Description |
|---|---|
| MCP servers | Search docs, fetch specific pages, and list available content through a standard protocol. The Vercel MCP server covers vercel.com/docs, nextjs.org, and the AI SDK docs. |
| Search APIs | Return structured JSON results with canonical URLs, snippets, and freshness metadata. |
| AI Chat | Conversational interface on docs sites backed by doc-aware tools: `search_docs`, `get_doc_page`, and `list_docs`. Available to both humans and agents. |
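As a hedged illustration, a structured search result might look like this; the field names are assumptions chosen to match the metadata described above, not a documented API shape:

```json
{
  "results": [
    {
      "title": "Functions",
      "canonical_url": "https://example.com/docs/functions",
      "snippet": "Run server-side code without managing infrastructure.",
      "last_updated": "2025-01-15"
    }
  ]
}
```

The point is that the agent gets a citable URL and a freshness signal per result, instead of a rendered results page to scrape.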
For the full scoring rubric (0–100) and detailed verification steps, see the Agent-Readability Spec.
Discovery
- Serve `/llms.txt` with a curated H1 + H2 index of your content
- Publish `sitemap.xml` with accurate `lastmod` dates
- Serve `/sitemap.md` with a semantic, markdown-formatted sitemap describing docs sections and pages
- Document agent access policy in `robots.txt`
- Add JSON-LD structured data (title, description, canonical URL, and breadcrumbs) to every page

Retrieval
- Return markdown for `Accept: text/markdown` with a `Vary: Accept` header
- Generate `.md` endpoints for all content pages
- Include frontmatter metadata (`title`, `canonical_url`, `last_updated`) in every markdown response
- Add `<link rel="alternate" type="text/markdown">` to HTML pages
- Detect AI agents (user-agent matching, `Signature-Agent` header, heuristic fallback) and serve markdown automatically
- Verify by appending `.md` to any page URL and confirming you get clean markdown with frontmatter

Tool access
- Expose search via MCP server or search API
- Add a `SKILLS.md` or `AGENTS.md` file with install, config, and usage instructions for coding agents
To get started, run through the checklist above, then score your site against the Agent-Readability Spec.
- Agent-Readability Spec: Full checklist and 0–100 scoring rubric
- Vercel MCP server: Query Vercel, Next.js, and AI SDK docs via MCP
- llms.txt specification: The llms.txt standard