
Make your documentation readable by AI agents

Serve markdown to AI agents using content negotiation, .md endpoints, agent auto-detection, llms.txt, sitemap.md, and MCP. Includes a checklist, implementation patterns, and a compliance framework for measuring agent readiness across content platforms.

5 min read
Last updated March 27, 2026

AI agents (Claude Code, ChatGPT, Cursor, Copilot) are a primary consumer of developer documentation. They don't need navigation chrome, dark mode toggles, or animated code blocks. They need:

  • Discoverable content: Where the docs are and what they cover
  • Clean retrieval: Markdown, not a DOM tree
  • Structured metadata: Version, last updated, and canonical URL
  • Tool access: Search and fetch via protocol, not scraping

Most content platforms serve agents the same HTML page they serve humans. The agent then spends tokens stripping tags, guessing at content boundaries, and hoping the important information survives extraction. The result: hallucinated APIs, outdated code examples, and missed context.

Agents interact with content in three layers: discovery (finding what exists), retrieval (fetching clean content, optionally indexed at scale), and tool access (precision queries via protocol).

Help agents find what exists before they fetch anything.

| Requirement | Implementation | Purpose |
| --- | --- | --- |
| `/llms.txt` | Curated markdown index at the site root with section headings and links | Entry point for agents and humans pasting docs into an IDE |
| `sitemap.xml` | Standard sitemap with accurate `lastmod` dates | Freshness tracking for agents that monitor content changes |
| `sitemap.md` | Semantic sitemap served as markdown with section headings, categories, and page descriptions | High-level orientation for LLM-assisted navigation and contributor onboarding |
| `robots.txt` | Documented stance on agent access (which bots can crawl, which pages are off-limits) | Crawl control and access transparency |
| JSON-LD structured data | Title, description, canonical URL, and breadcrumbs on every HTML page | Agents that parse HTML can understand page type and relationships without traversing the DOM |

Example JSON-LD for a docs page:

```json
{
  "@context": "https://schema.org",
  "@type": "TechArticle",
  "headline": "Vercel Functions",
  "description": "Deploy server-side code on Vercel.",
  "url": "https://vercel.com/docs/functions",
  "breadcrumb": {
    "@type": "BreadcrumbList",
    "itemListElement": [
      {
        "@type": "ListItem",
        "position": 1,
        "name": "Docs",
        "item": "https://vercel.com/docs"
      },
      {
        "@type": "ListItem",
        "position": 2,
        "name": "Functions",
        "item": "https://vercel.com/docs/functions"
      }
    ]
  }
}
```

This is the highest-impact layer. When an agent fetches a docs page, it should get markdown, not HTML with embedded scripts.

| Mechanism | How it works |
| --- | --- |
| Content negotiation | Agents that send `Accept: text/markdown` receive markdown with `Content-Type: text/markdown; charset=utf-8` and a `Vary: Accept` header. Claude Code sends this header natively. |
| Agent auto-rewrite | Detected AI agents receive markdown automatically, even without an `Accept: text/markdown` header. Detection uses user-agent matching, RFC 9421 `Signature-Agent` headers, and a heuristic fallback for unknown agents. |
| `.md` endpoints | Append `.md` to any docs URL to get the markdown version directly. Works without custom headers. Useful for pasting docs into a chat or IDE. |
| Rich frontmatter | Every markdown response includes metadata for accurate citations. |
| HTML alternate link | Adding `<link rel="alternate" type="text/markdown">` in the HTML `<head>` tells agents a markdown version exists. |

Example request and response:

```shell
curl -H "Accept: text/markdown" https://vercel.com/docs/functions
```

Response (`/docs/functions.md`):

```markdown
---
title: Functions
description: Deploy server-side code on Vercel.
canonical_url: https://vercel.com/docs/functions
md_url: https://vercel.com/docs/functions.md
last_updated: 2026-01-15T12:34:56.000Z
---

# Functions

Deploy server-side code on Vercel...
```

Without `canonical_url` and `last_updated`, agents can't link back to the source or judge whether their information is stale.

The `Vary: Accept` header tells CDNs to cache HTML and markdown responses separately for the same URL. Without it, an agent's markdown response could be served to a browser, or vice versa.
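The negotiation logic can be sketched as a small pure function. This is a minimal illustration, not Vercel's implementation; the `Page` type and pre-rendered `markdown`/`html` fields are assumptions.

```typescript
// Sketch of markdown content negotiation for a docs route.
// Assumes both variants are already rendered; `Page` is illustrative.
type Page = { markdown: string; html: string };

function negotiate(
  acceptHeader: string | undefined,
  page: Page,
): { body: string; headers: Record<string, string> } {
  const wantsMarkdown = (acceptHeader ?? "").includes("text/markdown");
  return {
    body: wantsMarkdown ? page.markdown : page.html,
    headers: {
      "Content-Type": wantsMarkdown
        ? "text/markdown; charset=utf-8"
        : "text/html; charset=utf-8",
      // Tell CDNs to cache the HTML and markdown variants separately.
      Vary: "Accept",
    },
  };
}
```

A request with `Accept: text/markdown` gets the markdown body; everything else falls through to HTML, and both responses carry `Vary: Accept`.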

Content negotiation and .md endpoints cover agents that explicitly request markdown. Many agents don't. They send a standard GET with no special headers. If your platform can detect that the request comes from an AI agent, serve markdown anyway.

Vercel uses a three-layer detection approach:

  1. User-agent matching: Check against a maintained list of known AI agent strings (Claude, ChatGPT, GPTBot, Cursor, Copilot, and others). This is the most reliable signal.
  2. Signature-Agent header: The RFC 9421 standard header, used by ChatGPT's agent. Validate against known AI service domains.
  3. Heuristic fallback: If the request is missing the sec-fetch-mode header (which real browsers always send) and the user-agent matches a bot-like pattern, treat it as an agent. This catches unknown agents at the cost of occasional false positives, which are low-harm since serving markdown to a non-AI bot has no negative effect.

This means agents get markdown on every valid docs request, regardless of how they make it.
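The three detection layers above can be sketched as an ordered check. The pattern and domain lists here are illustrative stand-ins, not the maintained lists the article describes.

```typescript
// Sketch of three-layer agent detection. Lists are illustrative.
const KNOWN_AGENT_PATTERNS = [/claude/i, /chatgpt/i, /gptbot/i, /cursor/i, /copilot/i];
const KNOWN_AI_DOMAINS = ["chatgpt.com", "openai.com"];

function isAiAgent(headers: Record<string, string | undefined>): boolean {
  const ua = headers["user-agent"] ?? "";

  // 1. User-agent matching against known AI agent strings (most reliable).
  if (KNOWN_AGENT_PATTERNS.some((p) => p.test(ua))) return true;

  // 2. RFC 9421 Signature-Agent header, validated against known AI domains.
  const sigAgent = headers["signature-agent"];
  if (sigAgent && KNOWN_AI_DOMAINS.some((d) => sigAgent.includes(d))) return true;

  // 3. Heuristic fallback: real browsers send sec-fetch-mode, so a
  //    bot-like user-agent without it is treated as an agent.
  const looksBotLike = /bot|crawler|spider|http/i.test(ua);
  if (!headers["sec-fetch-mode"] && looksBotLike) return true;

  return false;
}
```

A regular browser request (`sec-fetch-mode` present, non-bot user-agent) falls through all three checks and receives HTML as usual.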

When an agent requests a page that doesn't exist, an HTML 404 page is useless. The agent can't parse it, and the conversation stalls. Instead, return a markdown response with actionable content:

  1. Search your docs index for pages similar to the requested path.
  2. If a result scores above a high-confidence threshold (0.99+), redirect the agent to the correct page.
  3. Otherwise, return markdown-formatted suggestions listing the closest matches as links.
  4. Return a 200 status, not 404. Agents need content they can act on.
  5. Append a sitemap footer to every markdown response so agents can browse the full index if they hit a dead end.
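The flow above might look like this, assuming a similarity search over the docs index has already produced scored matches (the `Match` shape and 307 redirect are illustrative choices):

```typescript
// Sketch of a markdown-friendly "not found" handler.
// `matches` comes from a hypothetical similarity search over the docs index.
type Match = { path: string; title: string; score: number };

function handleMissingPage(
  requestedPath: string,
  matches: Match[],
): { status: number; redirect?: string; body?: string } {
  // High-confidence match (0.99+): redirect straight to the right page.
  const best = matches[0];
  if (best && best.score >= 0.99) {
    return { status: 307, redirect: best.path };
  }

  // Otherwise return a 200 with actionable markdown suggestions,
  // plus a sitemap footer so the agent can browse the full index.
  const suggestions = matches.map((m) => `- [${m.title}](${m.path})`).join("\n");
  const body =
    `# Page not found: ${requestedPath}\n\n` +
    `Closest matches:\n\n${suggestions}\n\n` +
    `See [sitemap.md](/sitemap.md) for the full index.`;
  return { status: 200, body };
}
```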

The most sophisticated agents don't scrape at all. They call tools.

| Tool | Description |
| --- | --- |
| MCP servers | Search docs, fetch specific pages, and list available content through a standard protocol. The Vercel MCP server covers vercel.com/docs, nextjs.org, and the AI SDK docs. |
| Search APIs | Return structured JSON results with canonical URLs, snippets, and freshness metadata. |
| AI Chat | Conversational interface on docs sites backed by doc-aware tools: `search_docs`, `get_doc_page`, and `list_docs`. Available to both humans and agents. |
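A search API result carrying the metadata described above might look like this; the field names are assumptions for illustration, not a documented Vercel schema:

```typescript
// Illustrative shape for structured search results with canonical
// URLs, snippets, and freshness metadata. Field names are assumptions.
interface SearchResult {
  title: string;
  canonical_url: string;
  snippet: string;
  last_updated: string; // ISO 8601, so agents can judge staleness
}

// Render results as markdown an agent can cite directly.
function formatForAgent(results: SearchResult[]): string {
  return results
    .map(
      (r) =>
        `- [${r.title}](${r.canonical_url}) (updated ${r.last_updated})\n  ${r.snippet}`,
    )
    .join("\n");
}
```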

For the full scoring rubric (0–100) and detailed verification steps, see the Agent-Readability Spec.

Discovery

  • Serve `/llms.txt` with a curated H1 + H2 index of your content
  • Publish `sitemap.xml` with accurate `lastmod` dates
  • Serve `/sitemap.md` with a semantic, markdown-formatted sitemap describing docs sections and pages
  • Document agent access policy in `robots.txt`
  • Add JSON-LD structured data (title, description, canonical URL, and breadcrumbs) to every page
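An `/llms.txt` file following the H1 + H2 convention above is plain markdown; this is an illustrative fragment with placeholder URLs, not a real index:

```markdown
# Example Docs

> One-line summary of what these docs cover.

## Getting started

- [Installation](https://example.com/docs/install.md): Set up the CLI
- [Quickstart](https://example.com/docs/quickstart.md): Deploy a first project

## Reference

- [Functions](https://example.com/docs/functions.md): Deploy server-side code
```

Linking each entry to its `.md` endpoint lets an agent go from the index to clean content in one fetch.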

Retrieval

  • Return markdown for `Accept: text/markdown` with a `Vary: Accept` header
  • Generate `.md` endpoints for all content pages
  • Include frontmatter metadata (`title`, `canonical_url`, `last_updated`) in every markdown response
  • Add `<link rel="alternate" type="text/markdown">` to HTML pages
  • Detect AI agents (user-agent matching, `Signature-Agent` header, heuristic fallback) and serve markdown automatically
  • Verify by appending `.md` to any page URL and confirming you get clean markdown with frontmatter

Tool access

  • Expose search via MCP server or search API
  • Add a SKILLS.md or AGENTS.md file with install, config, and usage instructions for coding agents

To get started, run through the checklist above, then score your site against the Agent-Readability Spec.
