Caching

Last updated March 5, 2026

Vercel caches content at multiple layers between the visitor and your backend. The CDN checks each layer in order and returns a cached response as soon as one is available.

The diagram below shows how a request flows through the cache layers.

How Caching Works

Next.js, SvelteKit, and other frameworks use Incremental Static Regeneration (ISR) to cache pre-rendered pages and update content on demand without redeploying.

[Diagram: request path, starting close to the user. Client → PoP → Vercel Region (CDN Cache: global across all Vercel regions for fast delivery) → Function Region (ISR Cache: can be revalidated on demand or based on time) → Vercel Function (Runtime Cache: for data used in Vercel Functions) → Backend]

The CDN cache stores responses across Vercel regions worldwide. When a visitor makes a request, the nearest PoP routes it to a Vercel region in single-digit milliseconds. On a cache hit, the region returns the response with no round trip to your function or origin.

You control CDN caching through Cache-Control headers or your framework's built-in caching.
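As a minimal sketch of the header-based approach, the handler below returns a response with a Cache-Control header. The function name `handleRequest`, the payload, and the durations are hypothetical. `s-maxage` and `stale-while-revalidate` are standard HTTP directives; how Vercel's CDN treats each one is not covered on this page, so check the Cache-Control reference before relying on specific values.

```typescript
// Hypothetical handler: name, payload, and durations are illustrative only.
// In a real project you would export this from your framework's route file.
function handleRequest(_request: Request): Response {
  const body = JSON.stringify({ message: "hello" });

  return new Response(body, {
    status: 200,
    headers: {
      "Content-Type": "application/json",
      // s-maxage=60: shared caches (such as a CDN) may reuse this response
      // for 60 seconds.
      // stale-while-revalidate=300: a stale copy may be served for up to
      // 300 more seconds while a fresh one is fetched in the background.
      "Cache-Control": "public, s-maxage=60, stale-while-revalidate=300",
    },
  });
}
```

Setting the header on the response keeps the caching policy next to the code that produces the content, which is useful when different routes need different lifetimes.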

The ISR cache stores pre-rendered pages in durable storage within a single function region selected from your configured list. Frameworks like Next.js and SvelteKit use ISR to generate pages at build time and update them on demand or on a schedule.

When a page isn't in the CDN cache, the CDN checks the ISR cache next. Your function only runs when content needs regenerating.
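The lookup order described above can be modeled with a small in-memory sketch. This is a conceptual illustration, not Vercel's implementation: `cdnCache`, `isrCache`, `renderPage`, and `lookup` are hypothetical names, expiry and revalidation are left out, and copying an ISR hit back into the CDN layer is an assumption made for the model.

```typescript
type Layer = "cdn" | "isr" | "function";

interface LookupResult {
  body: string;
  servedBy: Layer;
}

// Hypothetical in-memory stand-ins for the two cache layers.
const cdnCache = new Map<string, string>();
const isrCache = new Map<string, string>();

// Counts how often the "function" actually runs.
let renderCount = 0;

// Stand-in for the function that renders a page.
function renderPage(path: string): string {
  renderCount += 1;
  return `<h1>${path}</h1>`;
}

function lookup(path: string): LookupResult {
  // 1. Check the CDN layer first.
  const fromCdn = cdnCache.get(path);
  if (fromCdn !== undefined) {
    return { body: fromCdn, servedBy: "cdn" };
  }

  // 2. On a CDN miss, check the ISR layer.
  const fromIsr = isrCache.get(path);
  if (fromIsr !== undefined) {
    cdnCache.set(path, fromIsr); // assumption: refill the faster layer
    return { body: fromIsr, servedBy: "isr" };
  }

  // 3. Only when both layers miss does the function run.
  const body = renderPage(path);
  isrCache.set(path, body);
  cdnCache.set(path, body);
  return { body, servedBy: "function" };
}
```

In this model, the first request for a path runs the function, and later requests are served from whichever layer still holds the page.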

The runtime cache stores data fetched inside Vercel Functions. It is used when your framework's data-fetching API opts into caching, such as Next.js fetch with force-cache, or when you call the runtime cache API directly. The function region then caches the response for subsequent requests.

This reduces latency for repeated data lookups and lowers the number of calls to external APIs and databases.
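The saving comes from the cache-aside pattern: look in the cache first and call the external source only on a miss. The sketch below illustrates that pattern with a hypothetical in-memory `TtlCache` class and `cached` helper. It does not use or reproduce the runtime cache API, whose actual interface is not shown on this page.

```typescript
interface Entry<T> {
  value: T;
  expiresAt: number; // timestamp in milliseconds
}

// Hypothetical in-memory cache with per-entry time to live.
class TtlCache<T> {
  private entries = new Map<string, Entry<T>>();
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // drop expired entries lazily
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T, ttlMs: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }
}

// Cache-aside: return the cached value, or load it and store it on a miss.
async function cached<T>(
  cache: TtlCache<T>,
  key: string,
  ttlMs: number,
  load: () => Promise<T>,
): Promise<T> {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;

  const value = await load();
  cache.set(key, value, ttlMs);
  return value;
}
```

With this shape, `load` (for example a database query or an external API call) runs once per key per TTL window instead of once per request.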

The image cache stores optimized images after Vercel transforms them. When you use Image Optimization, Vercel resizes, compresses, and converts images on the first request. Subsequent requests return the cached result and skip processing.

When multiple visitors request the same uncached content at the same time, request collapsing groups those requests into one call to your backend. This protects your origin from traffic spikes and avoids redundant work.
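The idea behind request collapsing can be sketched by sharing one in-flight promise per key. This is a simplified single-process illustration with hypothetical names (`inFlight`, `collapse`); it shows the technique, not how Vercel implements it across its network.

```typescript
// Requests currently in progress, keyed by the content they are fetching.
const inFlight = new Map<string, Promise<string>>();

// Runs `load` at most once per key at a time. Callers that arrive while a
// load is pending share its result instead of starting another one.
function collapse(key: string, load: () => Promise<string>): Promise<string> {
  const pending = inFlight.get(key);
  if (pending !== undefined) return pending;

  const request = load().finally(() => {
    // Clear the slot once the load settles so later requests start fresh.
    inFlight.delete(key);
  });
  inFlight.set(key, request);
  return request;
}
```

If three callers invoke `collapse("/pricing", fetchPricingPage)` at the same moment, where `fetchPricingPage` is a hypothetical loader, the loader runs once and all three receive the same result. A failed load rejects for every waiting caller, and the next request retries.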

