You need a safe, realistic way to load test your Next.js application on Vercel. The tests should validate your application logic without inadvertently stress-testing the Vercel platform or third-party services. Load tests should uncover issues in your code paths, caching, database access, middleware, and rate limiting before real traffic does—all while staying within Vercel's load-testing policy and security features.
Assumptions: this guide assumes you have a Next.js app deployed to Vercel with Preview, Staging, and Production environments, and you can coordinate with any third-party providers used by your app.
- Define scope and guardrails
- Pick tooling and load shapes
- Tests you should run (and how)
- Tests you should not run
- Implement representative scenarios
- Observe, verify, and iterate
- Hardening tips for common bottlenecks
A) Align with Vercel’s policy
If you plan to generate significant load, especially volumetric tests, coordinate first. Load tests that primarily exercise Vercel’s core infrastructure can be blocked by Vercel's DDoS mitigations while failing to validate your application’s implementation or configuration. Target your app’s dynamic routes and integrations instead.
For any test that requires 50k+ requests per second (RPS), a ticket must be opened with Vercel’s engineering team. If your tests will reach this load, please reach out to your primary contact at Vercel to begin this process.
B) Choose the right environment
Run against a Staging or dedicated Preview deployment with production-like configuration and data fixtures. Use Preview deployments to wire tests into CI after each change.
C) Instrument before you test
Enable Vercel Observability Plus for detailed logs, traces, and insights. Add Speed Insights and Web Analytics to track traffic across pages and see fluctuations in performance metrics. If you need vendor APMs, enable OpenTelemetry using Vercel OpenTelemetry Collector.
Vercel’s monitoring and observability tools can accrue significant cost during high traffic events, such as load testing. You will be responsible for any accrued cost, so be sure to turn off these monitoring tools or adjust sample rates to ensure costs stay in line with your team’s expectations and budget.
Use Observability during tests to correlate spikes with function logs and traces, not just RPS charts.
D) Configure System Bypass and WAF bypass rules
To ensure your tests are not automatically blocked by Vercel's DDoS mitigations or by your WAF's managed or custom rules, add a System Bypass rule and/or a custom bypass rule, respectively. If possible, scope the rule to the specific IP address or range of IP addresses you plan to send traffic from. Without these bypass rules in place, your load-test requests may be identified as an attack or matched by WAF rules and blocked.
- Use a battle-tested tool like Artillery or k6.
- Prefer gradual ramps, stepped increases, and realistic traffic profiles.
- Avoid instant 0 → 100k RPS bursts that you will not see in production.
- This keeps focus on your code paths rather than Vercel’s edge network capacity.
Artillery example (ramped, realistic):
```yaml
config:
  target: "https://your-staging-deployment.vercel.app"
  phases:
    - duration: 600    # 10 min
      arrivalRate: 20  # start ~20 rps
      rampTo: 200      # ramp to ~200 rps
  defaults:
    headers:
      x-test-run: "loadtest-2025-09-04"

scenarios:
  - name: "Dynamic API"
    flow:
      - get:
          url: "/api/search?q=test"
  - name: "Auth flow"
    flow:
      - get: { url: "/login" }
      - post:
          url: "/api/login"
          json: { email: "test@example.com", password: "secret" }
      - get: { url: "/account" }
```

k6 example targeting a Route Handler with auth cookie:
```javascript
import http from "k6/http";
import { check, sleep } from "k6";

export const options = {
  scenarios: {
    ramped: {
      executor: "ramping-arrival-rate",
      startRate: 20,
      timeUnit: "1s",
      preAllocatedVUs: 100,
      maxVUs: 300,
      // adjust durations and targets to meet your desired RPS
      // and total number of requests
      stages: [
        { duration: "5m", target: 200 }, // ramp to 200 iterations/s
        { duration: "10m", target: 200 },
        { duration: "3m", target: 0 }
      ]
    }
  },
  thresholds: {
    http_req_failed: ["rate<0.01"],
    http_req_duration: ["p(95)<4000", "p(99)<8000"]
  }
};

export default function () {
  const headers = { Cookie: "session=stubbed.token" };
  const res = http.get(`${__ENV.BASE_URL}/api/orders?limit=10`, { headers });
  check(res, {
    "status is 200/304": (r) => r.status === 200 || r.status === 304
  });
  sleep(1);
}
```

Run with:

```shell
k6 run -e BASE_URL="https://your-staging-deployment.vercel.app" script.js
```

Goal: Validate latency percentiles, error handling, retries, and vendor rate limits.
How: Target Route Handlers or API routes that call external systems you own and manage. Track p95/p99 latency and error rates.
DB pooling: In serverless environments, each invocation can open a new DB connection. Use connection-pooled approaches (HTTP database APIs, Prisma Accelerate/Prisma Postgres, or provider-native pooling) to avoid exhausting connections under concurrency. Verify behavior at increasing concurrency.
Vercel Functions can be configured to run on Fluid Compute, which scales concurrency dynamically. Validate that your DB layer and external APIs scale with that concurrency.
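The pooling pattern above can be sketched as a pool cached per warm function instance, so repeated invocations reuse connections instead of opening new ones. `createPool` here is a hypothetical stand-in for your driver's pool constructor (for example `pg.Pool` or a provider HTTP client):

```javascript
// Hypothetical stand-in for your real driver's pool constructor.
function createPool() {
  return {
    query: async (sql) => `executed: ${sql}`, // placeholder query method
  };
}

function getPool() {
  // Cache the pool on globalThis so all invocations served by the same
  // warm function instance share one pool rather than one per request.
  if (!globalThis.__dbPool) {
    globalThis.__dbPool = createPool();
  }
  return globalThis.__dbPool;
}

// Two simulated invocations share the same pool instance:
console.log(getPool() === getPool()); // true
```

Under load testing, watch your database's open-connection count: with this pattern it should track the number of warm instances, not the request rate.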
Goal: Confirm time-based revalidation, stale-while-revalidate behavior, and “thundering herd” scenarios.
How: In App Router, set revalidate on your fetch or route segment.
```javascript
export const revalidate = 60; // seconds

async function getData() {
  const res = await fetch("https://api.example.com/data", {
    next: { revalidate: 60 }, // time-based revalidation
  });
  return res.json();
}
```

Test plan:
- Send a single request to warm the page.
- After TTL expires, send a moderate burst. Expect stale content served while Next.js revalidates in the background, then refresh shows fresh content.
- Increase concurrency to ensure only one regeneration occurs while others get stale content.
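The burst steps above can be checked with a small Node 18+ script that fires concurrent requests and tallies the `x-vercel-cache` response header (the URL is a placeholder for your staging deployment; you should see mostly STALE/HIT with at most one MISS or REVALIDATED per TTL window):

```javascript
// Fire n concurrent requests and tally the x-vercel-cache header values.
// fetchFn is injectable so the tally logic can be exercised without a network.
async function burst(url, n, fetchFn = fetch) {
  const results = await Promise.all(
    Array.from({ length: n }, () => fetchFn(url))
  );
  const tally = {};
  for (const res of results) {
    const status = res.headers.get("x-vercel-cache") ?? "UNKNOWN";
    tally[status] = (tally[status] ?? 0) + 1;
  }
  return tally;
}

// Example against a real deployment (placeholder URL):
// burst("https://your-staging-deployment.vercel.app/products", 50)
//   .then(console.log); // e.g. { STALE: 49, REVALIDATED: 1 }
```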
Review: Next.js caching, time-based revalidation, and on-demand revalidation with revalidatePath and revalidateTag.
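On-demand invalidation can be sketched as a Route Handler that calls `revalidateTag` for content tagged at fetch time; the route path, tag name, and the absence of auth here are simplifying assumptions:

```javascript
// app/api/revalidate/route.js — hypothetical route path.
import { revalidateTag } from "next/cache";
import { NextResponse } from "next/server";

export async function POST(request) {
  // In production, verify a shared secret before allowing invalidation.
  const { tag } = await request.json();
  // e.g. "products", applied via fetch(url, { next: { tags: ["products"] } })
  revalidateTag(tag);
  return NextResponse.json({ revalidated: true, now: Date.now() });
}
```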
Goal: Ensure application logic is safe under concurrent requests, including state, file handles, and connection limits.
How: Ramp up virtual users and verify no file descriptor or similar limits are hit. Check logs for spikes or failures.
Vercel Functions have a limit of 1,024 file descriptors shared across all concurrent executions. This limit includes file descriptors used by the runtime itself, so the actual number available to your application code will be strictly less than 1,024.
Goal: Ensure global middleware does not become a bottleneck and that auth redirects and cookies behave correctly under load.
How:
- Exercise routes protected by Middleware.
- Keep Middleware lightweight and avoid heavy CPU or blocking I/O. The edge-runtime Middleware is executed globally before cache and has fair-use CPU limits.
- If you use Redis or KV for sessions, validate store hit rate and latency.
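A minimal sketch of keeping Middleware lightweight and scoped with a matcher; the protected paths and cookie name are assumptions for illustration:

```javascript
// middleware.js — runs only for the matched paths below.
import { NextResponse } from "next/server";

export function middleware(request) {
  // Lightweight check only: read the session cookie and redirect if absent.
  const session = request.cookies.get("session"); // cookie name is hypothetical
  if (!session) {
    return NextResponse.redirect(new URL("/login", request.url));
  }
  return NextResponse.next();
}

export const config = {
  // Scope Middleware to the routes that need it; everything else skips it.
  matcher: ["/account/:path*", "/api/orders/:path*"],
};
```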
Goal: Payments, CMS, search, analytics resilience and fallbacks.
How:
- Coordinate with vendors or use sandbox endpoints.
- Validate backoff, retry, and circuit-breaker logic when vendors throttle or error.
- For rate limits, prefer vendor test modes where available.
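The backoff and retry logic above can be sketched with exponential backoff and full jitter, assuming the vendor signals throttling with HTTP 429; `callVendor` is a hypothetical stand-in for your real API call:

```javascript
// Retry fn with exponential backoff and full jitter while it returns 429.
async function withBackoff(fn, { retries = 3, baseMs = 100 } = {}) {
  for (let attempt = 0; ; attempt++) {
    const res = await fn();
    if (res.status !== 429 || attempt === retries) return res;
    // Full jitter: sleep a random slice of a doubling window.
    const delay = Math.random() * baseMs * 2 ** attempt;
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
}

// Simulated vendor that throttles twice, then succeeds.
let calls = 0;
async function callVendor() {
  calls++;
  return calls < 3 ? { status: 429 } : { status: 200 };
}

withBackoff(callVendor).then((res) => console.log(res.status, calls)); // 200 3
```

Under load testing, verify that retries stay bounded; unbounded retries can amplify a vendor outage into a self-inflicted traffic spike.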
Goal: Prove your protection works under bursts.
How:
- If using Vercel WAF, remove any configured custom bypass rules so your test traffic is evaluated by your rules again.
- Simulate realistic traffic spikes that hit your rate limit and confirm additional requests are rate-limited until the measurement period resets.
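A k6 sketch of this burst, assuming your app answers 429 when the limit trips; the endpoint, rate, and duration are placeholders to adjust above your configured limit:

```javascript
import http from "k6/http";
import { check } from "k6";

export const options = {
  scenarios: {
    burst: {
      executor: "constant-arrival-rate",
      rate: 100, // intentionally above the configured rate limit
      timeUnit: "1s",
      duration: "1m",
      preAllocatedVUs: 200
    }
  }
};

export default function () {
  const res = http.get(`${__ENV.BASE_URL}/api/search?q=test`);
  check(res, {
    // Every request should be either served or explicitly rate-limited.
    "served or rate-limited": (r) => r.status === 200 || r.status === 429
  });
}
```

After the run, confirm in logs that 429s began once the limit was exceeded and that requests succeed again after the measurement period resets.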
- Do not load test Vercel’s core infra such as raw CDN throughput or edge network capacity. That is not representative of your app’s performance and may be blocked.
- Avoid benchmarking fully cached static assets. You will mostly measure the CDN, not your code paths. Use functional flows instead.
- Do not craft unrealistic traffic shapes like instant 0 → 100k RPS spikes. Prefer staged ramps that mimic expected usage.
- Do not hit production third-party endpoints without explicit coordination. It can trigger fraud rules or bans.
Create small suites that exercise key user flows, such as product browsing, login, search, or other key flows in your application.
Also consider specific scenarios in which your implementation might be affected by traffic patterns, such as:
- Retries
- Traffic spikes from a single IP (if rate limit rules are in place)
- Geo-specific marketing campaigns
- Authenticated endpoints
When conducting your load tests, it is vital to understand not only the result of the tests, but also how the simulated traffic impacts your application and downstream services.
- Check logs and traces: During the test, watch logs and traces to correlate slow endpoints and error spikes.
- Watch Speed Insights and Web Analytics: Check how new deployments affect performance and where traffic lands.
- Validate caching outcomes: Confirm time-based revalidation and on-demand invalidation with revalidatePath or revalidateTag.
- Check function scaling and limits: Ensure there is no file descriptor or memory contention under concurrency.
- Verify WAF/rate-limit behavior: Confirm that burst traffic is throttled or challenged as configured.
- Middleware:
- Keep logic minimal. It runs before cache and can add latency.
- Respect fair-use CPU limits.
- Use matchers to scope only where needed.
- DB access: Adopt connection-pooled or HTTP-based drivers and validate under Fluid Compute concurrency.
- Caching: Prefer revalidate on fetch or segment; use on-demand invalidation for content updates.
- Security:
- Use Vercel WAF custom rules, the Bot Protection Managed Ruleset, BotID, and Attack Challenge Mode when appropriate.
- Do not leave challenge mode enabled permanently unless under attack.
- If needed, use Trusted IPs to keep traffic from known IPs from being denied or from triggering your configured WAF rules.
System Bypass rules will not bypass Attack Challenge Mode. Be sure to disable Attack Challenge Mode when load testing your application.
After following this guide, you will have:
- A reproducible, realistic load test suite that targets your application’s dynamic routes, DB/API integrations, middleware, and rate limiting.
- Verified ISR TTLs and stale-while-revalidate behavior under load without triggering a thundering herd.
- Observability dashboards and traces that correlate load to function latency, errors, and scaling, enabling you to tune code, caching, and rules with confidence.
- Confirmed WAF and app-level rate limiting behavior under burst scenarios.
- https://vercel.com/docs/otel
- https://vercel.com/docs/analytics
- https://vercel.com/docs/speed-insights
- https://vercel.com/docs/observability/observability-plus
If you encounter issues not covered by this guide:
- Visit Vercel Support.
- Check Vercel Community.
- As an enterprise customer, you can also open a ticket directly from your Vercel Dashboard.