Fluid compute
Learn how to enable fluid compute, an execution model for Vercel Functions that provides a more flexible and efficient way to run your functions.

Fluid compute offers a blend of serverless flexibility and server-like capabilities. Unlike traditional serverless architectures, which can suffer from cold starts and limited functionality, fluid compute is a hybrid solution: it overcomes the limitations of both serverless and server-based approaches, delivering the advantages of both worlds, including:
- Zero configuration out of the box: Fluid compute comes with preset defaults that automatically optimize your functions for both performance and cost efficiency.
- Optimized concurrency: Optimizes resource usage by handling multiple invocations within a single function instance. Available with the Node.js and Python runtimes.
- Dynamic scaling: Fluid compute automatically optimizes existing resources before scaling up to meet traffic demands. This ensures low latency during high-traffic events and cost efficiency during quieter periods.
- Background processing: After fulfilling user requests, you can continue executing background tasks using `waitUntil`. This allows for a responsive user experience while performing time-consuming operations like logging and analytics in the background (see the sketch after this list).
- Automatic cold start optimizations: Reduces the effects of cold starts through automatic bytecode optimization and function pre-warming on production deployments.
- Cross-region and availability zone failover: Ensure high availability by first failing over to another availability zone (AZ) within the same region if one goes down. If all zones in that region are unavailable, Vercel automatically redirects traffic to the next closest region. Zone-level failover also applies to non-fluid deployments.
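As a rough illustration of background processing, the sketch below uses `waitUntil` from the `@vercel/functions` package in a Node.js route handler. The `recordAnalytics` helper and its endpoint are hypothetical stand-ins:

```ts
import { waitUntil } from '@vercel/functions';

// Hypothetical helper: ship an analytics event to an example endpoint.
async function recordAnalytics(path: string): Promise<void> {
  await fetch('https://analytics.example.com/events', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ path, ts: Date.now() }),
  });
}

export async function GET(request: Request): Promise<Response> {
  const { pathname } = new URL(request.url);

  // The response is sent immediately; the promise passed to waitUntil
  // keeps running in the background instead of being cut off.
  waitUntil(recordAnalytics(pathname));

  return Response.json({ ok: true });
}
```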
See What is Compute? to learn more about fluid compute and how it compares to traditional serverless models.
To enable fluid compute:
- Navigate to your project in the Vercel dashboard.
- Click on the Settings tab and select the Functions section.
- Scroll to the Fluid Compute section and enable the toggle for Fluid Compute.
- Redeploy your project to apply the changes.
Fluid compute is available for the Node.js and Python runtimes.
The in-function concurrency beta ends on February 20, 2025. It is now available in fluid compute by default and can be enabled in your dashboard.
Fluid compute allows multiple invocations to share a single function instance. This is especially valuable for AI applications, where tasks like fetching embeddings, querying vector databases, or calling external APIs are often I/O-bound. By allowing concurrent execution within the same instance, you can reduce cold starts, minimize latency, and lower compute costs.

Vercel Functions prioritize existing idle resources before allocating new ones, reducing unnecessary compute usage. This in-function concurrency is especially effective when multiple requests target the same function, so fewer total resources are needed for the same workload.
Optimized concurrency in fluid compute is available when using Node.js or Python runtimes. See the efficient serverless Node.js with in-function concurrency blog post to learn more.
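Because concurrent invocations can run on the same instance, module-scope resources are initialized once and reused across them. The sketch below assumes a hypothetical vector-search service; the pattern, not the API, is the point:

```ts
// Hypothetical stand-in for an expensive client (database, vector store, etc.).
function createVectorClient(baseUrl: string) {
  return {
    async search(query: string, topK: number): Promise<string[]> {
      const res = await fetch(
        `${baseUrl}/search?q=${encodeURIComponent(query)}&k=${topK}`
      );
      return res.json();
    },
  };
}

// Module scope runs once per instance, so this client is shared by
// every invocation (including concurrent ones) served by that instance.
const client = createVectorClient(
  process.env.VECTOR_DB_URL ?? 'https://vector.example.com'
);

export async function POST(request: Request): Promise<Response> {
  const { query } = await request.json();

  // While this I/O-bound call awaits, fluid compute can route other
  // invocations to the same instance instead of leaving it idle.
  const matches = await client.search(query, 5);

  return Response.json({ matches });
}
```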
When using Node.js version 20+, Vercel Functions use bytecode caching to reduce cold start times. This stores the compiled bytecode of JavaScript files after their first execution, eliminating the need for recompilation during subsequent cold starts.
The cache is populated on the first invocation, so that request doesn't benefit yet; subsequent cold starts load the cached bytecode, enabling faster initialization. This optimization is especially beneficial for functions that are frequently invoked or have long execution times.
For frameworks that output ESM, all CommonJS dependencies (for example, `react`, `node-fetch`) will be opted into bytecode caching.
On traditional serverless compute, the isolation boundary refers to the separation of individual instances of a function to ensure they don't interfere with each other. This provides a secure execution environment for each function.
However, because each function invocation gets its own microVM, this isolation can lead to slower start-up times and increased resource usage during idle periods, when a microVM remains allocated but inactive.
Fluid compute uses a different approach to isolation. Instead of using a microVM for each function invocation, multiple invocations can concurrently share the same physical instance: a single process with shared global state. This allows functions to share resources and execute in the same environment, which can improve performance and reduce costs.
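A practical consequence, sketched below with a hypothetical in-memory cache: module-level state persists for the lifetime of the instance and is visible to concurrent invocations, which is handy for caching, while per-request data should stay inside the handler:

```ts
// Module-level state lives as long as the instance does and is shared
// by every invocation (including concurrent ones) that it serves.
const cache = new Map<string, { value: string; expires: number }>();

export async function GET(request: Request): Promise<Response> {
  const key = new URL(request.url).searchParams.get('key') ?? 'default';

  const hit = cache.get(key);
  if (hit && hit.expires > Date.now()) {
    return Response.json({ value: hit.value, cached: true });
  }

  // Hypothetical upstream lookup; any I/O-bound call behaves the same way.
  const res = await fetch(`https://config.example.com/${key}`);
  const value = await res.text();

  cache.set(key, { value, expires: Date.now() + 60_000 });
  return Response.json({ value, cached: false });
}
```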
Fluid compute includes default settings that vary by plan:

Settings | Hobby | Pro | Enterprise
---|---|---|---
CPU configuration | Managed | Standard / Performance | Standard / Performance
Default / Max duration | 60s / 60s | 90s / 800s | 90s / 800s
Multi-region failover | | |
Multi-region functions | | Up to 3 | All
When you enable fluid compute, the settings you configure in your function code, dashboard, or `vercel.json` file will override the default fluid compute settings.

The following order of precedence determines which settings take effect. Stages higher in the table always override those below them:
Precedence | Stage | Explanation | Can override
---|---|---|---
1 | Function code | Settings in your function code always take top priority. These include max duration defined directly in your code. | `maxDuration`
2 | `vercel.json` | Any settings in your `vercel.json` file, like max duration, region, and CPU, override dashboard settings and fluid defaults. | `maxDuration`, `region`, `memory`
3 | Dashboard | Changes made in the dashboard, such as max duration, region, or CPU, override fluid defaults. | `maxDuration`, `region`, `memory`
4 | Fluid defaults | The default settings applied automatically when fluid compute is enabled and no other settings are configured. | |
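To make the precedence concrete, here is a sketch under assumed file paths and values. In a Next.js route handler, an exported `maxDuration` (stage 1) wins over everything else:

```ts
// app/api/report/route.ts
// Route segment config: takes top priority over vercel.json,
// the dashboard, and fluid defaults.
export const maxDuration = 300;

export async function GET(): Promise<Response> {
  return Response.json({ status: 'ok' });
}
```

A `vercel.json` like the following (the glob and values are illustrative) would then apply at stage 2 to functions that don't set their own values in code:

```json
{
  "functions": {
    "app/api/**/*.ts": {
      "maxDuration": 120,
      "memory": 3009
    }
  }
}
```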
If you have enabled fluid compute and then configure your function with less than 1 GB of memory through `vercel.json`, concurrency optimizations will be disabled.
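For example, a memory setting like the one below (the glob and value are illustrative) would drop the function under 1 GB and therefore disable in-function concurrency:

```json
{
  "functions": {
    "api/**/*.ts": {
      "memory": 512
    }
  }
}
```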