# How Fluid compute pricing compares to EC2, Fargate, and EKS

**Authors:** Joe Haddad, Ben Sabic

---

Comparing [Fluid compute](https://vercel.com/fluid) to public cloud compute by the rates listed on a pricing page is misleading, because Amazon EC2, AWS Fargate, and Amazon EKS bill for the capacity you provision, not for the CPU your application actually uses. Fluid compute with Active CPU pricing bills only for CPU that's actively running your code, so a fair comparison measures what you pay per unit of CPU your app actually consumes.

This guide compares all four options on that basis, using a like-for-like 1 vCPU / 2 GB shape, and shows how the result changes as real-world CPU utilization drops.

## Overview

In this guide, you'll learn:

- Why provisioned cloud pricing understates the cost of the CPU you actually use
- How to compare compute options on cost per active vCPU-hour
- How Fluid's all-in rate compares to EC2, Fargate, and EKS at realistic utilization levels
- What CPU utilization levels are realistic for API and AI workloads

## The gap between what you provision and what you use

EC2, Fargate, and EKS charge for the capacity you provision, whether or not your code is using it. When an instance runs at 40% CPU utilization, you still pay for 100% of the instance, and the remaining 60% is idle capacity you've already paid for. The published hourly rate, therefore, describes the cost of provisioned CPU, not the cost of the CPU that does your application's work.

Fluid compute prices differently. With Active CPU pricing, Vercel bills for CPU only while your code is running, pauses CPU billing during I/O waits, and charges nothing for CPU between requests. You still pay for provisioned memory while a request is in flight, at $0.0106 per GB-hour, less than 10% of the active CPU rate. This memory cost is included in the all-in figures below.

Because of this difference, the two models compare fairly only on the cost of the CPU your application actually consumes, not on the sticker rate.

## The metric that makes the comparison fair: cost per active vCPU-hour

A fair comparison prices the CPU that actually ran your application's code, not the capacity you reserved. That unit is an active vCPU-hour: one hour of vCPU time spent executing your workload.

Fluid already bills in this unit. Active CPU pricing charges for CPU only while your code runs, so Fluid's cost per active vCPU-hour is at most the all-in rate this guide calculates for it, at any utilization level.

Provisioned options bill in a different unit: capacity-hours, used or not. To express their cost in active vCPU-hours, divide the provisioned rate by CPU utilization:

`cost per active vCPU-hour = provisioned hourly rate / CPU utilization`

At 50% utilization, half the capacity you bought did no work, so each active vCPU-hour costs twice the sticker rate. At 25% utilization, it costs 4x.
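The conversion above can be sketched in a few lines of Python. The function name and the $0.05 sticker rate are illustrative assumptions, not quoted prices; the point is only to show how the multiplier grows as utilization falls.

```python
def cost_per_active_vcpu_hour(provisioned_hourly_rate: float,
                              cpu_utilization: float) -> float:
    """Convert a provisioned hourly rate into a cost per active vCPU-hour.

    cpu_utilization is a fraction in (0, 1], for example 0.4 for 40%.
    """
    if not 0 < cpu_utilization <= 1:
        raise ValueError("cpu_utilization must be in the range (0, 1]")
    return provisioned_hourly_rate / cpu_utilization


# Hypothetical sticker rate in USD per hour, chosen only for illustration.
STICKER_RATE = 0.05

for utilization in (1.0, 0.5, 0.25):
    active_cost = cost_per_active_vcpu_hour(STICKER_RATE, utilization)
    multiple = active_cost / STICKER_RATE
    print(f"{utilization:.0%} utilization: ${active_cost:.3f} per active "
          f"vCPU-hour ({multiple:.0f}x the sticker rate)")
```

The multiplier is simply the reciprocal of utilization, which is why the cost per active vCPU-hour rises quickly once utilization drops into the low double digits.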

## How Fluid's all-in rate is calculated

For a Standard machine size (1 vCPU / 2 GB), this guide uses $0.149 per active vCPU-hour as Fluid's all-in rate: $0.128 for one hour of active CPU plus $0.021 for 2 GB of provisioned memory ($0.0106 per GB-hour times 2 GB).

This is a deliberately conservative figure. It assumes one request per instance, so every active vCPU-hour carries the full memory charge. In practice, Fluid runs many concurrent requests on the same instance, which spreads the memory cost across them and pushes the effective rate toward $0.128. And because Active CPU pricing never bills idle CPU, the rate doesn't rise when utilization falls. $0.149 is a ceiling, not an average.
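As a rough sketch of that arithmetic, the Python below rebuilds the all-in rate from the two rates quoted in this guide and shows how concurrency moves it toward the CPU-only rate. The even split of memory cost across concurrent requests is a simplifying assumption made for illustration, not a description of how Vercel meters usage, and `fluid_effective_rate` is a hypothetical helper name.

```python
# Rates quoted in this guide (USD); check the pricing docs for current values.
ACTIVE_CPU_PER_VCPU_HOUR = 0.128
MEMORY_PER_GB_HOUR = 0.0106
MEMORY_GB = 2  # Standard machine size: 1 vCPU / 2 GB


def fluid_effective_rate(concurrent_requests: int = 1) -> float:
    """Approximate cost per active vCPU-hour.

    Assumption: the instance's memory charge is shared evenly by the
    requests running on it at the same time.
    """
    if concurrent_requests < 1:
        raise ValueError("concurrent_requests must be at least 1")
    memory_share = MEMORY_PER_GB_HOUR * MEMORY_GB / concurrent_requests
    return ACTIVE_CPU_PER_VCPU_HOUR + memory_share


for n in (1, 2, 4, 8):
    print(f"{n} concurrent request(s): ${fluid_effective_rate(n):.4f}")
```

With one request per instance, the function returns the $0.149 ceiling (before rounding, $0.1492). Each doubling of concurrency halves the memory share, so the result approaches $0.128 without ever going below it.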

## What CPU utilization is realistic?

Most production fleets run well below the utilization targets they're configured for. The comparison below uses three levels that bracket the realistic range.

**40% is a generous ceiling.** AWS's [target tracking documentation](https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-scaling-target-tracking.html) uses 50% CPU as a normal scaling target, and notes that actual utilization often sits below the target because autoscaling rounds capacity up and scales in cautiously. Google documents the same [gap between target and actual utilization](https://cloud.google.com/compute/docs/autoscaler/understanding-autoscaler-decisions) in its autoscaler.

**30% is a mid-range operating point for API and AI services.** For Node.js API services, CPU isn't the signal users feel. Users experience event loop pressure and tail latency, which can [diverge from CPU usage](https://nodesource.com/blog/event-loop-utilization-nodejs) enough for CPU-based scaling to react to the wrong signal, so teams scale on other signals and CPU sits lower. AI workloads push utilization lower still, because requests spend variable time waiting on inference rather than burning local CPU.

**8% is what measured Kubernetes fleets actually average.** CAST AI's [2026 analysis of measured clusters](https://cast.ai/blog/2026-state-of-kubernetes-resource-optimization-cpu-at-8-memory-at-20-and-getting-worse) put average CPU utilization at 8%.

## How the comparison changes with utilization

The table below prices each option at the three utilization levels above, using the 1 vCPU / 2 GB shape. Lower is cheaper. Fluid stays at or below $0.149 because idle CPU isn't billed and concurrency only lowers the effective rate, while the provisioned options get more expensive per active vCPU-hour as utilization falls.

| Cost per active vCPU-hour    | 40% utilization | 30% utilization | 8% utilization |
| ---------------------------- | --------------- | --------------- | -------------- |
| Fluid (Active CPU pricing)   | $0.149          | $0.149          | $0.149         |
| Amazon EC2 (c8i)             | $0.117          | $0.156          | $0.586         |
| AWS Fargate (Linux/x86, 1:2) | $0.123          | $0.164          | $0.617         |
| Amazon EKS (Auto Mode, c8i)  | $0.131          | $0.175          | $0.656         |

At 40% utilization, the provisioned options beat Fluid's ceiling rate per active vCPU-hour: EC2 by 21.5%, Fargate by 17.3%, and EKS by 12.1%. Few production fleets sustain this level, and with concurrency Fluid's effective rate drops toward $0.128, closing most of the gap.

At 30% utilization, Fluid is the cheaper option: EC2 costs 4.7% more, Fargate 10.1% more, and EKS 17.3% more per active vCPU-hour.

At 8% utilization, the gap widens sharply. EC2 costs 293% more than Fluid, Fargate 314% more, and EKS 340% more, because most of the provisioned capacity sits idle while still being billed.

Fluid doesn't win every row. At high sustained utilization, provisioned compute is cheaper per active vCPU-hour. What the table shows is that across the utilization levels teams actually run at, Fluid lands in the same range or lower, because it removes idle CPU from the bill.
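One way to locate the crossover is to compute the utilization at which each provisioned option costs the same per active vCPU-hour as Fluid's ceiling rate. The sketch below back-calculates each option's hourly rate from the table's 40% column, so those rates are approximations derived from this guide's rounded figures rather than quoted AWS prices, and `break_even_utilization` is a hypothetical helper name.

```python
FLUID_ALL_IN_RATE = 0.149  # USD per active vCPU-hour, this guide's ceiling rate

# Implied provisioned hourly rates: table value at 40% utilization * 0.40.
# Derived from rounded table figures, so treat them as approximate.
IMPLIED_HOURLY_RATE = {
    "Amazon EC2 (c8i)": 0.117 * 0.40,
    "AWS Fargate (Linux/x86, 1:2)": 0.123 * 0.40,
    "Amazon EKS (Auto Mode, c8i)": 0.131 * 0.40,
}


def break_even_utilization(provisioned_hourly_rate: float,
                           fluid_rate: float = FLUID_ALL_IN_RATE) -> float:
    """Utilization at which provisioned cost per active vCPU-hour equals fluid_rate.

    Solves provisioned_hourly_rate / utilization == fluid_rate for utilization.
    """
    return provisioned_hourly_rate / fluid_rate


for name, rate in IMPLIED_HOURLY_RATE.items():
    threshold = break_even_utilization(rate)
    print(f"{name}: break-even at about {threshold:.0%} CPU utilization")
```

Under these assumptions, the break-even points fall at roughly 31% for EC2, 33% for Fargate, and 35% for EKS. That is consistent with the table: at 40% the provisioned options are cheaper, and at 30% Fluid is.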

## Beyond the per-hour rate: operational cost

The per-hour comparison leaves out operational overhead, which favors Fluid further. Fluid compute is fully managed, so you don't tune autoscaling policies, manage load balancing, or debug the failure modes that come from running too hot (downtime) or too cold (wasted spend). Operating EC2, Fargate, or EKS at a chosen utilization target is itself ongoing engineering work, and the gap between the target and what you achieve is the idle cost captured in the table above.

## Further reading

- Read how [Active CPU pricing](https://vercel.com/blog/introducing-active-cpu-pricing-for-fluid-compute) works and why it reduces costs for long-running I/O-bound and agentic workloads.
- See the [Fluid compute pricing documentation](https://vercel.com/docs/functions/usage-and-pricing) for current active CPU, provisioned memory, and invocation rates.
- Learn how [Fluid compute](https://vercel.com/docs/fluid-compute) uses concurrency and dynamic scaling to reduce the capacity you provision.
