Karpenter Node Autoprovisioning: Cut Kubernetes Costs 40%+
Ever watched your Kubernetes cluster spin up m5.24xlarge instances for pods that need 500m CPU? Or waited 3 minutes for Cluster Autoscaler to provision a node while your deployment sat pending? If you're running EKS at any real scale, you've felt these pain points. Karpenter fixes both—and typically cuts compute costs 30-60% in the process.
I migrated a 200-node production cluster from Cluster Autoscaler to Karpenter last quarter. Monthly EC2 bill dropped from $47K to $28K. Here's exactly how Karpenter works and how to implement it without torching your production environment.
Why Cluster Autoscaler Falls Short
Cluster Autoscaler was designed for a simpler time. It works with node groups—predefined templates that specify exact instance types. You want m5.large nodes? Create a node group. Need c5.2xlarge for compute workloads? Another node group. GPU instances? You guessed it.
This creates three expensive problems:
Bin-packing inefficiency: When a pod needs 3GB RAM, Cluster Autoscaler might provision a 16GB node because that's what the node group specifies. You're paying for 13GB of unused memory.
Scaling latency: Cluster Autoscaler scans for pending pods every 10-15 seconds, then bumps the Auto Scaling Group's desired capacity and waits for the node to register. Real-world time from pod pending to running: 2-4 minutes.
Operational overhead: Managing 15+ node groups, keeping AMIs updated, coordinating instance type availability across AZs—it's a full-time job nobody wants.
Karpenter takes a different approach. No node groups. You define constraints (instance families, sizes, capacity types), and Karpenter provisions the exact right node for pending pods in real-time.
Karpenter Architecture in 60 Seconds
Karpenter runs as a deployment in your cluster with two core components:
The controller watches for pending pods and nodes, making provisioning and deprovisioning decisions. The webhook validates and mutates Karpenter custom resources.
When pods go pending, Karpenter:
- Groups pending pods by scheduling constraints
- Runs a bin-packing simulation across all available instance types
- Selects the cheapest instance(s) that fit the workload
- Calls EC2 directly (not ASG) to launch instances
- Waits for the node to join the cluster, typically within 45-90 seconds
That direct EC2 integration is key. No Auto Scaling Group overhead, no launch templates to hand-manage. Karpenter sends one fleet request and you get a node.
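Once Karpenter is installed (next section), you can watch this loop directly: every provisioning decision shows up as a NodeClaim resource that records the instance type and capacity type Karpenter picked. A quick way to observe it:

# Watch Karpenter turn pending pods into nodes via NodeClaims
kubectl get nodeclaims -o wide -w

# Dig into a single decision (replace the name with one from the list above)
kubectl describe nodeclaim <nodeclaim-name>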
Installing Karpenter on EKS
Prerequisites: EKS 1.25+, Helm 3.x, and IAM permissions for EC2, IAM, and SSM.
First, set up the IAM roles. Karpenter needs permission to create instances, pass roles, and read SSM parameters for AMI discovery:
export KARPENTER_VERSION="0.37.0"
export CLUSTER_NAME="production-eks"
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export KARPENTER_IAM_ROLE_ARN="arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterControllerRole-${CLUSTER_NAME}"
# Create the IAM role (using eksctl for brevity—Terraform works too)
# Assumes the KarpenterControllerPolicy already exists in the account (create it from Karpenter's getting-started guide or your IaC)
eksctl create iamserviceaccount \
  --cluster="${CLUSTER_NAME}" \
  --name=karpenter \
  --namespace=kube-system \
  --role-name="KarpenterControllerRole-${CLUSTER_NAME}" \
  --attach-policy-arn="arn:aws:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy" \
  --override-existing-serviceaccounts \
  --approve
# Install Karpenter via Helm
# serviceAccount.create=false reuses the service account eksctl created above
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "${KARPENTER_VERSION}" \
  --namespace kube-system \
  --set serviceAccount.create=false \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait
The interruption queue handles Spot termination notices. Don't skip it if you're using Spot instances.
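Before moving on, sanity-check the install; the controller pod should be Running and its recent logs free of IAM or queue errors:

kubectl get pods -n kube-system -l app.kubernetes.io/name=karpenter
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter -c controller --tail=50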
Configuring NodePools and EC2NodeClasses
Karpenter uses two CRDs: NodePool defines what workloads can run on provisioned nodes, and EC2NodeClass defines how to configure the underlying EC2 instances.
Here's a production-ready configuration that prioritizes Spot instances, allows fallback to On-Demand, and restricts to current-gen instance families:
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["5"]
        - key: karpenter.k8s.aws/instance-size
          operator: In
          values: ["medium", "large", "xlarge", "2xlarge", "4xlarge"]
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
  limits:
    cpu: 1000
    memory: 2000Gi
  disruption:
    consolidationPolicy: WhenUnderutilized
    consolidateAfter: 30s
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2023
  role: "KarpenterNodeRole-production-eks"  # node IAM role is required; this name follows the common KarpenterNodeRole-<cluster> convention
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "production-eks"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "production-eks"
  instanceStorePolicy: RAID0
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        iops: 3000
        throughput: 125
        deleteOnTermination: true
  tags:
    Environment: production
    ManagedBy: karpenter
Key configuration choices here:
- Instance generation > 5: Restricts provisioning to current-generation families (gen 6 and newer), excluding older types like c4/m4 as well as c5/m5, which deliver worse price-performance. Relax it to > 4 if you want gen-5 capacity pools in the mix.
- Instance categories c/m/r: Compute, general purpose, and memory-optimized families cover 95% of workloads
- consolidationPolicy: WhenUnderutilized: Karpenter will actively move pods and terminate underused nodes
- consolidateAfter: 30s: Aggressive consolidation—adjust to 5m for less churn in sensitive environments
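Apply both manifests, then prove the loop works end to end with a throwaway deployment that requests more CPU than the cluster has spare. This mirrors the "inflate" test from Karpenter's getting-started guide; the name, replica count, and request size are arbitrary:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate                # disposable test workload
spec:
  replicas: 5
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: "1"         # 5 replicas x 1 vCPU forces a scale-up

Within a couple of minutes you should see a new node appear; scale the deployment back to zero and consolidation will reclaim it.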
Spot Integration and Interruption Handling
Karpenter handles Spot brilliantly. By allowing many instance types and sizes, you give EC2 Fleet's price-capacity-optimized allocation strategy more capacity pools to choose from, which lowers both interruption rates and price.
Create an SQS queue for interruption handling:
aws sqs create-queue --queue-name "${CLUSTER_NAME}" --attributes '{
  "MessageRetentionPeriod": "300",
  "Policy": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"Service\":[\"events.amazonaws.com\",\"sqs.amazonaws.com\"]},\"Action\":\"sqs:SendMessage\",\"Resource\":\"arn:aws:sqs:us-east-1:'${AWS_ACCOUNT_ID}':'${CLUSTER_NAME}'\"}]}"
}'
# Create EventBridge rules for Spot interruptions and rebalance recommendations
aws events put-rule \
  --name "${CLUSTER_NAME}-spot-interruption" \
  --event-pattern '{"source":["aws.ec2"],"detail-type":["EC2 Spot Instance Interruption Warning"]}'

aws events put-targets \
  --rule "${CLUSTER_NAME}-spot-interruption" \
  --targets "Id"="1","Arn"="arn:aws:sqs:us-east-1:${AWS_ACCOUNT_ID}:${CLUSTER_NAME}"
When AWS sends a 2-minute Spot interruption notice, Karpenter cordons the node, drains pods gracefully, and they're rescheduled on new capacity before termination.
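Karpenter respects PodDisruptionBudgets during those drains, so give anything latency-sensitive that runs on Spot one. A minimal sketch, assuming a hypothetical checkout deployment with three or more replicas:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: checkout-pdb          # hypothetical name
spec:
  minAvailable: 2             # never drain below two ready pods
  selector:
    matchLabels:
      app: checkout           # hypothetical label; match your own workload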
Measuring Real Cost Savings
After running Karpenter for 30 days, here's how I measured impact:
Before (Cluster Autoscaler):
- Average node utilization: 34%
- Spot usage: 45% (manually managed node groups)
- Monthly compute: $47,200
After (Karpenter):
- Average node utilization: 71%
- Spot usage: 78% (automatic with fallback)
- Monthly compute: $28,100
The utilization jump comes from right-sizing. For a pod requesting 3 vCPU, Karpenter provisions an xlarge instance instead of the 4xlarge that the old node group template dictated.
Use these commands to monitor your migration:
# Tail Karpenter's provisioning and disruption decisions (log wording varies by version)
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter -c controller --tail=500 | grep -iE "provision|launch|disrupt|consolidat"
# View current node composition
kubectl get nodes -L karpenter.sh/capacity-type,node.kubernetes.io/instance-type,karpenter.sh/nodepool
# Compare allocatable vs. total CPU per node (use kubectl top nodes for live utilization)
kubectl get nodes -L karpenter.sh/nodepool -o custom-columns=NAME:.metadata.name,ALLOCATABLE:.status.allocatable.cpu,CAPACITY:.status.capacity.cpu
Migration Path from Cluster Autoscaler
Don't rip and replace. Run both systems in parallel:
- Install Karpenter with a NodePool that carries a unique taint (sketched after this list)
- Deploy non-critical workloads with tolerations for that taint
- Monitor for a week—watch provisioning times, Spot interruption handling, consolidation behavior
- Gradually migrate workloads by removing taints from the NodePool
- Scale down Cluster Autoscaler node groups as utilization drops
- Delete Cluster Autoscaler when all workloads run on Karpenter nodes
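A sketch of steps 1 and 2. The pilot NodePool name and the karpenter-pilot taint key are arbitrary placeholders; requirements and limits follow the same shape as the default NodePool shown earlier:

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: pilot                   # trial pool, separate from the default NodePool
spec:
  template:
    spec:
      taints:
        - key: karpenter-pilot  # only workloads that tolerate this land here
          value: "true"
          effect: NoSchedule
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default

On the non-critical workloads you migrate first, add the matching toleration under the pod spec:

tolerations:
  - key: karpenter-pilot
    operator: Equal
    value: "true"
    effect: NoSchedule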
The parallel approach costs a few extra nodes for a week. That's nothing compared to the risk of a botched migration during peak traffic.
Tag your Karpenter-provisioned nodes and set up a CloudWatch dashboard showing cost by tag. You'll see the crossover point clearly—usually within the first week.
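For the ManagedBy tag from the EC2NodeClass to show up in cost reports, it has to be activated as a cost allocation tag first; one way to do that (assuming Cost Explorer permissions, and allowing up to 24 hours for data to appear) is:

aws ce update-cost-allocation-tags-status \
  --cost-allocation-tags-status TagKey=ManagedBy,Status=Active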
Start with your dev cluster tomorrow. The 20-minute installation pays back in hours of saved node-group wrangling, and the savings show up on the very first billing cycle.
Written by GeekOnCloud
DevOps & Infrastructure engineer at geekoncloud.com