cost_optimize.py
Applies the car-painter KEDA scale-to-zero pattern to your Kubernetes service. Generates keda.tf, http-scaler.yaml, and patches the deployment to set minReplicas: 0. Typical saving: 60–90% compute cost for bursty or low-traffic services.
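As a rough illustration of where a figure in the 60–90% range can come from, the sketch below estimates the saving from the hours a service is actually in use. It assumes compute cost is proportional to pod-hours and ignores cold-start overhead and the idle window before scale-down. The function name and the example workload are hypothetical and not part of cost_optimize.py.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def estimated_saving(active_hours_per_week: float) -> float:
    """Fraction of always-on compute cost avoided by scaling to zero.

    Assumption: cost scales linearly with the hours pods are running.
    """
    if not 0 <= active_hours_per_week <= HOURS_PER_WEEK:
        raise ValueError("active_hours_per_week must be between 0 and 168")
    return 1 - active_hours_per_week / HOURS_PER_WEEK


# Hypothetical workload: in use 8 hours a day on 5 weekdays (40 of 168 hours).
print(f"{estimated_saving(40):.0%}")  # prints 76%
```

Real savings also depend on whether the underlying nodes are released once the pods are gone.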
Generates three artefacts that wire KEDA's HTTP add-on into your service. Pods scale to zero after 5 minutes without traffic and come back up within 60 seconds of the first incoming request.
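For orientation, here is a minimal sketch of what an HTTP scaler manifest with these settings might look like, built as a Python dict and printed as JSON (which kubectl also accepts). It assumes the HTTP add-on's `http.keda.sh/v1alpha1` `HTTPScaledObject` resource. The service name, host, port, and maximum replica count are placeholders, and the actual http-scaler.yaml produced by the script may differ.

```python
import json


def http_scaled_object(name: str, host: str, port: int, max_replicas: int = 5) -> dict:
    """Build an HTTPScaledObject that scales a Deployment between 0 and max_replicas."""
    return {
        "apiVersion": "http.keda.sh/v1alpha1",  # assumed add-on API version
        "kind": "HTTPScaledObject",
        "metadata": {"name": name},
        "spec": {
            "hosts": [host],
            "scaleTargetRef": {
                "name": name,          # Deployment to scale (assumed same name)
                "kind": "Deployment",
                "apiVersion": "apps/v1",
                "service": name,       # Service fronting the Deployment (assumed same name)
                "port": port,
            },
            "replicas": {"min": 0, "max": max_replicas},
            "scaledownPeriod": 300,    # seconds without traffic before scaling to zero
        },
    }


# Placeholder values for illustration only.
manifest = http_scaled_object("my-service", "my-service.example.com", 8080)
print(json.dumps(manifest, indent=2))
```

Check the field names against the add-on version you have installed before applying anything like this to a cluster.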
Not suitable for:
- Stateful services (databases, message stores)
- Message-queue consumers (use KEDA queue scalers instead)
- Services that cannot tolerate a cold start of up to 60 seconds
- Production services whose latency SLOs require pre-warmed replicas
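The exclusions above can be expressed as a simple pre-flight check. This is a hypothetical helper, not something cost_optimize.py is known to ship; the `ServiceProfile` fields are assumptions chosen to mirror the four bullets.

```python
from dataclasses import dataclass

COLD_START_BUDGET_S = 60  # spin-up time quoted for the pattern


@dataclass
class ServiceProfile:
    stateful: bool
    queue_consumer: bool
    cold_start_tolerance_s: float   # longest first-request delay the service can accept
    needs_prewarmed_replicas: bool  # e.g. a production latency SLO


def scale_to_zero_blockers(profile: ServiceProfile) -> list[str]:
    """Return the reasons a service should not be scaled to zero (empty if none)."""
    blockers = []
    if profile.stateful:
        blockers.append("stateful service")
    if profile.queue_consumer:
        blockers.append("message-queue consumer: use a KEDA queue scaler instead")
    if profile.cold_start_tolerance_s < COLD_START_BUDGET_S:
        blockers.append("cannot tolerate a cold start of up to 60 seconds")
    if profile.needs_prewarmed_replicas:
        blockers.append("latency SLO requires pre-warmed replicas")
    return blockers


# Hypothetical internal API that can wait two minutes for its first response.
internal_api = ServiceProfile(False, False, 120, False)
print(scale_to_zero_blockers(internal_api))  # prints []
```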
KEDA removes idle pods. Karpenter removes idle nodes. Use both together for maximum savings: KEDA scales your pods to zero, then Karpenter consolidates and terminates the now-empty nodes automatically.
Minimal NodePool that enables Spot + consolidation:
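A sketch of such a NodePool, built as a Python dict and printed as JSON for `kubectl apply -f -`. It assumes the `karpenter.sh/v1` API (matching the 1.0.0 chart) on AWS and an existing EC2NodeClass named `default`. The names and the `consolidateAfter` value are placeholders to adapt.

```python
import json

node_pool = {
    "apiVersion": "karpenter.sh/v1",
    "kind": "NodePool",
    "metadata": {"name": "default"},  # placeholder name
    "spec": {
        "template": {
            "spec": {
                # Assumes an EC2NodeClass called "default" already exists.
                "nodeClassRef": {
                    "group": "karpenter.k8s.aws",
                    "kind": "EC2NodeClass",
                    "name": "default",
                },
                "requirements": [
                    {
                        # Allow Spot capacity, with on-demand as a fallback.
                        "key": "karpenter.sh/capacity-type",
                        "operator": "In",
                        "values": ["spot", "on-demand"],
                    }
                ],
            }
        },
        "disruption": {
            # Remove empty nodes and repack underutilized ones.
            "consolidationPolicy": "WhenEmptyOrUnderutilized",
            "consolidateAfter": "1m",  # placeholder delay
        },
    },
}

print(json.dumps(node_pool, indent=2))
```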
Install: helm install karpenter oci://public.ecr.aws/karpenter/karpenter --version 1.0.0 -n karpenter
Prefer managed serverless over K8s + KEDA when the workload is stateless HTTP: it typically means lower operational overhead and lower cost.