This repository contains a minimal proof-of-concept (POC) implementation of a node-pool scaling scheduler using GKE node auto-provisioning and Kubernetes cron jobs.
This POC scales the GPU-accelerated node pool down to ZERO at 5pm (NZST) and brings it back to its normal state (2 nodes) at 8am (NZST). The goal of this solution is to save cost by turning off the GPU nodes after hours.
This could also be implemented with Compute Engine scaling schedules [1], but this solution is more Kubernetes-friendly.
Node auto-provisioning is a GKE feature that automatically provisions nodes and manages node pools, scaling nodes to meet the resource requirements of the workloads [2].
A CronJob is a Kubernetes resource that runs a Job repeatedly on a schedule defined by a Cron-format string [3][4].
- Terragrunt (v0.35.10 and above)
- kubectl
cd ./terragrunt-src/non-prod/dev
terragrunt run-all apply
This will do the following:
- create a new GKE cluster without default node pool
- create a new default node pool with 3 nodes
- create a GPU node pool with an nvidia-tesla-p4 accelerator in zone australia-southeast1-b
  - the minimum node size is ZERO
  - the maximum node size is 5 nodes
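For reference, the GPU node pool configured by Terragrunt is roughly equivalent to the following gcloud command. This is only a sketch: the accelerator type, zone, and autoscaling bounds come from this POC, while the cluster name, pool name, and machine type are assumptions.

```shell
# Sketch of an equivalent gcloud command for the GPU node pool.
# Cluster name, pool name, and machine type are assumptions.
gcloud container node-pools create gpu-pool \
  --cluster=demo-cluster \
  --region=australia-southeast1 \
  --node-locations=australia-southeast1-b \
  --machine-type=n1-standard-4 \
  --accelerator=type=nvidia-tesla-p4,count=1 \
  --enable-autoscaling \
  --min-nodes=0 \
  --max-nodes=5
```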
cd ./app-demo
kubectl apply -f deployment.yaml
This creates a deployment in the cluster with ZERO replicas by default.
cd ./cronjob
kubectl apply -f cron-job.yaml
This will:
- create a new service account with cluster-admin permissions
- create two cron jobs:
  - gpu-service-up-cronjob scales the deployment up to 2 replicas
  - gpu-service-down-cronjob scales the deployment down to ZERO replicas
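An imperative sketch of what cron-job.yaml declares is shown below. The job names and deployment name come from this POC; the service account name and kubectl image are assumptions, and the schedules use UTC (8am NZST is 20:00 UTC of the previous day; 5pm NZST is 05:00 UTC).

```shell
# Imperative sketch of cron-job.yaml (service account name and
# image are assumptions; job and deployment names come from the POC).
kubectl create serviceaccount cron-scaler
kubectl create clusterrolebinding cron-scaler-admin \
  --clusterrole=cluster-admin \
  --serviceaccount=default:cron-scaler

# 8am NZST == 20:00 UTC (previous day); 5pm NZST == 05:00 UTC.
kubectl create cronjob gpu-service-up-cronjob \
  --image=bitnami/kubectl:latest \
  --schedule="0 20 * * *" \
  -- kubectl scale deployment.apps/api-demo-v3 --replicas=2
kubectl create cronjob gpu-service-down-cronjob \
  --image=bitnami/kubectl:latest \
  --schedule="0 5 * * *" \
  -- kubectl scale deployment.apps/api-demo-v3 --replicas=0
```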
You can manually scale the application's replica count up and down, which triggers the node auto-provisioning feature to provision new nodes or remove idle ones from the pool.
# scale up
kubectl scale deployment.apps/api-demo-v3 --replicas=2
# scale down
kubectl scale deployment.apps/api-demo-v3 --replicas=0
- Before scale up
- During scale up
- After scale up
- Before scale down
- During scale down
- After scale down
- No Google Cloud region has GPUs in all zones, so we have to create the node pool in a single zone that supports the chosen accelerator type. https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#gpu_regional_cluster
- To enable auto-provisioning with GPUs, we need to install NVIDIA's device drivers on the nodes. https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#installing_drivers
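Per the linked GKE docs, the drivers can be installed with a DaemonSet. This assumes the nodes use Container-Optimized OS images; Ubuntu node images use a different manifest.

```shell
# Install NVIDIA device drivers via the installer DaemonSet from the
# GKE docs. Assumes Container-Optimized OS (COS) node images.
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/cos/daemonset-preloaded.yaml
```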
- The `CRON_TZ=<timezone>` prefix is not available until Kubernetes version 1.22, and the latest GKE version is currently 1.21.5.
- https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#cron-schedule-syntax
- https://stackoverflow.com/questions/68950893/how-can-i-specify-cron-timezone-in-k8s-cron-job
So, you have to use the UTC timezone with the current GKE version.
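Converting the schedule to UTC is a fixed offset, since NZST is UTC+12 (note this sketch ignores daylight saving, when New Zealand is on NZDT, UTC+13):

```shell
# NZST is UTC+12, so shift the desired local hour back by 12
# (modulo 24) when writing the UTC cron schedule.
UP_HOUR_NZST=8     # bring the service back at 8am NZST
DOWN_HOUR_NZST=17  # shut it down at 5pm NZST
UP_HOUR_UTC=$(( (UP_HOUR_NZST + 24 - 12) % 24 ))
DOWN_HOUR_UTC=$(( (DOWN_HOUR_NZST + 24 - 12) % 24 ))
# 8am NZST falls on the previous day in UTC.
echo "scale-up schedule (UTC):   0 $UP_HOUR_UTC * * *"   # 0 20 * * *
echo "scale-down schedule (UTC): 0 $DOWN_HOUR_UTC * * *" # 0 5 * * *
```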
- [1] https://cloud.google.com/compute/docs/autoscaler/scaling-schedules
- [2] https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-provisioning
- [3] https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
- [4] https://en.wikipedia.org/wiki/Cron
See the LICENSE file.