k8s-hoizontalpodautoscaler-example's Introduction

Experimenting with K8s HorizontalPodAutoscaler

HPA docs · HPA v1 API ref · HPA v2 API ref · kubectl autoscale commands · Minikube docs · CDK8s+ docs

Experimenting with K8s HorizontalPodAutoscaler (HPA) by completing the recommended walkthroughs and logging notes in this README along the way.

🚀 Quick start

  1. Start Minikube with 2 nodes

    minikube start --nodes 2
  2. Apply the metrics server

    kubectl apply -f src/metrics-server.yaml
  3. Apply the PHP apache application

    kubectl apply -f src/php-apache.yaml
  4. Apply the HorizontalPodAutoscaler

    kubectl apply -f src/hpa.yaml
  5. [Open new terminal] Increase the load on the PHP apache application

    kubectl run -i --tty load-generator --rm --image=busybox:1.28 --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
  6. [Open new terminal] Watch the HorizontalPodAutoscaler scale up

    kubectl get hpa -w
  7. Stop the load generator (terminal used in step 5)

    <Ctrl> + C
  8. Watch the HorizontalPodAutoscaler scale down (terminal used in step 6)

➕ Useful commands

View the HPA status

kubectl describe hpa php-apache

📰 Deploy Kubernetes Dashboard

  1. Apply the service

    kubectl apply -f src/dashboard/service.yaml
  2. Apply the admin-user

    kubectl apply -f src/dashboard/admin-user.yaml
  3. Get the token

    kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')
  4. Start the proxy

    kubectl proxy
  5. Open the dashboard

    http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/

🔩 How HPA works

  • The HPA controller periodically queries the metrics API for the current CPU utilization of the pods in the deployment. The default interval is 15 seconds, set by the --horizontal-pod-autoscaler-sync-period flag.

  • The algorithm for scaling is: desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]

    • The control plane skips any scaling action if the ratio is sufficiently close to 1.0 (within a globally-configurable tolerance, 0.1 by default).

    • All Pods with a deletion timestamp set (objects with a deletion timestamp are in the process of being shut down / removed) are ignored, and all failed Pods are discarded.
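
As a rough illustration of the formula and the tolerance check above, here is a minimal Python sketch. The function name `desired_replicas` and its parameters are made up for these notes. The real controller also accounts for unready pods, missing metrics, and the min/max replica bounds, none of which are modelled here.

```python
import math


def desired_replicas(current_replicas, current_metric, desired_metric, tolerance=0.1):
    """Illustrative only: apply the HPA ratio formula with a tolerance band."""
    ratio = current_metric / desired_metric
    # Skip scaling when the ratio is within the tolerance of 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


print(desired_replicas(2, 200, 100))  # ratio 2.0, so the replica count doubles to 4
print(desired_replicas(4, 50, 100))   # ratio 0.5, so the replica count halves to 2
print(desired_replicas(3, 105, 100))  # ratio 1.05 is within tolerance, stays at 3
```

Because of the `ceil`, any ratio outside the tolerance band rounds the replica count up, never down, to the next whole pod.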

📈 Defining metrics on resources

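  • targetAverageValue - target for the raw average value of the metric across pods (field name in autoscaling/v2beta1; autoscaling/v2 uses target.averageValue).
  • targetAverageUtilization - target for the average utilization across pods, as a percentage of the requested resource (field name in autoscaling/v2beta1; autoscaling/v2 uses target.averageUtilization).
  • averageUtilization - Utilization is the ratio of a pod's current resource usage to its requested resources.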
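
To make the utilization definition above concrete, here is a small Python sketch. The helper name `utilization_percent` and the sample numbers (a pod requesting 200m CPU) are assumptions for illustration, not values taken from the manifests in this repo.

```python
def utilization_percent(usage_millicores, request_millicores):
    """Illustrative only: usage expressed as a percentage of the requested amount."""
    if request_millicores <= 0:
        raise ValueError("a positive resource request is needed to compute utilization")
    return 100 * usage_millicores / request_millicores


# A pod requesting 200m CPU and currently using 100m sits at 50% utilization.
print(utilization_percent(100, 200))  # 50.0
# Usage can exceed the request, giving a utilization above 100%.
print(utilization_percent(300, 200))  # 150.0
```

The guard on the request reflects that utilization is undefined without a resource request to divide by.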

🚩 HPA flags

  • --horizontal-pod-autoscaler-initial-readiness-delay - default is 30 seconds - the period after pod start during which readiness changes are treated as initial readiness; used when determining whether to set aside certain CPU metrics.

  • --horizontal-pod-autoscaler-cpu-initialization-period - default is 5 minutes - once a pod has become ready, any transition to ready is considered the first if it occurred within this configurable time since the pod started.

  • --horizontal-pod-autoscaler-downscale-stabilization - default is 5 minutes - The period since the last downscale, before another downscale can be performed in response to a new scale event.

✅ Requirements

  • API object names should follow the same constraints as DNS subdomain names:
    • contain no more than 253 characters
    • contain only lowercase alphanumeric characters, '-' or '.'
    • start with an alphanumeric character
    • end with an alphanumeric character
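
The four constraints above can be checked with a short Python sketch. The helper `is_valid_name` is hypothetical and encodes only the rules listed here; Kubernetes' own validation may be stricter.

```python
import re

# Encodes only the listed rules: lowercase alphanumerics, '-' or '.',
# with an alphanumeric first and last character.
_NAME_PATTERN = re.compile(r"[a-z0-9]([a-z0-9.-]*[a-z0-9])?")


def is_valid_name(name):
    """Return True if `name` satisfies the four constraints listed above."""
    return len(name) <= 253 and _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_name("php-apache"))   # True
print(is_valid_name("PHP-Apache"))   # False: uppercase characters
print(is_valid_name("-php-apache"))  # False: starts with '-'
```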

📛 Changing HPA's target resource names

This can be done in the following way:

  1. Add new name to the HPA target config.
  2. Change the resource name
  3. Remove the old name from the HPA target config.

🆕 Autoscaling v2

  • Supports custom metrics
  • Specify multiple metrics to scale on.
  • Allows setting a behavior for scaling up and down.
  • See status conditions via kubectl describe hpa <name> docs
    • AbleToScale - Indicates whether or not the HPA is able to fetch and update scales, as well as whether or not any backoff-related conditions would prevent scaling.
    • ScalingActive - Indicates whether or not the HPA is enabled (i.e. the replica count of the target is not zero) and is able to calculate desired scales.
    • ScalingLimited - Indicates that the desired scale was capped by the maximum or minimum of the HorizontalPodAutoscaler

🚥 Pod conditions

Useful to know since HPA scales depending on pod readiness. Docs

  • PodScheduled - the Pod has been scheduled to a node.
  • PodHasNetwork - (alpha feature; must be enabled explicitly) the Pod sandbox has been successfully created and networking configured.
  • ContainersReady - all containers in the Pod are ready.
  • Initialized - all init containers have completed successfully.
  • Ready - the Pod is able to serve requests and should be added to the load balancing pools of all matching Services.

🔎 Support for metrics APIs

By default, the HorizontalPodAutoscaler controller retrieves metrics from a series of APIs. In order for it to access these APIs, cluster administrators must ensure that:

  • The API aggregation layer is enabled.

  • The corresponding APIs are registered:

    • For resource metrics, this is the metrics.k8s.io API, generally provided by metrics-server. It can be launched as a cluster add-on.
    • For custom metrics, this is the custom.metrics.k8s.io API. It's provided by "adapter" API servers provided by metrics solution vendors. Check with your metrics pipeline to see if there is a Kubernetes metrics adapter available. See boilerplate to get started
    • For external metrics, this is the external.metrics.k8s.io API. It may be provided by the custom metrics adapters provided above.

🔑 Aggregation layer

Configuring the aggregation layer allows the Kubernetes apiserver to be extended with additional APIs, which are not part of the core Kubernetes APIs. Docs

Note, I was not required to configure this for the metrics-server to work. Instead I disabled the TLS validation by adding a command to the container spec:

    command:
      - /metrics-server
      - --kubelet-insecure-tls
      - --kubelet-preferred-address-types=InternalIP

⚖️ Quantities

All metrics in the HorizontalPodAutoscaler and metrics APIs are specified using a special whole-number notation known in Kubernetes as a quantity. For example, the quantity 10500m would be written as 10.5 in decimal notation. The metrics APIs will return whole numbers without a suffix when possible, and will generally return quantities in milli-units otherwise. This means you might see your metric value fluctuate between 1 and 1500m, or 1 and 1.5 when written in decimal notation.
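
A minimal Python sketch of the conversion described above, assuming only plain numbers and the milli (`m`) suffix need handling. The helper name `quantity_to_decimal` is made up for these notes, and other quantity suffixes are deliberately not covered.

```python
from decimal import Decimal


def quantity_to_decimal(quantity):
    """Illustrative only: handles plain numbers and the milli ('m') suffix."""
    if quantity.endswith("m"):
        # Milli-units: divide by 1000 to get the decimal value.
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


print(quantity_to_decimal("10500m"))  # 10.5
print(quantity_to_decimal("1500m"))   # 1.5
print(quantity_to_decimal("1"))       # 1
```

`Decimal` is used instead of `float` so that the converted values compare exactly.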

💡 Possible APIs

We will need an API to create the following:

  • HorizontalPodAutoscaler resource - The HPA object
  • Metric Enum - The metric to scale on
  • Scaling Policy construct - The scaling policy object (used in autoscaling/v2's behavior field)
  • Possibly add a maintenanceMode option to Pod/Container resources (to prevent scaling on them). This would be useful for pods that are used for maintenance tasks (e.g. database migrations). See Implicit maintenance-mode deactivation docs

⬆️ Migrating to HPA

Migrating Deployments and StatefulSets to horizontal autoscaling docs - When an HPA is enabled, it is recommended that the value of spec.replicas of the Deployment and / or StatefulSet be removed from their manifest(s). If this isn't done, any time a change to that object is applied, for example via kubectl apply -f deployment.yaml, this will instruct Kubernetes to scale the current number of Pods to the value of the spec.replicas key. This may not be desired and could be troublesome when an HPA is active.

❓ Questions

  • Should we be focused on v2 or v1 of the HPA API?
