
azure-k8s-metrics-adapter's Introduction

🚧 ⚠️ This project was an exploration of Azure integration with the Kubernetes HPA and has been in Alpha status. It was a success and helped inspire other projects in this solution space. It is now in maintenance mode and will not be getting any new updates, given that KEDA (a CNCF Sandbox project) has all the features of this solution. Please check out KEDA's Scalers for Service Bus Subscriptions and Queues, Azure Monitor, and more. Thanks for all your support and contributions 🎉

Azure Kubernetes Metrics Adapter

An implementation of the Kubernetes Custom Metrics API and External Metrics API for Azure Services.

This adapter enables you to scale your application deployment pods running on AKS using the Horizontal Pod Autoscaler (HPA) with External Metrics from Azure Resources (such as Service Bus Queues) and Custom Metrics stored in Application Insights.

Try it out:

This was built using the Custom Metric Adapter Server Boilerplate project. Learn more about using an HPA to autoscale with external and custom metrics.

Project Status: Alpha

Walkthrough

Try out scaling with External Metrics using an Azure Service Bus Queue in this walkthrough.

Try out scaling with Custom Metrics using Requests per Second and Application Insights in this walkthrough.

Quick-Start Deploy

This describes the basic steps for deploying the metric adapter. For full deployment details, see how to set it up on your AKS cluster and check out the samples. Make sure the Metrics Server is already deployed to your cluster.

Create a Service Principal and Secret:

az ad sp create-for-rbac -n "azure-k8s-metric-adapter-sp" --role "Monitoring Reader" --scopes /subscriptions/{SubID}/resourceGroups/{ResourceGroup1}

# use values from the service principal created above to create the secret
kubectl create secret generic azure-k8s-metrics-adapter -n custom-metrics \
  --from-literal=azure-tenant-id=<tenantid> \
  --from-literal=azure-client-id=<clientid>  \
  --from-literal=azure-client-secret=<secret>

Deploy the adapter:

kubectl apply -f https://raw.githubusercontent.com/Azure/azure-k8s-metrics-adapter/master/deploy/adapter.yaml

Deploy a metric configuration (requires you to edit the file below with your Service Bus Queue settings; a sketch of its contents follows the command):

kubectl apply -f https://raw.githubusercontent.com/Azure/azure-k8s-metrics-adapter/master/samples/resources/externalmetric-example.yaml
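
For reference, the configuration file defines an ExternalMetric resource along these lines (a sketch based on the ExternalMetric examples shown later on this page; the resource group, Service Bus namespace, and queue name are placeholders you must replace):

apiVersion: azure.com/v1alpha2
kind: ExternalMetric
metadata:
  name: queuemessages
spec:
  type: azuremonitor
  azure:
    resourceGroup: <your-resource-group>       # placeholder
    resourceName: <your-servicebus-namespace>  # placeholder
    resourceProviderNamespace: Microsoft.ServiceBus
    resourceType: namespaces
  metric:
    metricName: Messages
    aggregation: Total
    filter: EntityName eq '<your-queue-name>'  # placeholder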

There is also a Helm chart available for deployment for those using Helm in their cluster.

Deploy a Horizontal Pod Autoscaler (HPA) to scale off your external metric of choice:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: consumer-scaler
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: queuemessages
      targetValue: 30

Check out the samples for more examples and details.

Verifying the deployment

You can also query the API to verify it is installed:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" | jq .

To query for a specific custom metric:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/test/pods/*/custom-metric" | jq .

To query for a specific external metric:

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/test/queuemessages" | jq .

External Metrics

Requires k8s 1.10+

See the full list of hundreds of available Azure external metrics that can be used.

Common external metrics for autoscaling include Azure Service Bus queue message counts, as used in the walkthrough above.

Custom Metrics

Custom metrics are currently retrieved from Application Insights. View the list of basic metrics that come out of the box, and see sample values in the AI API Explorer.

Common Custom Metrics are:

  • Requests per Second (RPS) - example (see the HPA sketch below)
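
As a minimal sketch, an HPA can consume that metric as a pods-type custom metric (autoscaling/v2beta2 syntax; the deployment name and target value here are illustrative, and the Application Insights name performanceCounters/requestsPerSecond is flattened to performanceCounters-requestsPerSecond as described in the issues below):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: rps-scaler           # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: rps-sample         # illustrative target deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: performanceCounters-requestsPerSecond
      target:
        type: AverageValue
        averageValue: "10"   # illustrative target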

Azure Setup

Security

Authenticating with Azure Monitor can be achieved via a variety of mechanisms (full list).

Use one of the following options: Azure AD Pod Identity, an Azure AD Application ID and Secret, or an Azure AD Application ID and X.509 Certificate. Each is described below.

Whichever you choose, the Azure AD entity needs to have Monitoring Reader permission on the resource group that will be queried. More information can be found here.

Using Azure AD Pod Identity

aad-pod-identity is currently in beta and allows you to bind a user-assigned managed identity or a service principal to a pod. This means that instead of using the same managed identity for all the pods running on a node, as explained above, you can give a specific identity with specific RBAC to a specific pod.

Using this project requires you to deploy a bit of infrastructure first. You can do so by following the Get Started page of the project.

Once the aad-pod-identity infrastructure is running, you need to create an Azure identity scoped to the resource group you are monitoring:

az identity create -g {ResourceGroup1} -n custom-metrics-identity

Assign the Monitoring Reader role to it:

az role assignment create --role "Monitoring Reader" --assignee <principalId> --scope /subscriptions/{SubID}/resourceGroups/{ResourceGroup1}

Note: you need to assign this role on every resource group from which you want the identity to read Azure Monitor data.

As documented here, aad-pod-identity uses the service principal of your Kubernetes cluster to access Azure resources. You need to give this service principal the right to use the managed identity created above:

az role assignment create --role "Managed Identity Operator" --assignee <servicePrincipalId> --scope /subscriptions/{SubID}/resourceGroups/{ResourceGroup1}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/custom-metrics-identity

Install the Azure Identity to your Kubernetes cluster:

apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentity
metadata:
  name: custom-metrics-identity
spec:
  type: 0
  ResourceID: /subscriptions/{SubID}/resourceGroups/{ResourceGroup1}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/custom-metrics-identity
  ClientID: <clientid>

Install the Azure Identity Binding on your Kubernetes cluster:

apiVersion: "aadpodidentity.k8s.io/v1"
kind: AzureIdentityBinding
metadata:
  name: custom-metrics-identity-binding
spec:
  AzureIdentity: custom-metrics-identity
  Selector: custom-metrics-identity

Note: pay attention to the name of the selector above. You will need it to bind the identity to your pod, as sketched below.
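
With a manual (non-Helm) deployment, the binding is made by adding the aadpodidbinding label (the label key used by the aad-pod-identity project) to the adapter's pod template, set to the Selector value above. A sketch only; the pod name and image are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: metrics-adapter-example               # illustrative; in practice this label goes on the adapter Deployment's pod template
  labels:
    aadpodidbinding: custom-metrics-identity  # must match the Selector in the AzureIdentityBinding
spec:
  containers:
  - name: adapter
    image: <adapter-image>                    # placeholder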

If you use the Helm Chart to deploy the custom metrics adapter to your Kubernetes cluster, you can configure Azure AD Pod Identity directly in the config values:

azureAuthentication:
  method: aadPodIdentity
  # if you use aadPodIdentity authentication
  azureIdentityName: "custom-metrics-identity"
  azureIdentityBindingName: "custom-metrics-identity-binding"
  # The full Azure resource id of the managed identity (/subscriptions/{SubID}/resourceGroups/{ResourceGroup1}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{IdentityName})
  azureIdentityResourceId: ""
  # The Client Id of the managed identity
  azureIdentityClientId: ""

Switch the method to aadPodIdentity and give the values for the Azure Identity resource ID and client ID, for example:

helm install ./charts/azure-k8s-metrics-adapter --set azureAuthentication.method="aadPodIdentity" \
  --set azureAuthentication.azureIdentityResourceId="/subscriptions/{SubID}/resourceGroups/{ResourceGroup1}/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{IdentityName}" \
  --set azureAuthentication.azureIdentityClientId="{ClientId}" \
  --name "custom-metrics-adapter"

Using Azure AD Application ID and Secret

See how to create an example deployment.

Create a service principal scoped to the resource group of the resource you are monitoring, and assign the Monitoring Reader role to it:

az ad sp create-for-rbac -n "adapter-sp" --role "Monitoring Reader" --scopes /subscriptions/{SubID}/resourceGroups/{ResourceGroup1}

Required environment variables:

  • AZURE_TENANT_ID: Specifies the Tenant to which to authenticate.
  • AZURE_CLIENT_ID: Specifies the app client ID to use.
  • AZURE_CLIENT_SECRET: Specifies the app secret to use.

Deploy the environment variables via secret:

 kubectl create secret generic azure-k8s-metrics-adapter -n custom-metrics \
  --from-literal=azure-tenant-id=<tenantid> \
  --from-literal=azure-client-id=<clientid>  \
  --from-literal=azure-client-secret=<secret>
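
The adapter deployment then maps those secret keys to the required environment variables. A sketch of the relevant container env section (the actual layout in deploy/adapter.yaml may differ):

env:
- name: AZURE_TENANT_ID
  valueFrom:
    secretKeyRef:
      name: azure-k8s-metrics-adapter
      key: azure-tenant-id
- name: AZURE_CLIENT_ID
  valueFrom:
    secretKeyRef:
      name: azure-k8s-metrics-adapter
      key: azure-client-id
- name: AZURE_CLIENT_SECRET
  valueFrom:
    secretKeyRef:
      name: azure-k8s-metrics-adapter
      key: azure-client-secret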

Azure AD Application ID and X.509 Certificate

Required environment variables:

  • AZURE_TENANT_ID: Specifies the Tenant to which to authenticate.
  • AZURE_CLIENT_ID: Specifies the app client ID to use.
  • AZURE_CERTIFICATE_PATH: Specifies the certificate Path to use.
  • AZURE_CERTIFICATE_PASSWORD: Specifies the certificate password to use.
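
One way to supply these is to store the certificate in a Kubernetes secret, mount it into the adapter pod, and point the variables at the mount. This is a sketch only; the secret name, key names, and paths below are illustrative, not part of the adapter's documented setup:

# fragment of the adapter's pod spec; env and volumeMounts go on the container, volumes on the pod
env:
- name: AZURE_CERTIFICATE_PATH
  value: /etc/adapter/certs/sp-cert.pfx  # illustrative path
- name: AZURE_CERTIFICATE_PASSWORD
  valueFrom:
    secretKeyRef:
      name: adapter-sp-cert              # hypothetical secret holding the certificate and its password
      key: certificate-password
volumeMounts:
- name: sp-cert
  mountPath: /etc/adapter/certs
  readOnly: true
volumes:
- name: sp-cert
  secret:
    secretName: adapter-sp-cert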

Subscription Information

To use the adapter, your Azure subscription ID must be provided. There are a few ways to supply it:

  • Azure Instance Metadata - If you are running the adapter on a VM in Azure (for instance in an AKS cluster), there is nothing you need to do: the subscription ID is picked up automatically from the Azure Instance Metadata endpoint.
  • Environment variable - If you are outside of Azure, or want full control over the subscription that is used, you can set the environment variable SUBSCRIPTION_ID on the adapter deployment. This takes precedence over Azure Instance Metadata.
  • On each HPA - You can work with multiple subscriptions by supplying the metric selector subscriptionID on each HPA, as sketched below. This overrides the environment variable and Azure Instance Metadata settings.
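
For the per-HPA option, the subscription is supplied as a label in the metric selector. A sketch based on the metricSelector syntax shown later on this page; the subscription ID is a placeholder:

metrics:
- type: External
  external:
    metricName: queuemessages
    metricSelector:
      matchLabels:
        subscriptionID: <your-subscription-id>  # overrides the env var and instance metadata
    targetValue: 30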

FAQ

  • Can I scale with Azure Storage queues?
  • The metric numbers look slightly off compared to the portal or Service Bus Explorer. Why are the values not exact?
    • Azure Monitor has a delay (30s - 2 mins) in reported values. This delay can also be seen in the Azure Monitor dashboard in the portal. There is also a delay in the values reported when using Application Insights.

Contributing

See Contributing for more information.

Issues

Report any issues in the Github issues.

Roadmap

See the Projects tab for current roadmap.

Reporting Security Issues

Security issues and bugs should be reported privately, via email, to the Microsoft Security Response Center (MSRC) at secure@microsoft.com. You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Further information, including the MSRC PGP key, can be found in the Security TechCenter.

azure-k8s-metrics-adapter's People

Contributors

akshaysngupta, avik-so, bgpat, billpratt, jcorioland, jsturtevant, knee-berts, lee0c, marc-sensenich, nschonni, tomkerkhove, toyota790, worldspawn


azure-k8s-metrics-adapter's Issues

External metric not working for Eventhubs

Describe the bug
I've set up the metric adapter successfully and have established external metrics for services such as Service Bus and Application Gateway. However, when I tried to set one up for Event Hubs with the metric "IncomingMessages", I was shown the error "Error from server (ServiceUnavailable): the server is currently unable to handle the request".

Below is my external metric YAML.

apiVersion: azure.com/v1alpha2
kind: ExternalMetric
metadata:
  name: myextmetric
  namespace: mynamespace
spec:
  type: azuremonitor
  azure:
    resourceGroup: myrg
    resourceName: evhubnamespacename
    resourceProviderNamespace: Microsoft.EventHub
    resourceType: namespaces
  metric:
    metricName: IncomingMessages
    aggregation: Total
    filter: EntityName eq 'customname'

Error: deployment of azureexternalmetric instead of actual deployments

Describe the bug
We are attempting to use horizontal pod autoscaling in our cluster. The cluster autoscaler YAML is already deployed to our cluster, but not through Helm. However, the HorizontalPodAutoscaler YAMLs are in our charts folder and are deployed through Helm into the same namespace the normal deployments should be in. In addition, I have just deployed the Azure metrics adapter by cloning the repo and running helm upgrade on it. However, what seems to be happening is that instead of the deployments getting pushed to the cluster, the Azure metrics adapter pushes its own deployment with the name of the namespace I'm trying to push all the deployments to, and then I don't see my deployments pushed.

To Reproduce
Have a cluster that has the cluster autoscaler.
Run helm upgrade charts/azure-k8s-metrics-adapter with proper values for auth etc. in namespace custom-metrics.
Run helm upgrade charts/ which has YAMLs for the deployments and the HPAs that link to those specific deployments in the namespace.

Expected behavior
Using the adapter, the HPAs should connect to the deployments and they should deploy. Instead, the azure-k8s-adapter deploys its own deployment.

One attempt at fixing this was to remove the deployment.yaml file from the charts/azure-k8s-metrics-adapter repo; however, that produced the error "deployments/scale.extensions not found".

This whole process was an attempt to use Helm to deploy the adapter. Previously, the YAML for the adapter was generated through the command line using helm install charts/azure-k8s-metric-adapter with the appropriate tags.

This produced:
(screenshot omitted)
However, this is an attempt to generate the appropriate yaml through charts in a CI/CD pipeline. When done this way, there doesn't seem to be the custom-role-bindings:
(screenshot omitted)

I have double-checked that all tags are the same and auth is the same.

Kubernetes version (kubectl version):

  • 1.10+


Status code not handled properly on call to App Insights

When calling Application Insights, if the call is successful but does not return 200, the value is returned as zero and the call is not logged as failing:

Trace[619846051]: [10.731063787s] [10.730963384s] Listing from storage done
I0904 18:51:43.637973 1 provider.go:65] Received request for custom metric: groupresource: pods, namespace: default, metric name: performanceCounters-requestsPerSecond, selectors: app=app,release=rps-example
I0904 18:51:43.638005 1 aiapiclient.go:52] request to:  https://api.applicationinsights.io/v1/apps//metrics/performanceCounters/requestsPerSecond?interval=PT30S&timespan=PT5M
I0904 18:51:55.636965 1 trace.go:76] Trace[2095719105]: "List /apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/performanceCounters-requestsPerSecond" (started: 2018-09-04 18:51:43.63787808 +0000 UTC m=+308346.069863768) (total time: 11.999061806s):
Trace[2095719105]: [11.998899902s] [11.9988167s] Listing from storage done

Add support for GO SDK access to the App Insights via Service Principal

Currently, we access App Insights with a token, which limits us to working with only a single Application Insights instance. It also requires additional configuration of secrets. It would be better to use the existing Service Principal that is configured and add the appropriate permissions. I tried this according to the docs in App Insights but I am getting a 401. This may just require documenting the correct configuration for the SP.

Additional information

I am using the GoMetrics client to access the App Insights API. I set up the Azure authentication for a service principal as defined in the docs, but I receive the following 403 error message:

bad request: insights.MetricsClient#Get: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code='InsufficientAccessError' Message='The provided credentials have insufficient access to perform the requested operation' InnerError={'code':'InvalidTokenError','message':'Could not validate the request. Challenge failed: SignatureVerificationFailed'} 

At first glance at the code, everything seems to be correct: the authorization code gets the token from the AADEndpoint, and that seems to align with the docs on auth endpoints.

Custom metrics sample not working

Hi,

I am trying to get the custom metrics Requests per Second sample working, but I cannot make it work. I also do not understand the correlation between the logs and Application Insights.

So I have this error:

I0916 10:17:31.297948 1 metric_cache.go:57] metric not found CustomMetric/default/rps
I0916 10:17:31.297953 1 provider_custom.go:100] New call to GetCustomMetric: rps
I0916 10:17:31.297958 1 appinsights.go:110] Application insights key has been provided - using Application Insights REST API.
I0916 10:17:31.297995 1 appinsights.go:186] request to: https://api.applicationinsights.io/v1/apps/XXXXXXXXXXXXXXXXXXXX/metrics/**rps**?interval=PT30S&timespan=PT5M

What I really do not understand is that rps is the name of the K8s object, not an Application Insights metric. Shouldn't it always fail, and instead be like the request below?

curl "https://api.applicationinsights.io/v1/apps/XXXXXXXXXXXXXXXXXXXXXXX/metrics/**performanceCounters/requestsPerSecond**?timespan=PT5M&interval=PT30S&aggregation=sum" -H "x-api-key: tYYYYYYYYYYYYYYYYYYYYYY"

Thanks

Support Self-Managed Cluster

Hi,

The Readme says this adapter is meant to be used with AKS.
Will it work with a self-managed cluster (v1.11.8) provisioned by the AKS-Engine?

If not - is it possible to add support?

Thank you!

http: TLS handshake error from tunnelfront

After starting up, I get the following logs:

[custom-metrics-azure-apiserver-8575cd9857-qpfz6 custom-metrics-azure-apiserver] I0724 22:01:37.568548       1 serving.go:273] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
[custom-metrics-azure-apiserver-8575cd9857-qpfz6 custom-metrics-azure-apiserver] I0724 22:01:48.663683       1 serve.go:96] Serving securely on [::]:6443
[custom-metrics-azure-apiserver-8575cd9857-qpfz6 custom-metrics-azure-apiserver] I0724 22:01:48.671804       1 logs.go:49] http: TLS handshake error from 10.2.0.12:59674: EOF
[custom-metrics-azure-apiserver-8575cd9857-qpfz6 custom-metrics-azure-apiserver] I0724 22:01:48.675718       1 logs.go:49] http: TLS handshake error from 10.2.0.12:59676: EOF
[custom-metrics-azure-apiserver-8575cd9857-qpfz6 custom-metrics-azure-apiserver] I0724 22:01:48.679497       1 logs.go:49] http: TLS handshake error from 10.2.0.12:59678: EOF
[custom-metrics-azure-apiserver-8575cd9857-qpfz6 custom-metrics-azure-apiserver] I0724 22:01:48.685118       1 logs.go:49] http: TLS handshake error from 10.2.0.12:59680: EOF
[custom-metrics-azure-apiserver-8575cd9857-qpfz6 custom-metrics-azure-apiserver] I0724 22:01:48.690329       1 logs.go:49] http: TLS handshake error from 10.2.0.12:59704: EOF
[custom-metrics-azure-apiserver-8575cd9857-qpfz6 custom-metrics-azure-apiserver] I0724 22:01:48.694252       1 logs.go:49] http: TLS handshake error from 10.2.0.12:59706: EOF
[custom-metrics-azure-apiserver-8575cd9857-qpfz6 custom-metrics-azure-apiserver] I0724 22:01:48.699050       1 logs.go:49] http: TLS handshake error from 10.2.0.12:59710: EOF

It appears that the AKS tunnelfront pod is the source of those connections and is causing the errors, though it seems not to affect the running application.

k get po --all-namespaces -o wide
NAMESPACE        NAME                               READY     STATUS    RESTARTS   AGE       IP          NODE
kube-system      tunnelfront-6bb99bfdc8-kjwf5       1/1       Running   13         41d       10.2.0.12   aks-agentpool-85932268-0

Custom metrics sample needs to be updated

Describe the bug
I wasn't able to get the custom metrics sample working. After some experimentation, I was able to figure out how to use it.

Instead of deploying the custom metric:

kubectl apply -f deploy/custom-metric.yaml

We can now directly enter the metric in the HPA YAML:

metrics:
- type: Pods
  pods:
    metric:
      name: performanceCounters-requestsPerSecond
    target:
      type: AverageValue
      averageValue: 10

You can also test the API using:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/performanceCounters-requestsPerSecond"

Error from server (ServiceUnavailable): the server is currently unable to handle the request

Describe the bug
I'm unable even to get a response from the external metrics API after following this document: https://github.com/Azure/azure-k8s-metrics-adapter/tree/master/samples/servicebus-queue

To Reproduce
Follow this document step by step "https://github.com/Azure/azure-k8s-metrics-adapter/tree/master/samples/servicebus-queue"

Expected behavior
After the Helm installation using the service principal, when executing the command kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" | jq I should be getting the output suggested by the document, but instead I'm facing an error stating Error from server (ServiceUnavailable): the server is currently unable to handle the request.

Kubernetes version (kubectl version):

  • Running on AKS, Kubernetes version: 1.14.8

Logs (kubectl logs <metric adapter pod id>)

The Helm installation was successful; below are the logs from the adapter pod:

I0116 12:49:36.216094 1 controller.go:40] Setting up external metric event handlers
I0116 12:49:36.216148 1 controller.go:52] Setting up custom metric event handlers
I0116 12:49:36.216528 1 controller.go:69] initializing controller
I0116 12:49:36.353905 1 main.go:104] Looking up subscription ID via instance metadata
I0116 12:49:36.359887 1 instancemetadata.go:40] connected to sub: *********************
I0116 12:49:36.416858 1 controller.go:77] starting 2 workers with 1000000000 interval
I0116 12:49:36.417062 1 controller.go:88] Worker starting
I0116 12:49:36.417068 1 controller.go:88] Worker starting
I0116 12:49:36.417074 1 controller.go:98] processing item
I0116 12:49:36.417078 1 controller.go:98] processing item
I0116 12:49:36.680065 1 serving.go:312] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
I0116 12:49:37.197936 1 secure_serving.go:116] Serving securely on [::]:6443

Additional context
When I execute the command kubectl api-versions, external.metrics.k8s.io/v1beta1 is displayed, so the installation went through successfully. But why am I not able to hit the API?

Move to config file for configuration

Currently, all of the metadata required to pull the metrics from Azure is hosted on the HPA; for example, on the external metrics HPA:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: consumer-scaler
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: queuemessages
      metricSelector:
        matchLabels:
          metricName: Messages
          resourceGroup: sb-external-example
          resourceName: sb-external-ns
          resourceProviderNamespace: Microsoft.Servicebus
          resourceType: namespaces
          aggregation: Total
          filter: EntityName_eq_externalq
      targetValue: 30

This creates overhead for the end user, especially if they were to try to map OData queries to label selectors as I previously suggested.

Another interesting result of this conversion shows up with metric names. With custom metrics, the end user must convert the Application Insights metric name from performanceCounters/requestsPerSecond to performanceCounters-requestsPerSecond, replacing the / with a -.

With external metrics, the "metric name" is not used at all and is instead pulled from the selector. This is because the metric name in the URL is lowercased and Azure Monitor is case sensitive about metric names; for example, the Service Bus metric Messages cannot be messages, which is what happens when the metric name is passed via the URL from the HPA.

Proposal

To create a better experience for the developer, I propose we use a config file, loaded via a ConfigMap, to hold this metadata. The Prometheus Adapter uses the idea of a config, though I believe we don't need it to be as complex (yet). This would allow a few scenarios:

  • reduces the cognitive overhead for developers converting Azure metadata
  • a cluster administrator could restrict which metrics are available through the configuration file (if a metric is not listed in the file, it could not be retrieved)
  • allows for more complex configuration when writing queries

To help even further, it would be possible to provide tooling that auto-generates this config file based on the service principal that has access to the Azure resources.

Example config

#optional sub id that will be used by all external metrics (unless overridden on metric definition)
subscriptionId: 12345678-1234-1234-1234-12345678901234
#optional app id that will be used by all custom metrics (unless overridden on metric definition)
applicationId: 12345678-1234-1234-1234-123456789012

#list all external metrics; values are obtained from https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-supported-metrics
external:
  - metric: queuemessages #this is the name that will be referenced by the hpa
    metricName: Messages #azure name - is case sensitive
    resourceGroup: sb-external-example
    resourceName: sb-external-ns
    resourceProviderNamespace: Microsoft.Servicebus #this can contain slashes (/)
    resourceType: namespaces
    aggregation: Total
    filter: EntityName eq 'externalq'  #any valid odata filter
    subscriptionID: 12345678-1234-1234-1234-12345678901234 #optional (overrides global)

#list all custom metrics
custom:
  - metric: rps # this is the name that will be referenced by the hpa
    metricName: performanceCounters/requestsPerSecond # azure name which contains slashes
    applicationId: 12345678-1234-1234-1234-123456789012 #optional (overrides global)
  # use with ai queries
  - metric: requests # this is the name that will be referenced by the hpa
    # AI query as defined at https://docs.loganalytics.io/docs/Language-Reference/Query-statements
    query: requests | where timestamp > ago(30d) and client_City == "Redmond" | summarize clients = dcount(client_IP) by tod_UTC=bin(timestamp % 1d, 1h), resultCode | extend local_hour = (tod_UTC - 8h) % 24h
    applicationId: 12345678-1234-1234-1234-123456789012 #optional (overrides global)

Feedback

Please provide any feedback you might have on the above proposal. Thanks!

How to get the custom metric "requests-duration", which already exists? Could you please provide an example? The metric can be retrieved from Application Insights but not from the adapter.


Kubernetes version (kubectl version):

  • Running on AKS


Provide logging for found metric value

Current logging looks like the following:

I0817 11:28:10.515057 1 provider.go:120] Received request for namespace: autoscaling-sandbox, metric name: queuemessages, metric selectors: aggregation=Total,filter=EntityName_eq_externalq,metricName=Messages,resourceGroup=promitor,resourceName=promitor-messaging,resourceProviderNamespace=Microsoft.Servicebus,resourceType=namespaces
I0817 11:28:10.515081 1 az-metric-client.go:159] begin parsing metric
I0817 11:28:10.515096 1 az-metric-client.go:196] aggregation: Total
I0817 11:28:10.515101 1 az-metric-client.go:201] filter: EntityName_eq_externalq
I0817 11:28:10.515106 1 az-metric-client.go:204] filter formatted: EntityName eq 'externalq'
I0817 11:28:10.515599 1 az-metric-client.go:181] metricName: Messages
I0817 11:28:10.515618 1 az-metric-client.go:184] resourceGroup: promitor
I0817 11:28:10.515624 1 az-metric-client.go:187] resourceName: promitor-messaging
I0817 11:28:10.515630 1 az-metric-client.go:190] resourceProviderNamespace: Microsoft.Servicebus
I0817 11:28:10.515635 1 az-metric-client.go:193] resourceType: namespaces
I0817 11:28:10.516645 1 az-metric-client.go:60] resource uri: /subscriptions/0f9d7fea-99e8-4768-8672-06a28514f77e/resourceGroups/promitor/providers/Microsoft.Servicebus/namespaces/promitor-messaging
I0817 11:28:10.516659 1 az-metric-client.go:61] filter: EntityName eq 'externalq'
I0817 11:28:10.516666 1 az-metric-client.go:62] metric name : Messages

It would be great if it also logged the effective value that was found, in case you want to troubleshoot why the HPA is not scaling.

Secret has wrong name in quick start

Describe the bug
The quick start samples create a secret called adapter-service-principal, but the deployment in the adapter YAML file wants a secret called azure-k8s-metrics-adapter.

performanceCounters/requestsPerSecond for RPS Scaling

Hi Everyone,

The current RPS scaling example in the project uses a Linux container (jsturtevant/metric-rps-example) that scales by reading the performanceCounters/requestsPerSecond metric from the Microsoft Application Insights API.

We've tested the default code in the repo using this container, and the adapter is able to read performanceCounters/requestsPerSecond metrics from the API and scale perfectly.

However, when we use our own Linux containers (so far .NET Core [Debian Linux] and Java [openjdk:11-jre-slim]), no performanceCounters/requestsPerSecond metrics are registered in Application Insights. No matter what timespan or interval combinations are used, the Application Insights API returns null. (We know this from browsing the Application Insights API Explorer: https://dev.applicationinsights.io.)

This baffles us, since the container provided with this example is also a Linux container, and it sends performanceCounters/requestsPerSecond metrics to App Insights without any issues.

  1. Is there anything special that has been done to this container to get it to send performanceCounters/requestsPerSecond data to Application Insights?
  2. For clarity, according to the default code in the samples, is it the code inside the container that sends telemetry to Application Insights, or is it Kubernetes?

Unable to get external metrics by raw query or via HPA

Describe the bug
Unable to get external metrics from the API, either via an HPA or by raw query.

My HPAs are reporting that they are unable to get the external metric with the following error:

unable to get external metric {ns}/{metricName}/nil: unable to fetch metrics from external metrics API: Unknown Azure external metric client type provided:

Ultimately I traced this message to here, which appears to be switching on the azMetricRequest.Type property sent from here. I didn't find where this property is set, as it isn't in the ParseAzureMetric method (from what I can see).

That said, I decided to try adding a type property to my ExternalMetric resource definition, like so:

apiVersion: azure.com/v1alpha2
kind: ExternalMetric
metadata:
  name: {name}
spec:
  type: Monitor
  azure:
    resourceGroup: {rg}
    resourceName: {rn}
    resourceProviderNamespace: Microsoft.ServiceBus
    resourceType: namespaces
  metric:
    aggregation: Total
    filter: EntityName eq '{entity}'
    metricName: ActiveMessages

after which I received this message instead (raw query used here):

Error from server (BadRequest): Unknown Azure external metric client type provided: Monitor

I didn't see this approach documented anywhere, if that is indeed the intended method of specifying the client type.

To Reproduce

  1. Deploy the latest custom metrics adapter by following the instructions with Azure AAD Pod Identity support.
  2. Create an ExternalMetric for a Service Bus (see above definition).
  3. Run a raw query against the metric, kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/{namespace}/{metricName}"

Expected behavior
Metrics are returned both by raw query and to my HPA, allowing HPAs to autoscale off of both Azure Monitor and Azure Service Bus metrics.

Kubernetes version (kubectl version): 1.16.2

  • Running on AKS

Logs (kubectl logs <metric adapter pod id>)

The metrics pod itself is working correctly in terms of the metrics themselves, from what I see:

I1116 14:41:49.590541       1 controller.go:134] succesfully proccessed item '{metricName}'
I1116 14:41:49.590586       1 controller.go:91] processing next item
I1116 14:41:49.590590       1 controller.go:98] processing item

I see no error messages in the pod log output.

Add ability to work outside of Azure VMs

When the provider is created, if there is no Azure configuration, then use an environment variable for the subscription ID. This would allow the adapter to run on an on-prem cluster and still connect to Azure.

Currently, it just loads the Azure config:

func NewAzureProvider(client dynamic.Interface, mapper apimeta.RESTMapper) provider.MetricsProvider {
	azureConfig, err := aim.GetAzureConfig()
	if err != nil {
		glog.Errorf("unable to get azure config: %v", err)
	}

To Do:

  • change the config to not be specific to Azure
  • if the environment variable is available, use it for the subscription ID; otherwise use the Azure configuration reader
  • update the readme and show how to do this (requires modifying the deployment manifests)
  • write tests

Provide support for Azure AD Application authentication

Provide support for Azure AD Application authentication so that people are not forced to use MSI.
In certain enterprises MSI is not used (yet) for various reasons, one being that the generated name for the AD App does not align with naming conventions.

This could be similar to how Promitor handles it, but that is not a must.

Problem following along with custom metric walkthrough

Describe the bug
When following the walkthrough for custom metrics, I get Error: release sample-release failed: customresourcedefinitions.apiextensions.k8s.io "custommetrics.azure.com" already exists

To Reproduce
Follow walkthrough until the helm install step.

Expected behavior
Since this is a walkthrough, I expect the defaults to work out of the box, or for the walkthrough to inform me of what needs to be changed. (My apologies if this is obvious; I'm still learning all the ins and outs of k8s.)

Kubernetes version (kubectl version):

  • Running on AKS
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-24T06:54:59Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.3", GitCommit:"a4529464e4629c21224b3d52edfe0ea91b072862", GitTreeState:"clean", BuildDate:"2018-09-09T17:53:03Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

Logs (kubectl logs <metric adapter pod id>)
See error.txt

Additional context
Following custom metrics walkthrough.

Please improve the speed of retrieving the count of Service Bus Queue messages

More exactly, this is for the "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/queuemessages" metric.

As @jsturtevant confirmed in #54, the delay for the above metric to retrieve the actual count of messages in a Service Bus message queue can be up to 2 minutes. Since the HPA checks the count of messages only every 30", it can take up to 2'30" for auto-scaling to start. The 2'30" delay is huge for our products' auto-scalability requirements, so it would be GREAT if HPA autoscaling could be made much faster, hopefully down to mere seconds over the 30" check interval.

The most recent version has difficulties reading the SB Queue metrics

I installed the adapter this morning by running kubectl apply -f https://raw.githubusercontent.com/Azure/azure-k8s-metrics-adapter/master/deploy/adapter.yaml

We've been using it for over a year, yet the output of kubectl describe hpa has never been as noisy as now. Please see the attached file: hpa.txt

I just completed 50 consecutive runs of kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/queuemessages" and I'm seeing 11 failures, a success rate of 78%. Please see the output of that experiment in the attached file: verifyMessageQueueMetric.txt

I verified the other 3 AKS instances that we have in production - all running earlier versions of the Adapter - and none of them shows any issues getting the information. Please see below the HPA output of the only AKS instance that experienced scaling activity recently:

(screenshot omitted)

I ran the .\verifyMessageQueueMetric.ps1 -iterations 50 experiment on that particular AKS instance and it completed with no errors.

Strange HPA behaviour with custom insights metric (pods don't scale down)

I am using v0.4.1 of the custom metric adapter, and the HPA is acting strangely.

For example, my current value is 1 and my targetAverage is 100k. It should not scale up, but it should scale down.

This is a describe example:

Name:                                  wordpress-database-hpa
Namespace:                             default
Labels:                                <none>
Annotations:                           <none>
CreationTimestamp:                     Mon, 01 Oct 2018 23:02:45 +0200
Reference:                             StatefulSet/wordpress-database-mariadb-slave
Metrics:                               ( current / target )
  "custom-requestspersecond" on pods:  1 / 100k
Min replicas:                          1
Max replicas:                          5
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    ReadyForNewScale  the last scale time was sufficiently old as to warrant a new scale
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from pods metric custom-requestspersecond
  ScalingLimited  True    TooManyReplicas   the desired replica count is more than the maximum replica count
Events:           <none>

I'm faced with an interesting auto-scaling question

I have a solution in place that distributes computations to containers using a Service Bus queue, with an HPA set up to scale on the metric below. It works very well, thanks very much for making it possible!

metric:
  metricName: ActiveMessages
  aggregation: Total
  filter: EntityName eq 'requests'

The scenario I'm faced with is that most computations complete in under 2 minutes, while some take 40. Their ratio seems to be 22:1 in favor of the short ones. The short requests come in sporadically; there is no way to predict when and how many will arrive.

The problem is that some of the long computations are ended prematurely - most likely because the HPA terminates the container - causing them to be resumed on a different container, which makes their execution even longer. The customer is not happy with that!

Does anyone have suggestions on how I should go about ensuring that the long computations complete on their first attempt? I searched and did not see any way for the container to tell the HPA "leave me alone, I'm working". If I add CPU usage as a secondary scaling criterion, won't that drive the count of containers to the max, even if only a single long-running computation is executing?

A different solution I can imagine is to save the computation's state from time to time and, whenever it is forced to resume in a different container, to start from the saved state. It's not a component I'm working on, so I do not know if that is even possible.

Provide support for multiple subscriptions

Provide support for multiple subscriptions. This is important when the cluster is shared by multiple teams that have their own subscriptions but share the cluster, isolated into multiple namespaces.

Currently, from what I've seen, only the subscription of the AKS cluster is supported - is that correct?

Could we have the chart published to a helm repo?

Is your feature request related to a problem? Please describe.
It would be easier to pull the Helm chart alongside your updates rather than maintain the chart ourselves, and it makes setting the chart as a dependency for other charts easier.
Describe the solution you'd like
Have the chart in a Helm repo like https://kubernetescharts.blob.core.windows.net/azure

Describe alternatives you've considered
Fork the chart, push it to an ACR, and use it as a dependency


Clean-up the setup documentation

Is your feature request related to a problem? Please describe.
The setup docs can get a user into a wrong state, as seen in issue #54

Describe the solution you'd like
Clean up the getting-started documentation.

Make 'request.Timespan' & 'request.Interval' Configurable

Please make request.Timespan and request.Interval (found in azure-k8s-metrics-adapter/pkg/azure/appinsights/appinsights.go) configurable for end users, through a Helm install parameter, a YAML file, or something similar.

I have a scenario where we are unable to collect metrics because the specified metrics interval is too small.

Error when attempting to install adapter into cluster

Describe the bug
I'm trying to deploy the adapter to use external metrics in my cluster but this is what it says:
"failed to download charts/azure-k8s-metrics-adapter"

To Reproduce
First have a cluster
kubectl create namespace custom-metrics
helm install --name my-release charts/azure-k8s-metrics-adapter --namespace custom-metrics

I made sure the Helm repo is updated as well, but it seems to be pulling from the stable folder.
Also, in the readme (https://github.com/Azure/azure-k8s-metrics-adapter) you mention there is a Helm chart for those using Helm in their cluster, but it redirects to a page not found.

Is there functionality to deploy the adapter and external metrics using helm?

Kubernetes version (kubectl version):

  • Running on AKS
    version 1.12

Thanks!

FailedComputeMetricsReplicas... unable to fetch metrics from external metrics API: azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token...

Could someone please help me do what is needed to get Message Queue size auto-scaling working?

The complete error I'm seeing is below. The expected result would (probably) be to get a 0 value back, since there are no messages in the queue at this time.

kubectl  get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/queuemessages"
Error from server (BadRequest): azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com//subscriptions/d1151462-af84-4066-8ebc-62a4191a356e/resourceGroups/Eugen-Prototypes/providers/Microsoft.ServiceBus/namespaces/eugenservicebus/providers/microsoft.insights/metrics?%24filter=EntityName+eq+%27requestqueue%27&aggregation=Total&api-version=2018-01-01&metricnames=Messages&timespan=2019-01-04T19%3A20%3A10Z%2F2019-01-04T19%3A25%3A10Z: StatusCode=400 -- Original Error: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"}

Starting point

I'm using an Azure Kubernetes Service instance; kubectl version returns:

Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:17:39Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.5", GitCommit:"753b2dbc622f5cc417845f0ff8a77f539a4213ea", GitTreeState:"clean", BuildDate:"2018-11-26T14:31:35Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

The Metrics Server was already present in the AKS instance:

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
{"kind":"NodeMetricsList","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/metrics.k8s.io/v1beta1/nodes"},"items":[{"metadata":{"name":"aks-agentpool-42482945-0","selfLink":"/apis/metrics.k8s.io/v1beta1/nodes/aks-agentpool-42482945-0","creationTimestamp":"2019-01-04T19:46:48Z"},"timestamp":"2019-01-04T19:46:00Z","window":"1m0s","usage":{"cpu":"111m","memory":"1447864Ki"}}]}

To set up auto-scaling

Following the steps at https://www.jamessturtevant.com/posts/Azure-Kubernetes-Metric-Adapter/ I set up the Metrics Adapter using the command:

kubectl apply -f https://raw.githubusercontent.com/Azure/azure-k8s-metrics-adapter/master/deploy/adapter.yaml

In the last step of the setup I used the HPA.yaml file below:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: external-solver-runner-scaler
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: aks-aci-boldiq-external-solver-runner
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: queuemessages
      targetValue: 2

Following some of the steps at https://github.com/Azure/azure-k8s-metrics-adapter/tree/master/samples/servicebus-queue I set up the following:

az ad sp create-for-rbac -n "adapter-sp" --role "Monitoring Reader" --scopes /subscriptions/d1151462-af84-4066-8ebc-62a4191a356e/resourceGroups/Eugen-Prototypes
{
  "appId": "the_app_id",
  "displayName": "adapter-sp",
  "name": "http://adapter-sp",
  "password": "the_password",
  "tenant": "the_tenant"
}

kubectl create secret generic adapter-service-principal -n custom-metrics --from-literal=azure-tenant-id=the_tenant --from-literal=azure-client-id=the_app_id  --from-literal=azure-client-secret=the_password

And configured the External Metric:

apiVersion: azure.com/v1alpha1
kind: ExternalMetric
metadata:
  name: queuemessages
spec:
  azure:
    resourceGroup: Eugen-Prototypes
    resourceName: eugenservicebus
    resourceProviderNamespace: Microsoft.ServiceBus
    resourceType: namespaces
  metric:
    metricName: Messages
    aggregation: Total
    filter: EntityName eq 'requestqueue'

HPA not scaling down when target is met

Describe the bug
The HPA isn't removing replicas once the target value has been met.

To Reproduce
Steps to reproduce
Follow the walkthrough from https://github.com/Azure/azure-k8s-metrics-adapter/blob/master/samples/request-per-second/readme.md

Run kubectl get hpa {hpaname} -w to watch the state change.

Once HEY has finished processing and the cool-down period has passed, run kubectl get hpa {hpaname} -w again; you will notice the target drops to 0/{targetvalue} but the number of replicas doesn't decrease.

Expected behavior
The HPA should scale down to the min replicas

Kubernetes version (kubectl version):

  • [x] Running on AKS
    Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:17:39Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"windows/amd64"}
    Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.3", GitCommit:"a4529464e4629c21224b3d52edfe0ea91b072862", GitTreeState:"clean", BuildDate:"2018-09-09T17:53:03Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

Logs (kubectl logs <metric adapter pod id>)
I1031 21:02:03.132657 1 aiapiclient.go:52] request to: https://api.applicationinsights.io/v1/apps/{ai-appid}/metrics/customMetrics/queuelengthmonitoredqueue?interval=PT30S&timespan=PT5M
I1031 21:02:03.285583 1 provider.go:65] Received request for custom metric: groupresource: pods, namespace: default, metric name: performanceCounters-requestsPerSecond, selectors: app=rps-sample
I1031 21:02:03.285628 1 aiapiclient.go:52] request to: https://api.applicationinsights.io/v1/apps/{ai-appid}/metrics/performanceCounters/requestsPerSecond?interval=PT30S&timespan=PT5M
I1031 21:03:03.167579 1 provider.go:65] Received request for custom metric: groupresource: pods, namespace: default, metric name: customMetrics-queuelengthmonitoredqueue, selectors: app=queue-sample
I1031 21:03:03.167676 1 aiapiclient.go:52] request to: https://api.applicationinsights.io/v1/apps/{ai-appid}/metrics/customMetrics/queuelengthmonitoredqueue?interval=PT30S&timespan=PT5M
I1031 21:03:03.319472 1 provider.go:65] Received request for custom metric: groupresource: pods, namespace: default, metric name: performanceCounters-requestsPerSecond, selectors: app=rps-sample
I1031 21:03:03.319516 1 aiapiclient.go:52] request to: https://api.applicationinsights.io/v1/apps/{ai-appid}/metrics/performanceCounters/requestsPerSecond?interval=PT30S&timespan=PT5M
I1031 21:03:33.171531 1 provider.go:65] Received request for custom metric: groupresource: pods, namespace: default, metric name: customMetrics-queuelengthmonitoredqueue, selectors: app=queue-sample
I1031 21:03:33.171576 1 aiapiclient.go:52] request to: https://api.applicationinsights.io/v1/apps/{ai-appid}/metrics/customMetrics/queuelengthmonitoredqueue?interval=PT30S&timespan=PT5M
I1031 21:03:33.280357 1 provider.go:65] Received request for custom metric: groupresource: pods, namespace: default, metric name: performanceCounters-requestsPerSecond, selectors: app=rps-sample
I1031 21:03:33.280397 1 aiapiclient.go:52] request to: https://api.applicationinsights.io/v1/apps/{ai-appid}/metrics/performanceCounters/requestsPerSecond?interval=PT30S&timespan=PT5M
I1031 21:04:03.160889 1 provider.go:65] Received request for custom metric: groupresource: pods, namespace: default, metric name: performanceCounters-requestsPerSecond, selectors: app=rps-sample
I1031 21:04:03.160929 1 aiapiclient.go:52] request to: https://api.applicationinsights.io/v1/apps/{ai-appid}/metrics/performanceCounters/requestsPerSecond?interval=PT30S&timespan=PT5M
I1031 21:04:03.264824 1 provider.go:65] Received request for custom metric: groupresource: pods, namespace: default, metric name: customMetrics-queuelengthmonitoredqueue, selectors: app=queue-sample
I1031 21:04:03.264864 1 aiapiclient.go:52] request to: https://api.applicationinsights.io/v1/apps/{ai-appid}/metrics/customMetrics/queuelengthmonitoredqueue?interval=PT30S&timespan=PT5M

Unable to get the queue value returns from external metrics

Describe the bug
I have followed the example of how to scale using a Service Bus Queue as an external metric, but I was unable to get the queue value by running the following command:

kubectl  get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/queuemessages"
Error from server (BadRequest): metricName is required

The external metrics API itself seems to work well:

kubectl  get --raw "/apis/external.metrics.k8s.io/v1beta1/"
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"external.metrics.k8s.io/v1beta1","resources":[]}

I appreciate your kind assistance. Thank you so much!

To Reproduce

  1. Create an AKS cluster through the Azure Portal

    • Kubernetes version: 1.13.7
    • Enable RBAC: Yes
    • Network configuration: Basic
    • Enable container monitoring: Yes
  2. Validate the Metric Server by running kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" command.

  3. Install Helm on AKS

  4. Create a service bus in Azure

    • Create a namespace
    • Create a queue
    • Create an auth rule for the queue:
  5. Enable Access to Azure Resources by configuring a service principal

    • Create a service principal scoped to the resource group of the resource you are monitoring, and assign Monitoring Reader to it:
    az ad sp create-for-rbac -n "adapter-sp" --role "Monitoring Reader" --scopes /subscriptions/{SubID}/resourceGroups/{ResourceGroup1}
    
    • Deploy the environment variables via secret:
    kubectl create secret generic adapter-service-principal -n custom-metrics --from-literal=azure-tenant-id=<tenantid> --from-literal=azure-client-id=<clientid>  --from-literal=azure-client-secret=<secret>
    
  6. Since I don't have a Go language environment, I used the Service Bus Explorer to create messages as the producer.

  7. Configure Secret for consumer pod

  8. Deploy Consumer

    • Deploy the consumer:
    kubectl apply -f deploy/consumer-deployment.yaml
    
    • Check that the consumer was able to receive messages:
     kubectl logs consumer-547467bf87-bhdnt
     connecting to queue:  externalq
     setting up listener
     received message:  <?xml version="1.0" encoding="utf-8"?>
     <message>Hi mate, how are you?</message>
    
  9. Deploy the adapter

    • Create a namespace
    kubectl create namespace custom-metrics
    
    • Create a Service Principal and Secret:
    az ad sp create-for-rbac -n "azure-k8s-metric-adapter-sp" --role "Monitoring Reader" --scopes /subscriptions/{SubID}/resourceGroups/{ResourceGroup1}
    
    • Use values from the service principal created above to create the secret
    kubectl create secret generic azure-k8s-metrics-adapter -n custom-metrics --from-literal=azure-tenant-id=<tenantid> --from-literal=azure-client-id=<clientid>  --from-literal=azure-client-secret=<secret>
    
    • Deploy the adapter with the service principal:
    helm install --name sample-release ../../charts/azure-k8s-metrics-adapter --namespace custom-metrics --set azureAuthentication.method=clientSecret --set azureAuthentication.tenantID=<your tenantid> --set azureAuthentication.clientID=<your clientID> --set azureAuthentication.clientSecret=<your clientSecret> --set azureAuthentication.createSecret=true
    
    • I can hit the external metric endpoint, verified with the following command:
    kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
    {
      "kind": "APIResourceList",
      "apiVersion": "v1",
      "groupVersion": "external.metrics.k8s.io/v1beta1",
      "resources": []
    }
    
  10. Configure Metric Adapter with metrics

    • Replace the resourceGroup and resourceName in externalmetric.yaml
    • Create ExternalMetric resource with following command
    kubectl apply -f deploy/externalmetric.yaml
    
    • List external metrics via following command
    kubectl get aem
    NAME            AGE
    queuemessages   40s
    
  11. Deploy the HPA

    • Deploy the HPA:
    kubectl apply -f deploy/hpa.yaml
    
    • Validate that the HPA is configured
    kubectl get hpa consumer-scaler
    NAME              REFERENCE                    TARGETS        MINPODS   MAXPODS   REPLICAS   AGE
    consumer-scaler   Deployment/consumer-scaler   <unknown>/30   1         10        1          3d2h
    
    • You can also check the queue value returns manually by running:
    kubectl  get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/queuemessages"
    Error from server (BadRequest): metricName is required
    

Expected behavior
I can get the queue value by running the following command:

kubectl  get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/queuemessages"

Kubernetes version (kubectl version):

  • [1.13.7] Running on AKS

Logs (kubectl logs <metric adapter pod id>)
kubectl logs sample-release-azure-k8s-metrics-adapter-69556544c9-c7pt6

E0728 12:53:52.320936       1 reflector.go:205] github.com/Azure/azure-k8s-metrics-adapter/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1alpha1.ExternalMetric: the server could not find the requested resource (get externalmetrics.azure.com)
E0728 12:53:52.320936       1 reflector.go:205] github.com/Azure/azure-k8s-metrics-adapter/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1alpha1.CustomMetric: the server could not find the requested resource (get custommetrics.azure.com)
E0728 12:53:53.322735       1 reflector.go:205] github.com/Azure/azure-k8s-metrics-adapter/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1alpha1.ExternalMetric: the server could not find the requested resource (get externalmetrics.azure.com)
E0728 12:53:53.323570       1 reflector.go:205] github.com/Azure/azure-k8s-metrics-adapter/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1alpha1.CustomMetric: the server could not find the requested resource (get custommetrics.azure.com)
E0728 12:53:54.324518       1 reflector.go:205] github.com/Azure/azure-k8s-metrics-adapter/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1alpha1.ExternalMetric: the server could not find the requested resource (get externalmetrics.azure.com)
E0728 12:53:54.325311       1 reflector.go:205] github.com/Azure/azure-k8s-metrics-adapter/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1alpha1.CustomMetric: the server could not find the requested resource (get custommetrics.azure.com)
E0728 12:53:55.326181       1 reflector.go:205] github.com/Azure/azure-k8s-metrics-adapter/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1alpha1.ExternalMetric: the server could not find the requested resource (get externalmetrics.azure.com)
E0728 12:53:55.327096       1 reflector.go:205] github.com/Azure/azure-k8s-metrics-adapter/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1alpha1.CustomMetric: the server could not find the requested resource (get custommetrics.azure.com)
E0728 12:53:56.335038       1 reflector.go:205] github.com/Azure/azure-k8s-metrics-adapter/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1alpha1.ExternalMetric: the server could not find the requested resource (get externalmetrics.azure.com)
E0728 12:53:56.335947       1 reflector.go:205] github.com/Azure/azure-k8s-metrics-adapter/pkg/client/informers/externalversions/factory.go:117: Failed to list *v1alpha1.CustomMetric: the server could not find the requested resource (get custommetrics.azure.com)
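
These log lines say the API server cannot find the externalmetrics.azure.com and custommetrics.azure.com resources, i.e. the adapter's CustomResourceDefinitions do not appear to be registered in the cluster. A quick check, with the CRD names taken from the log messages themselves:

kubectl get crd externalmetrics.azure.com custommetrics.azure.com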

Additional context
kubectl describe hpa consumer-scaler

Name:                              consumer-scaler
Namespace:                         default
Labels:                            <none>
Annotations:                       kubectl.kubernetes.io/last-applied-configuration:
                                     {"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"consumer-scaler","namespace":"de...
CreationTimestamp:                 Sun, 28 Jul 2019 22:34:45 +0800
Reference:                         Deployment/consumer
Metrics:                           ( current / target )
  "queuemessages" (target value):  <unknown> / 30
Min replicas:                      1
Max replicas:                      10
Deployment pods:                   1 current / 0 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetExternalMetric  the HPA was unable to compute the replica count: unable to get external metric default/queuemessages/nil: unable to fetch metrics from external metrics API: metricName is required
Events:
  Type     Reason                        Age                     From                       Message
  ----     ------                        ----                    ----                       -------
  Warning  FailedComputeMetricsReplicas  6m33s (x12 over 9m19s)  horizontal-pod-autoscaler  failed to get external metric queuemessages: unable to get external metric default/queuemessages/nil: unable to fetch metrics from external metrics API: metricName is required
  Warning  FailedGetExternalMetric       4m16s (x21 over 9m19s)  horizontal-pod-autoscaler  unable to get external metric default/queuemessages/nil: unable to fetch metrics from external metrics API: metricName is required

Add e2e tests

Add end to end tests that can be run in circleci workflow (and in container on local machine).

At a minimum, the test should deploy the adapter and point it at a Service Bus Queue and an App Insights instance that have already been provisioned for this e2e test (the test would not be responsible for creating these resources in Azure; it should receive those values via environment variables). A rough sketch of such a script follows the list. The test would:

  • validate the adapter is installed properly (kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" & kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1")
  • install an ExternalMetric CRD configured for a Service Bus Queue, then make a request to kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/test/queuemessages" to get the value and compare it to az servicebus queue show --resource-group sb-external-example --namespace-name $SERVICEBUS_NS --name externalq -o json | jq .messageCount
  • install a CustomMetric CRD configured for App Insights, then make a request to kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/test/pods/*/custom-metric" and make sure a value is returned.
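
Under the assumptions above (pre-provisioned Azure resources, values passed via environment variables; the manifest path and jq expressions are assumptions, not verified against the repo):

#!/usr/bin/env bash
# Hypothetical e2e smoke-test sketch; Azure resources are pre-provisioned.
set -euo pipefail

# 1. Validate the adapter is installed properly
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" > /dev/null
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" > /dev/null

# 2. External metric: compare the adapter's value with the queue's actual count
kubectl apply -f samples/resources/externalmetric-example.yaml   # assumed path
adapter_value=$(kubectl get --raw \
  "/apis/external.metrics.k8s.io/v1beta1/namespaces/test/queuemessages" | jq -r '.items[0].value')
queue_value=$(az servicebus queue show --resource-group sb-external-example \
  --namespace-name "$SERVICEBUS_NS" --name externalq -o json | jq .messageCount)
[ "$adapter_value" -eq "$queue_value" ] || { echo "queue metric mismatch"; exit 1; }

# 3. Custom metric: make sure a value is returned at all
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/test/pods/*/custom-metric" \
  | jq -e '.items | length > 0'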

Optionally the whole HPA scaling flow could be tested, but this is much more complicated. The above should give some basic smoke tests to make sure things are working generally.

I started a possible e2e script here: https://github.com/Azure/azure-k8s-metrics-adapter/tree/e2e. Using the samples (service bus and appinsights) could help guide the tests.

Error when deploying with Azure AD Pod Identity using Helm chart

Describe the bug
When using the Helm Chart to deploy with Azure AD Pod Identity, the AZURE_TENANT_ID and AZURE_CLIENT_ID are still injected into the deployment, but the secret is not generated (and those values are not needed).
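
A possible shape of the fix, sketched as a guard in the chart's deployment template; the helper name, secret name, and key names are assumptions, not the chart's actual code:

{{- if eq .Values.azureAuthentication.method "clientSecret" }}
env:
  - name: AZURE_TENANT_ID
    valueFrom:
      secretKeyRef:
        name: {{ include "azure-k8s-metrics-adapter.fullname" . }}   # assumed secret name
        key: azure-tenant-id
  - name: AZURE_CLIENT_ID
    valueFrom:
      secretKeyRef:
        name: {{ include "azure-k8s-metrics-adapter.fullname" . }}
        key: azure-client-id
{{- end }}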

To Reproduce
Deploy the Helm Chart with Azure AD Pod Identity authentication.

Kubernetes version (kubectl version):

Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.3", GitCommit:"a4529464e4629c21224b3d52edfe0ea91b072862", GitTreeState:"clean", BuildDate:"2018-09-09T18:02:47Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.8", GitCommit:"7eab6a49736cc7b01869a15f9f05dc5b49efb9fc", GitTreeState:"clean", BuildDate:"2018-09-14T15:54:20Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
  • Running on AKS (but not related)

FailedGetExternalMetric event with reason The subscription 'resourceGroups' could not be found

I was testing the Helm chart with a scenario based on this sample when I noticed that the HPA was not able to fetch the latest metrics:

ScalingActive False FailedGetExternalMetric the HPA was unable to compute the replica count: unable to get external metric autoscaling-sandbox/queuemessages/&LabelSelector{MatchLabels:map[string]string{aggregation: Total,filter: EntityName_eq_externalq,metricName: Messages,resourceGroup: promitor,resourceName: promitor-messaging,resourceProviderNamespace: Microsoft.Servicebus,resourceType: namespaces,},MatchExpressions:[],}: unable to fetch metrics from external metrics API: insights.MetricsClient#List: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="SubscriptionNotFound" Message="The subscription 'resourceGroups' could not be found."

Am I doing something wrong here? I have a feeling that the parameters are being misinterpreted.
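
One plausible reading of the error, offered as an assumption rather than a diagnosis: if the adapter never received a subscription ID, the ARM request path loses its /subscriptions/<id> segment and Azure ends up parsing the literal path segment resourceGroups as the subscription, which matches the SubscriptionNotFound message exactly. Roughly:

# Expected request shape (subscription ID present):
GET /subscriptions/<subscriptionId>/resourceGroups/promitor/providers/Microsoft.Servicebus/namespaces/promitor-messaging/...
# Assumed broken shape (subscription ID missing or empty):
GET /subscriptions/resourceGroups/promitor/providers/...
#    ^ ARM reads "resourceGroups" as the subscription ID -> SubscriptionNotFound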

Helm install command

helm install --name azure-metrics-adapter --namespace autoscaling-sandbox --set azureAuthentication.method=clientSecret --set azureAuthentication.tenantID=<removed> --set azureAuthentication.clientID=<removed> --set azureAuthentication.clientSecret=<removed> --set defaultSubscriptionId=<removed> .\azure-k8s-metrics-adapter\
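
Given that, it is worth confirming the defaultSubscriptionId value actually reached the adapter pod; the grep below is a hedged way to inspect the rendered deployment without assuming the exact env var name the chart uses:

kubectl --namespace autoscaling-sandbox get deploy -o yaml | grep -i subscription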

HPA Configuration

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
 name: consumer-scaler
spec:
 scaleTargetRef:
   apiVersion: extensions/v1beta1
   kind: Deployment
   name: consumer
 minReplicas: 1
 maxReplicas: 10
 metrics:
  - type: External
    external:
      metricName: queuemessages
      metricSelector:
        matchLabels:
          metricName: Messages
          resourceGroup: promitor
          resourceName: promitor-messaging
          resourceProviderNamespace: Microsoft.Servicebus
          resourceType: namespaces
          aggregation: Total
          filter: EntityName_eq_externalq
      targetValue: 30

Service Bus Configuration

The Service Bus namespace lives in the promitor resource group:
[screenshot of the Service Bus namespace]

Azure AD Configuration

The application has the Monitoring Reader role, which is sufficient for Azure Monitor.
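
To rule out a permissions mix-up, the role assignments can be listed directly (replace <clientId> with the application's client ID):

az role assignment list --assignee <clientId> --output table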

Kubernetes Logs

โฏ kubectl describe hpa --namespace autoscaling-sandbox consumer-scaler
Name:                              consumer-scaler
Namespace:                         autoscaling-sandbox
Labels:                            <none>
Annotations:                       kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"consumer-scaler","namespace":"autoscaling-san...
CreationTimestamp:                 Thu, 16 Aug 2018 17:03:18 +0200
Reference:                         Deployment/consumer
Metrics:                           ( current / target )
  "queuemessages" (target value):  <unknown> / 30
Min replicas:                      1
Max replicas:                      10
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetExternalMetric  the HPA was unable to compute the replica count: unable to get external metric autoscaling-sandbox/queuemessages/&LabelSelector{MatchLabels:map[string]string{aggregation: Total,filter: EntityName_eq_externalq,metricName: Messages,resourceGroup: promitor,resourceName: promitor-messaging,resourceProviderNamespace: Microsoft.Servicebus,resourceType: namespaces,},MatchExpressions:[],}: unable to fetch metrics from external metrics API: insights.MetricsClient#List: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="SubscriptionNotFound" Message="The subscription 'resourceGroups' could not be found."
Events:
  Type     Reason                        Age              From                       Message
  ----     ------                        ----             ----                       -------
  Warning  FailedGetExternalMetric       4m (x3 over 5m)  horizontal-pod-autoscaler  unable to get external metric autoscaling-sandbox/queuemessages/&LabelSelector{MatchLabels:map[string]string{aggregation: Total,filter: EntityName_eq_orders,metricName: Messages,resourceGroup: promitor,resourceName: promitor-messaging,resourceProviderNamespace: Microsoft.Servicebus,resourceType: namespaces,},MatchExpressions:[],}: unable to fetch metrics from external metrics API: insights.MetricsClient#List: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="SubscriptionNotFound" Message="The subscription 'resourceGroups' could not be found."
  Warning  FailedComputeMetricsReplicas  4m (x3 over 5m)  horizontal-pod-autoscaler  failed to get external metric queuemessages: unable to get external metric autoscaling-sandbox/queuemessages/&LabelSelector{MatchLabels:map[string]string{aggregation: Total,filter: EntityName_eq_orders,metricName: Messages,resourceGroup: promitor,resourceName: promitor-messaging,resourceProviderNamespace: Microsoft.Servicebus,resourceType: namespaces,},MatchExpressions:[],}: unable to fetch metrics from external metrics API: insights.MetricsClient#List: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="SubscriptionNotFound" Message="The subscription 'resourceGroups' could not be found."
  Warning  FailedGetExternalMetric       4s (x9 over 4m)  horizontal-pod-autoscaler  unable to get external metric autoscaling-sandbox/queuemessages/&LabelSelector{MatchLabels:map[string]string{aggregation: Total,filter: EntityName_eq_externalq,metricName: Messages,resourceGroup: promitor,resourceName: promitor-messaging,resourceProviderNamespace: Microsoft.Servicebus,resourceType: namespaces,},MatchExpressions:[],}: unable to fetch metrics from external metrics API: insights.MetricsClient#List: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="SubscriptionNotFound" Message="The subscription 'resourceGroups' could not be found."
  Warning  FailedComputeMetricsReplicas  4s (x9 over 4m)  horizontal-pod-autoscaler  failed to get external metric queuemessages: unable to get external metric autoscaling-sandbox/queuemessages/&LabelSelector{MatchLabels:map[string]string{aggregation: Total,filter: EntityName_eq_externalq,metricName: Messages,resourceGroup: promitor,resourceName: promitor-messaging,resourceProviderNamespace: Microsoft.Servicebus,resourceType: namespaces,},MatchExpressions:[],}: unable to fetch metrics from external metrics API: insights.MetricsClient#List: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code="SubscriptionNotFound" Message="The subscription 'resourceGroups' could not be found."
