
linode-cloud-controller-manager's Introduction

Kubernetes Cloud Controller Manager for Linode


The purpose of the CCM

The Linode Cloud Controller Manager (CCM) creates a fully supported Kubernetes experience on Linode.

  • Load balancers, Linode NodeBalancers, are automatically deployed when a Kubernetes Service of type "LoadBalancer" is deployed. This is the most reliable way to allow services running in your cluster to be reachable from the Internet.
  • Linode hostnames and network addresses (private/public IPs) are automatically associated with their corresponding Kubernetes resources, forming the basis for a variety of Kubernetes features.
  • Node resources are put into the correct state when Linodes are shut down, allowing pods to be appropriately rescheduled.
  • Nodes are annotated with the Linode region, which is the basis for scheduling based on failure domains.

Kubernetes Supported Versions

Kubernetes 1.9+

Usage

LoadBalancer Services

Kubernetes Services of type LoadBalancer are served through a Linode NodeBalancer by default, which the Cloud Controller Manager provisions on demand. For general feature and usage notes, refer to the Getting Started with Linode NodeBalancers guide.

Using IP Sharing instead of NodeBalancers

Alternatively, the Linode CCM can integrate with Cilium's BGP Control Plane to perform load-balancing via IP sharing on labeled Nodes. This option does not create a backing NodeBalancer and instead provisions a new IP on an ip-holder Nanode to share for the desired region. See Shared IP LoadBalancing.

Annotations

The Linode CCM accepts several annotations which affect the properties of the underlying NodeBalancer deployment.

All of the Service annotation names listed below have been shortened for readability. The values, such as http, are case-sensitive.

Each Service annotation MUST be prefixed with:
service.beta.kubernetes.io/linode-loadbalancer-

Annotation (Suffix) Values Default Description
throttle 0-20 (0 to disable) 0 Client Connection Throttle, which limits the number of subsequent new connections per second from the same client IP
default-protocol tcp, http, https tcp Specifies the default protocol for the Linode NodeBalancer.
default-proxy-protocol none, v1, v2 none Specifies whether to use a version of Proxy Protocol on the underlying NodeBalancer.
port-* json (e.g. { "tls-secret-name": "prod-app-tls", "protocol": "https", "proxy-protocol": "v2"}) Specifies port specific NodeBalancer configuration. See Port Specific Configuration. * is the port being configured, e.g. linode-loadbalancer-port-443
check-type none, connection, http, http_body The type of health check to perform against back-ends to ensure they are serving requests
check-path string The URL path to check on each back-end during health checks
check-body string Text which must be present in the response body to pass the NodeBalancer health check
check-interval int Duration, in seconds, to wait between health checks
check-timeout int (1-30) Duration, in seconds, to wait for a health check to succeed before considering it a failure
check-attempts int (1-30) Number of health check failures necessary to remove a back-end from the service
check-passive bool false When true, 5xx status codes will cause the health check to fail
preserve bool false When true, deleting a LoadBalancer service does not delete the underlying NodeBalancer. This will also prevent deletion of the former LoadBalancer when another one is specified with the nodebalancer-id annotation.
nodebalancer-id string The ID of the NodeBalancer to front the service. When not specified, a new NodeBalancer will be created. This can be configured on service creation or patching
hostname-only-ingress bool false When true, the LoadBalancerStatus for the service will only contain the Hostname. This is useful for bypassing kube-proxy's rerouting of in-cluster requests originally intended for the external LoadBalancer to the service's constituent pod IPs.
tags string A comma-separated list of tags to be applied to the created NodeBalancer instance
firewall-id string An existing Cloud Firewall ID to be attached to the NodeBalancer instance. See Firewalls.
firewall-acl string The Firewall rules to be applied to the NodeBalancer. Adding this annotation creates a new CCM managed Linode CloudFirewall instance. See Firewalls.
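
For example, a Service that reuses an existing NodeBalancer, preserves it on deletion, and tags it could combine several of the annotations above. This is a minimal sketch; the NodeBalancer ID and tag values are placeholders:

kind: Service
apiVersion: v1
metadata:
  name: example-lb
  annotations:
    # Placeholder NodeBalancer ID; omit to have the CCM create a new NodeBalancer
    service.beta.kubernetes.io/linode-loadbalancer-nodebalancer-id: "12345"
    # Keep the NodeBalancer around when this Service is deleted
    service.beta.kubernetes.io/linode-loadbalancer-preserve: "true"
    # Placeholder tags applied to the NodeBalancer
    service.beta.kubernetes.io/linode-loadbalancer-tags: "production,team-a"
spec:
  type: LoadBalancer
  selector:
    app: example
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80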

Deprecated Annotations

These annotations are deprecated, and will be removed in a future release.

Annotation (Suffix) Values Default Description Scheduled Removal
proxy-protocol none, v1, v2 none Specifies whether to use a version of Proxy Protocol on the underlying NodeBalancer Q4 2021

Annotation bool values

For annotations with bool value types, "1", "t", "T", "TRUE", "true" and "True" are valid string representations of true. Any other values will be interpreted as false. For more details, see strconv.ParseBool.

Port Specific Configuration

These configuration options can be specified via the port-* annotation, encoded in JSON.

Key Values Default Description
protocol tcp, http, https tcp Specifies protocol of the NodeBalancer port. Overwrites default-protocol.
proxy-protocol none, v1, v2 none Specifies whether to use a version of Proxy Protocol on the underlying NodeBalancer. Overwrites default-proxy-protocol.
tls-secret-name string Specifies a secret to use for TLS. The secret type should be kubernetes.io/tls.
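
As an illustration, a Service exposing ports 80 and 443 could use per-port JSON to enable PROXY protocol v1 on port 80 and terminate TLS on port 443. This is only a sketch; the secret name and Service details are placeholders:

kind: Service
apiVersion: v1
metadata:
  name: mixed-protocol-lb
  annotations:
    service.beta.kubernetes.io/linode-loadbalancer-default-protocol: "tcp"
    # Port 80 stays TCP (the default) but adds PROXY protocol v1
    service.beta.kubernetes.io/linode-loadbalancer-port-80: |
      { "proxy-protocol": "v1" }
    # Port 443 overrides the default protocol and terminates TLS with a placeholder secret
    service.beta.kubernetes.io/linode-loadbalancer-port-443: |
      { "protocol": "https", "tls-secret-name": "example-tls" }
spec:
  type: LoadBalancer
  selector:
    app: example
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
    - name: https
      protocol: TCP
      port: 443
      targetPort: 443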

Shared IP Load-Balancing

NOTE: This feature requires contacting Customer Support to enable provisioning additional IPs.

Services of type: LoadBalancer can receive an external IP not backed by a NodeBalancer if --bgp-node-selector is set on the Linode CCM and --load-balancer-type is set to cilium-bgp. Additionally, the LINODE_URL environment variable in the Linode CCM needs to be set to "https://api.linode.com/v4beta" for IP sharing to work.

This feature requires the Kubernetes cluster to be using Cilium as the CNI with the bgp-control-plane feature enabled.

Example Daemonset configuration:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ccm-linode
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - image: linode/linode-cloud-controller-manager:latest
          name: ccm-linode
          env:
            - name: LINODE_URL
              value: https://api.linode.com/v4beta
          args:
          - --bgp-node-selector=cilium-bgp-peering=true
          - --load-balancer-type=cilium-bgp
...
Example Helm chart configuration:
sharedIPLoadBalancing:
  loadBalancerType: cilium-bgp
  bgpNodeSelector: cilium-bgp-peering=true

Firewalls

Firewall rules can be applied to the CCM Managed NodeBalancers in two distinct ways.

CCM Managed Firewall

To use this feature, ensure that the Linode API token used with the CCM has the add_firewalls grant.

The CCM accepts firewall ACLs in JSON form. The ACL can be either an allowList or a denyList; supplying both, or neither, is not supported. The allowList sets up a CloudFirewall that ACCEPTs traffic only from the specified IPs/CIDRs and DROPs everything else. The denyList sets up a CloudFirewall that DROPs traffic only from the specified IPs/CIDRs and ACCEPTs everything else. Ports are automatically inferred from the service configuration.

See Firewall rules for more details on how to specify the IPs/CIDRs.

Example usage of an ACL to allow traffic from a specific set of addresses

kind: Service
apiVersion: v1
metadata:
  name: https-lb
  annotations:
    service.beta.kubernetes.io/linode-loadbalancer-firewall-acl: |
      {
        "allowList": {
          "ipv4": ["192.166.0.0/16", "172.23.41.0/24"],
          "ipv6": ["2001:DB8::/128"]
        }
      }
spec:
  type: LoadBalancer
  selector:
    app: nginx-https-example
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: http
    - name: https
      protocol: TCP
      port: 443
      targetPort: https
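
Example usage of an ACL to deny traffic from a specific set of addresses, mirroring the allowList example above (a minimal sketch; the addresses and Service details are placeholders):

kind: Service
apiVersion: v1
metadata:
  name: https-lb-denylist
  annotations:
    service.beta.kubernetes.io/linode-loadbalancer-firewall-acl: |
      {
        "denyList": {
          "ipv4": ["203.0.113.0/24"],
          "ipv6": ["2001:DB8::/128"]
        }
      }
spec:
  type: LoadBalancer
  selector:
    app: nginx-https-example
  ports:
    - name: https
      protocol: TCP
      port: 443
      targetPort: https
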
User Managed Firewall

Users can create CloudFirewall instances, supply their own rules and attach them to the NodeBalancer. To do so, set the service.beta.kubernetes.io/linode-loadbalancer-firewall-id annotation to the ID of the cloud firewall. The CCM does not manage the lifecycle of the CloudFirewall Instance in this case. Users are responsible for ensuring the policies are correct.

Note
If the user supplies a firewall-id, and later switches to using an ACL, the CCM will take over the CloudFirewall Instance. To avoid this, delete the service, and re-create it so the original CloudFirewall is left undisturbed.
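
For example, attaching an existing Cloud Firewall by its ID might look like the following. This is a minimal sketch; the firewall ID and Service details are placeholders:

kind: Service
apiVersion: v1
metadata:
  name: https-lb
  annotations:
    # Placeholder ID of a user-managed Cloud Firewall
    service.beta.kubernetes.io/linode-loadbalancer-firewall-id: "12345"
spec:
  type: LoadBalancer
  selector:
    app: nginx-https-example
  ports:
    - name: https
      protocol: TCP
      port: 443
      targetPort: https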

Routes

When running Kubernetes clusters within a VPC, node-specific pod CIDRs need to be allowed on the VPC interface. The Linode CCM includes route-controller functionality that can be enabled to automatically add and delete routes on VPC interfaces. When installing the CCM with Helm, make sure to specify the routeController settings.

Example usage in values.yaml
routeController:
  vpcName: <name of VPC>
  clusterCIDR: 10.0.0.0/8
  configureCloudRoutes: true

Nodes

Kubernetes Nodes can be configured with the following annotations.

Each Node annotation MUST be prefixed with:
node.k8s.linode.com/

Key Values Default Description
private-ip IPv4 none Specifies the Linode Private IP overriding default detection of the Node InternalIP.
When using a VLAN or VPC, the Node InternalIP may not be a Linode Private IP as required for NodeBalancers and should be specified.
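
A minimal sketch of how this annotation appears on a Node object (the node name and IP address are placeholders):

apiVersion: v1
kind: Node
metadata:
  name: my-node
  annotations:
    # Placeholder Linode private IP to use as the Node InternalIP
    node.k8s.linode.com/private-ip: "192.168.200.10"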

Example usage

kind: Service
apiVersion: v1
metadata:
  name: https-lb
  annotations:
    service.beta.kubernetes.io/linode-loadbalancer-throttle: "4"
    service.beta.kubernetes.io/linode-loadbalancer-default-protocol: "http"
    service.beta.kubernetes.io/linode-loadbalancer-port-443: |
      {
        "tls-secret-name": "example-secret",
        "protocol": "https"
      }
spec:
  type: LoadBalancer
  selector:
    app: nginx-https-example
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: http
    - name: https
      protocol: TCP
      port: 443
      targetPort: https

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-https-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-https-example
  template:
    metadata:
      labels:
        app: nginx-https-example
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
          - name: http
            containerPort: 80
            protocol: TCP
          - name: https
            containerPort: 80
            protocol: TCP

See more in the examples directory

Why stickiness and algorithm annotations don't exist

Because kube-proxy double-hops incoming traffic to a random backend Pod regardless of which backend Node the NodeBalancer forwards to, configuring session stickiness or a balancing algorithm at the NodeBalancer level would have no effect on which Pod ultimately serves a request. These annotations are therefore not provided.

How to use sessionAffinity

In Kubernetes, sessionAffinity refers to a mechanism that allows a client to always be routed to the same pod when it hits a service.

To enable sessionAffinity, the service.spec.sessionAffinity field must be set to ClientIP, as in the following Service YAML:

apiVersion: v1
kind: Service
metadata:
  name: wordpress-lsmnl-wordpress
  namespace: wordpress-lsmnl
  labels:
    app: wordpress-lsmnl-wordpress
spec:
  type: LoadBalancer
  selector:
    app: wordpress-lsmnl-wordpress
  sessionAffinity: ClientIP

The maximum session sticky time can be set via the service.spec.sessionAffinityConfig.clientIP.timeoutSeconds field, as below:

sessionAffinityConfig:
  clientIP:
    timeoutSeconds: 100

Generating a Manifest for Deployment

Use the script located at ./deploy/generate-manifest.sh to generate a self-contained deployment manifest for the Linode CCM. Two arguments are required.

The first argument must be a Linode APIv4 Personal Access Token with all permissions. (https://cloud.linode.com/profile/tokens)

The second argument must be a Linode region. (https://api.linode.com/v4/regions)

Example:

./deploy/generate-manifest.sh $LINODE_API_TOKEN us-east

This will create a file ccm-linode.yaml which you can use to deploy the CCM.

kubectl apply -f ccm-linode.yaml

Note: Your kubelets, controller-manager, and apiserver must be started with --cloud-provider=external as noted in the following documentation.

Deployment Through Helm Chart

LINODE_API_TOKEN must be a Linode APIv4 Personal Access Token with all permissions.

REGION must be a Linode region.

Install the ccm-linode repo

helm repo add ccm-linode https://linode.github.io/linode-cloud-controller-manager/ 
helm repo update ccm-linode

To deploy ccm-linode, run the following commands:

export VERSION=v0.3.22
export LINODE_API_TOKEN=<linodeapitoken>
export REGION=<linoderegion>
helm install ccm-linode --set apiToken=$LINODE_API_TOKEN,region=$REGION ccm-linode/ccm-linode

See helm install for command documentation.

To uninstall ccm-linode from the Kubernetes cluster, run the following command:

helm uninstall ccm-linode

See helm uninstall for command documentation.

To upgrade when new changes are made to the Helm chart, run the following commands:

export VERSION=v0.3.22
export LINODE_API_TOKEN=<linodeapitoken>
export REGION=<linoderegion>

helm upgrade ccm-linode --install --set apiToken=$LINODE_API_TOKEN,region=$REGION ccm-linode/ccm-linode

See helm upgrade for command documentation.

Configurations

There are other variables that can be set to different values. For a list of all modifiable variables/values, take a look at './deploy/chart/values.yaml'.

Values can be set/overridden by using the '--set var=value,...' flag or by passing in a custom-values.yaml using '-f custom-values.yaml'.

Recommendation: Use custom-values.yaml to override the variables to avoid any errors with template rendering
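
A custom-values.yaml might look like the following (a minimal sketch; the token, region, and VPC name are placeholders, and ./deploy/chart/values.yaml lists everything that can be set):

# Placeholder credentials and region for the chart
apiToken: "<linodeapitoken>"
region: "<linoderegion>"
# Optional route-controller settings (see the Routes section above)
routeController:
  vpcName: "<name of VPC>"
  clusterCIDR: 10.0.0.0/8
  configureCloudRoutes: true

It can then be passed to helm install or helm upgrade with '-f custom-values.yaml' in place of the '--set' flags shown above.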

Upstream Documentation Including Deployment Instructions

Kubernetes Cloud Controller Manager.

Upstream Developer Documentation

Developing a Cloud Controller Manager.

Development Guide

Building the Linode Cloud Controller Manager

Some of the Linode Cloud Controller Manager development helper scripts rely on a fairly up-to-date GNU tools environment, so most recent Linux distros should work just fine out-of-the-box.

Setup Go

The Linode Cloud Controller Manager is written in Google's Go programming language. Currently, the Linode Cloud Controller Manager is developed and tested on Go 1.8.3. If you haven't set up a Go development environment, please follow these instructions to install Go.

On macOS, Homebrew has a nice package

brew install golang

Download Source

go get github.com/linode/linode-cloud-controller-manager
cd $(go env GOPATH)/src/github.com/linode/linode-cloud-controller-manager

Install Dev tools

To install various dev tools for the Linode Cloud Controller Manager, run the following command:

./hack/builddeps.sh

Build Binary

Use the following Make targets to build and run a local binary

$ make build
$ make run
# You can also run the binary directly to pass additional args
$ dist/linode-cloud-controller-manager

Dependency management

Linode Cloud Controller Manager uses Go Modules to manage dependencies. If you want to update/add dependencies, run:

go mod tidy

Building Docker images

To build and push a Docker image, use the following make targets.

# Set the repo/image:tag with the IMG environment variable
# Then run the docker-build make target
$ IMG=linode/linode-cloud-controller-manager:canary make docker-build

# Push Image
$ IMG=linode/linode-cloud-controller-manager:canary make docker-push

Then, to run the image

docker run -ti linode/linode-cloud-controller-manager:canary

Contribution Guidelines

Want to improve the linode-cloud-controller-manager? Please start here.

Join the Kubernetes Community

For general help or discussion, join us in #linode on the Kubernetes Slack. To sign up, use the Kubernetes Slack inviter.

linode-cloud-controller-manager's People

Contributors

0xch4z, akaokunc, asauber, ashleydumaine, cliedeman, displague, eljohnson92, frenchtoasters, jawshua, jnschaeffer, lbgarber, luthermonson, mhmxs, michkov, okokes-akamai, phillc, rahulait, rammanoj, rl0nergan, sanjid133, sarahelizgray, scmeggitt, shanduur, sibucan, srust, tamalsaha, tchinmai7, thorn3r, wbh1, zliang-akamai


linode-cloud-controller-manager's Issues

CCM does not integrate with a K3s cluster

General:

  • [Yes] Have you removed all sensitive information, including but not limited to access keys and passwords?
  • [Yes] Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Bug Reporting

Linode CCM does not get integrated with the K3s cluster.

Steps to reproduce the problem

Provision a K3s master on Ubuntu 18.04 Linode with the following:

  1. K3s master with the following arguments:
export KUBELET_EXTRA_ARGS=--cloud-provideer=external
export KUBELET_EXTRA_ARGS=--provider-id=linode://$20216373
export K3S_KUBECONFIG_MODE="644"
export INSTALL_K3S_EXEC=" --no-deploy servicelb --disable-cloud-controller "
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
curl -sfL https://get.k3s.io | sh -

  2. Run the following: kubectl get pods --namespace kube-system
    Output is:
NAME                                      READY   STATUS      RESTARTS   AGE
local-path-provisioner-58fb86bdfd-855vg   1/1     Running     0          15h
metrics-server-6d684c7b5-wv5s8            1/1     Running     0          15h
helm-install-traefik-hnbqq                0/1     Completed   0          15h
coredns-6c6bb68b64-qhpn4                  1/1     Running     0          15h
traefik-7b8b884c8-9wnz6                   1/1     Running     0          15h

  3. Deploy ccm-linode.yaml using: kubectl apply -f ccm-linode.yaml
    output is similar to:
secret/ccm-linode created
serviceaccount/ccm-linode created
clusterrolebinding.rbac.authorization.k8s.io/system:ccm-linode created
daemonset.apps/ccm-linode created

  4. Deploy a K3s agent with the following:
export K3S_KUBECONFIG_MODE="644"
export K3S_URL="https://45.79.120.176:6443"
export K3S_TOKEN="K1007cc0062e2e7d06dabf7365d59a016043a8aea9677ebee9e4b433b19efde31ef::server:89baaaf06ce09843905f2dcb81ca8852"
export KUBELET_EXTRA_ARGS=--cloud-provideer=external
export KUBELET_EXTRA_ARGS=--provider-id=linode://$20216573
  5. Deploy drupal, and check the service using: kubectl get services drupal
    output is similar to:
NAME     TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)        AGE
drupal   LoadBalancer   10.0.0.89      <pending>       8081:31809/TCP   33m

Expected Behavior

The output should be similar to:

NAME     TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)        AGE
drupal   LoadBalancer   10.0.0.89      192.0.2.0       8081:31809/TCP   33m

and a nodebalancer should be created in the cloud manager.

Actual Behavior

Environment Specifications

20216373 and 20216573 are the Linode ids for the master and agent

Screenshots, Code Blocks, and Logs

Additional Notes



Manifest generation script fails with long API tokens

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Feature Requests:

  • Have you explained your rationale for why this feature is needed?
  • Have you offered a proposed implementation/solution?

Bug Reporting

If the Linode API token is longer than 58 characters, the base64-encoded string includes carriage returns, which causes the deploy/generate-manifest.sh script to fail.

Expected Behavior

deploy/generate-manifest.sh runs without errors and correctly creates the ccm-linode.yaml file

Actual Behavior

The script fails with the following error:
sed: -e expression #1, char 104: unterminated `s' command

and the ccm-linode.yaml generated file is empty

Steps to Reproduce the Problem

  1. export LINODE_API_TOKEN=some_long_string
  2. ./deploy/generate-manifest.sh $LINODE_API_TOKEN us-central
  3. The script fails

Environment Specifications

Screenshots, Code Blocks, and Logs

Additional Notes



ccm-linode doesn't start properly on k8s v1.19.3

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?
    Hello!
    I am currently trying to manually deploy a k8s cluster on Linode VMs and am stuck on deploying the Linode cloud controller manager
    following the manual on GitHub. The ccm-linode pods start with status "Running", but the pod logs contain a lot of errors:
Linode Cloud Controller Manager starting up
SENTRY_DSN not set, not initializing Sentry
ERROR: logging before flag.Parse: I1126 16:07:37.724777       1 flags.go:27] FLAG: --address="0.0.0.0"
ERROR: logging before flag.Parse: I1126 16:07:37.724881       1 flags.go:27] FLAG: --allocate-node-cidrs="false"
ERROR: logging before flag.Parse: I1126 16:07:37.724890       1 flags.go:27] FLAG: --allow-untagged-cloud="false"
ERROR: logging before flag.Parse: I1126 16:07:37.724896       1 flags.go:27] FLAG: --alsologtostderr="false"
ERROR: logging before flag.Parse: I1126 16:07:37.724902       1 flags.go:27] FLAG: --bind-address="0.0.0.0"
ERROR: logging before flag.Parse: I1126 16:07:37.724908       1 flags.go:27] FLAG: --cert-dir="/var/run/kubernetes"
ERROR: logging before flag.Parse: I1126 16:07:37.724915       1 flags.go:27] FLAG: --cidr-allocator-type="RangeAllocator"
ERROR: logging before flag.Parse: I1126 16:07:37.724921       1 flags.go:27] FLAG: --cloud-config=""
ERROR: logging before flag.Parse: I1126 16:07:37.724926       1 flags.go:27] FLAG: --cloud-provider="linode"
ERROR: logging before flag.Parse: I1126 16:07:37.724932       1 flags.go:27] FLAG: --cluster-cidr=""
ERROR: logging before flag.Parse: I1126 16:07:37.724936       1 flags.go:27] FLAG: --cluster-name="kubernetes"
ERROR: logging before flag.Parse: I1126 16:07:37.724941       1 flags.go:27] FLAG: --concurrent-service-syncs="1"
ERROR: logging before flag.Parse: I1126 16:07:37.724948       1 flags.go:27] FLAG: --configure-cloud-routes="true"
ERROR: logging before flag.Parse: I1126 16:07:37.724953       1 flags.go:27] FLAG: --contention-profiling="false"
ERROR: logging before flag.Parse: I1126 16:07:37.724958       1 flags.go:27] FLAG: --controller-start-interval="0s"
ERROR: logging before flag.Parse: I1126 16:07:37.724965       1 flags.go:27] FLAG: --feature-gates=""
ERROR: logging before flag.Parse: I1126 16:07:37.724974       1 flags.go:27] FLAG: --help="false"
ERROR: logging before flag.Parse: I1126 16:07:37.724979       1 flags.go:27] FLAG: --http2-max-streams-per-connection="0"
ERROR: logging before flag.Parse: I1126 16:07:37.724986       1 flags.go:27] FLAG: --kube-api-burst="30"
ERROR: logging before flag.Parse: I1126 16:07:37.724991       1 flags.go:27] FLAG: --kube-api-content-type="application/vnd.kubernetes.protobuf"
ERROR: logging before flag.Parse: I1126 16:07:37.724997       1 flags.go:27] FLAG: --kube-api-qps="20"
ERROR: logging before flag.Parse: I1126 16:07:37.725005       1 flags.go:27] FLAG: --kubeconfig=""
ERROR: logging before flag.Parse: I1126 16:07:37.725010       1 flags.go:27] FLAG: --leader-elect="true"
ERROR: logging before flag.Parse: I1126 16:07:37.725015       1 flags.go:27] FLAG: --leader-elect-lease-duration="15s"
ERROR: logging before flag.Parse: I1126 16:07:37.725020       1 flags.go:27] FLAG: --leader-elect-renew-deadline="10s"
ERROR: logging before flag.Parse: I1126 16:07:37.725025       1 flags.go:27] FLAG: --leader-elect-resource-lock="endpoints"
ERROR: logging before flag.Parse: I1126 16:07:37.725030       1 flags.go:27] FLAG: --leader-elect-retry-period="2s"
ERROR: logging before flag.Parse: I1126 16:07:37.725036       1 flags.go:27] FLAG: --linodego-debug="false"
ERROR: logging before flag.Parse: I1126 16:07:37.725041       1 flags.go:27] FLAG: --log-backtrace-at=":0"
ERROR: logging before flag.Parse: I1126 16:07:37.725051       1 flags.go:27] FLAG: --log-dir=""
ERROR: logging before flag.Parse: I1126 16:07:37.725056       1 flags.go:27] FLAG: --log-flush-frequency="5s"
ERROR: logging before flag.Parse: I1126 16:07:37.725061       1 flags.go:27] FLAG: --logtostderr="true"
ERROR: logging before flag.Parse: I1126 16:07:37.725066       1 flags.go:27] FLAG: --master=""
ERROR: logging before flag.Parse: I1126 16:07:37.725071       1 flags.go:27] FLAG: --min-resync-period="12h0m0s"
ERROR: logging before flag.Parse: I1126 16:07:37.725077       1 flags.go:27] FLAG: --node-monitor-period="5s"
ERROR: logging before flag.Parse: I1126 16:07:37.725082       1 flags.go:27] FLAG: --node-status-update-frequency="5m0s"
ERROR: logging before flag.Parse: I1126 16:07:37.725087       1 flags.go:27] FLAG: --node-sync-period="0s"
ERROR: logging before flag.Parse: I1126 16:07:37.725092       1 flags.go:27] FLAG: --port="0"
ERROR: logging before flag.Parse: I1126 16:07:37.725659       1 flags.go:27] FLAG: --profiling="false"
ERROR: logging before flag.Parse: I1126 16:07:37.725676       1 flags.go:27] FLAG: --route-reconciliation-period="10s"
ERROR: logging before flag.Parse: I1126 16:07:37.725683       1 flags.go:27] FLAG: --secure-port="10253"
ERROR: logging before flag.Parse: I1126 16:07:37.725701       1 flags.go:27] FLAG: --stderrthreshold="2"
ERROR: logging before flag.Parse: I1126 16:07:37.725707       1 flags.go:27] FLAG: --tls-cert-file=""
ERROR: logging before flag.Parse: I1126 16:07:37.725713       1 flags.go:27] FLAG: --tls-cipher-suites="[]"
ERROR: logging before flag.Parse: I1126 16:07:37.725731       1 flags.go:27] FLAG: --tls-min-version=""
ERROR: logging before flag.Parse: I1126 16:07:37.725736       1 flags.go:27] FLAG: --tls-private-key-file=""
ERROR: logging before flag.Parse: I1126 16:07:37.725741       1 flags.go:27] FLAG: --tls-sni-cert-key="[]"
ERROR: logging before flag.Parse: I1126 16:07:37.725750       1 flags.go:27] FLAG: --use-service-account-credentials="false"
ERROR: logging before flag.Parse: I1126 16:07:37.725756       1 flags.go:27] FLAG: --v="3"
ERROR: logging before flag.Parse: I1126 16:07:37.725761       1 flags.go:27] FLAG: --version="false"
ERROR: logging before flag.Parse: I1126 16:07:37.725786       1 flags.go:27] FLAG: --vmodule=""
ERROR: logging before flag.Parse: W1126 16:07:37.726762       1 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
ERROR: logging before flag.Parse: W1126 16:07:37.743830       1 authentication.go:55] Authentication is disabled
ERROR: logging before flag.Parse: I1126 16:07:37.743896       1 serve.go:96] Serving securely on [::]:10253
ERROR: logging before flag.Parse: I1126 16:07:37.745494       1 leaderelection.go:185] attempting to acquire leader lease  kube-system/cloud-controller-manager...
ERROR: logging before flag.Parse: I1126 16:07:37.901579       1 leaderelection.go:194] successfully acquired lease kube-system/cloud-controller-manager
ERROR: logging before flag.Parse: I1126 16:07:37.906483       1 node_controller.go:89] Sending events to api server.
ERROR: logging before flag.Parse: I1126 16:07:37.907681       1 reflector.go:202] Starting reflector *v1.Service (0s) from pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:99
ERROR: logging before flag.Parse: I1126 16:07:37.907854       1 reflector.go:240] Listing and watching *v1.Service from pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:99
ERROR: logging before flag.Parse: I1126 16:07:37.908661       1 event.go:221] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"kube-system", Name:"cloud-controller-manager", UID:"46e55630-e01e-463f-9c7e-db2ea8759096", APIVersion:"v1", ResourceVersion:"3204", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' li1406-143.members.linode.com_80b62809-3001-11eb-9103-f23c924ffb7c became leader
ERROR: logging before flag.Parse: I1126 16:07:37.910077       1 pvlcontroller.go:107] Starting PersistentVolumeLabelController
ERROR: logging before flag.Parse: I1126 16:07:37.910174       1 controller_utils.go:1025] Waiting for caches to sync for persistent volume label controller
ERROR: logging before flag.Parse: I1126 16:07:37.910366       1 reflector.go:202] Starting reflector *v1.PersistentVolume (0s) from pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:99
ERROR: logging before flag.Parse: I1126 16:07:37.910424       1 reflector.go:240] Listing and watching *v1.PersistentVolume from pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:99
ERROR: logging before flag.Parse: I1126 16:07:37.910854       1 controllermanager.go:264] Will not configure cloud provider routes for allocate-node-cidrs: false, configure-cloud-routes: true.
ERROR: logging before flag.Parse: I1126 16:07:37.911017       1 service_controller.go:183] Starting service controller
ERROR: logging before flag.Parse: I1126 16:07:37.911030       1 controller_utils.go:1025] Waiting for caches to sync for service controller
ERROR: logging before flag.Parse: I1126 16:07:38.010438       1 controller_utils.go:1032] Caches are synced for persistent volume label controller
ERROR: logging before flag.Parse: I1126 16:07:38.013170       1 reflector.go:202] Starting reflector *v1.Service (30s) from pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:99
ERROR: logging before flag.Parse: I1126 16:07:38.013207       1 reflector.go:240] Listing and watching *v1.Service from pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:99
ERROR: logging before flag.Parse: I1126 16:07:38.013666       1 reflector.go:202] Starting reflector *v1.Node (14h44m20.885595692s) from pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:99
ERROR: logging before flag.Parse: I1126 16:07:38.013694       1 reflector.go:240] Listing and watching *v1.Node from pkg/mod/k8s.io/[email protected]+incompatible/tools/cache/reflector.go:99
ERROR: logging before flag.Parse: I1126 16:07:38.115381       1 controller_utils.go:1032] Caches are synced for service controller
ERROR: logging before flag.Parse: I1126 16:07:38.115568       1 service_controller.go:636] Detected change in list of current cluster nodes. New node set: map[li1403-18.members.linode.com:{} li1483-253.members.linode.com:{} li1800-66.members.linode.com:{}]
ERROR: logging before flag.Parse: I1126 16:07:38.115687       1 service_controller.go:644] Successfully updated 0 out of 0 load balancers to direct traffic to the updated set of nodes
ERROR: logging before flag.Parse: I1126 16:07:38.115825       1 service_controller.go:326] Not persisting unchanged LoadBalancerStatus for service kube-system/kube-dns to registry.
ERROR: logging before flag.Parse: I1126 16:07:38.115850       1 service_controller.go:326] Not persisting unchanged LoadBalancerStatus for service calico-system/calico-typha to registry.
ERROR: logging before flag.Parse: I1126 16:07:38.115866       1 service_controller.go:326] Not persisting unchanged LoadBalancerStatus for service default/kubernetes to registry.
ERROR: logging before flag.Parse: E1126 16:07:38.825542       1 node_controller.go:353] failed to set node provider id: failed to get instance ID from cloud provider: instance not found
ERROR: logging before flag.Parse: E1126 16:07:39.080552       1 node_controller.go:417] NodeAddress: Error fetching by providerID: providerID cannot be empty string Error fetching by NodeName: instance not found
ERROR: logging before flag.Parse: E1126 16:07:39.292917       1 node_controller.go:353] failed to set node provider id: failed to get instance ID from cloud provider: instance not found
ERROR: logging before flag.Parse: E1126 16:07:39.484441       1 node_controller.go:417] NodeAddress: Error fetching by providerID: providerID cannot be empty string Error fetching by NodeName: instance not found
ERROR: logging before flag.Parse: E1126 16:07:39.754549       1 node_controller.go:353] failed to set node provider id: failed to get instance ID from cloud provider: instance not found
ERROR: logging before flag.Parse: E1126 16:07:39.939516       1 node_controller.go:417] NodeAddress: Error fetching by providerID: providerID cannot be empty string Error fetching by NodeName: instance not found
ERROR: logging before flag.Parse: E1126 16:07:40.178981       1 node_controller.go:353] failed to set node provider id: failed to get instance ID from cloud provider: instance not found

and coredns stuck in pending state.
Is there something else that must be done beyond what is described in the manual, or does the CCM not support k8s 1.19.3?
kubelet, kubectl, kubeadm v1.19.3


Bug Reporting

Expected Behavior

ccm pods starting properly & Linode NodeBalancer created on creating service with type LoadBalancer

Actual Behavior

ccm pods started with state Running and errors in log, k8s coredns pods stuck in pending state

Steps to Reproduce the Problem

  1. Create k8s cluster on Linode VMs
  2. Generate ccm manifest with Linode account api access token with full access
  3. apply generated manifest to cluster

Environment Specifications

VMs with centos7

Screenshots, Code Blocks, and Logs

Additional Notes



Deployment not fit for production use

I noticed that the generated CCM deployment uses the latest tag, which makes it unsuitable for production clusters. You should use versioning with a compatibility table (CCM version - compatible k8s versions).
Many companies won't use a managed k8s solution and prefer to deploy k8s themselves.

Toolchain directive is missing from go.mod files

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Feature Requests:

  • Have you explained your rationale for why this feature is needed?
  • Have you offered a proposed implementation/solution?

Bug Reporting

The toolchain directive is missing from go.mod files. The sad part is, once we add them, lint actions on github actions would fail.

https://github.com/linode/linode-cloud-controller-manager/actions/runs/8650313748/job/23719272139#step:5:37



Set PTR for Linode nodes IPs

Standalone Linodes have the option to set Reverse DNS (RDNS / PTR record on the IP address).

For LKE: While the underlying Linodes in an LKE cluster are still accessible (and we can set PTR records on the Linodes there), this won't survive a node recycle action.

Would it be possible to declare the desired PTR setting for all nodes in a LKE cluster? This would allow any node to send emails with the correct PTR setting, even if the nodes are recycled.

Add Firewall ACL Rules as an Annotation

Now that we have firewall support, we should extend the story to passing in the information needed to create the firewall and access rules. Add enough annotations to create a firewall with ACLs, create it, and attach it to the NodeBalancer. Bonus points: due to account limits, try to find a way to reuse firewalls if the ACL rules are identical. This might be easy with config maps? Perhaps the rules live there, and when two services' annotations reference the same config map, only the one resource is used in Linode.

CCM Doesn't Work When VLAN IP Is Configured As Node's Internal IP

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Bug Reporting

The problem is that the Linode API /instance/{instance_id}/ips endpoint does not include VLAN IP address in its response. That causes this check to fail because it does not find the configured IP address in the list of IPs that the CCM provided for the instance:

https://github.com/kubernetes-sigs/cloud-provider-azure/blob/deef381acb3c34b95223c3789e7a3edbbc32c1b4/pkg/nodemanager/nodemanager.go#L558

Expected Behavior

VLAN IP should be supported for the node's internal IP

Steps to Reproduce the Problem

  1. Deploy CCM onto a new single node RKE cluster with the internal IP set to VLAN IP
  2. Check logs of the CCM pod

Environment Specifications

Kubernetes: 1.21.7
RKE: 1.3.15

Screenshots, Code Blocks, and Logs

E1130 18:33:49.503962       1 node_controller.go:212] error syncing 'rancher-dev': failed to get node modifiers from cloud provider: failed to find kubelet node IP from cloud provider, requeuing
2022/11/30 18:34:22.375645 DEBUG RESTY
==============================================================================
~~~ REQUEST ~~~
GET  /v4/linode/instances/redacted  HTTP/1.1
HOST   : api.linode.com
HEADERS:
Accept: application/json
Authorization: Bearer redacted
Content-Type: application/json
User-Agent: linode-cloud-controller-manager linodego/v0.32.2 https://github.com/linode/linodego
BODY   :
***** NO CONTENT *****
------------------------------------------------------------------------------
~~~ RESPONSE ~~~
STATUS       : 200 OK
RECEIVED AT  : 2022-11-30T18:34:22.375520522Z
TIME DURATION: 95.928433ms
HEADERS      :
Access-Control-Allow-Credentials: true
Access-Control-Allow-Headers: Authorization, Origin, X-Requested-With, Content-Type, Accept, X-Filter
Access-Control-Allow-Methods: HEAD, GET, OPTIONS, POST, PUT, DELETE
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Status
Cache-Control: private, max-age=0, s-maxage=0, no-cache, no-store, private, max-age=60, s-maxage=60
Connection: keep-alive
Content-Length: 736
Content-Security-Policy: default-src 'none'
Content-Type: application/json
Date: Wed, 30 Nov 2022 18:34:22 GMT
Retry-After: 27
Server: nginx
Strict-Transport-Security: max-age=31536000
Vary: Authorization, X-Filter, Authorization, X-Filter
X-Accepted-Oauth-Scopes: linodes:read_only
X-Content-Type-Options: nosniff
X-Customer-Uuid: redacted
X-Frame-Options: DENY, DENY
X-Oauth-Scopes: *
X-Ratelimit-Limit: 800
X-Ratelimit-Remaining: 796
X-Ratelimit-Reset: 1669833290
X-Spec-Version: 4.141.0
X-Xss-Protection: 1; mode=block
BODY         :
{
"id": redacted,
"label": "redacted",
"group": "redacted",
"status": "running",
"created": "2022-11-28T23:38:51",
"updated": "2022-11-28T23:38:51",
"type": "g6-standard-4",
"ipv4": [
"redacted",
"redacted"
],
"ipv6": "redacted",
"image": null,
"region": "us-southeast",
"specs": {
"disk": 163840,
"memory": 8192,
"vcpus": 4,
"gpus": 0,
"transfer": 5000
},
"alerts": {
"cpu": 360,
"network_in": 10,
"network_out": 10,
"transfer_quota": 80,
"io": 10000
},
"backups": {
"enabled": false,
"available": false,
"schedule": {
"day": null,
"window": null
},
"last_successful": null
},
"hypervisor": "kvm",
"watchdog_enabled": true,
"tags": [
"redacted"
],
"host_uuid": "redacted"
}
==============================================================================
2022/11/30 18:34:22.496147 DEBUG RESTY
==============================================================================
~~~ REQUEST ~~~
GET  /v4/linode/instances/redacted/ips  HTTP/1.1
HOST   : api.linode.com
HEADERS:
Accept: application/json
Authorization: Bearer redacted
Content-Type: application/json
User-Agent: linode-cloud-controller-manager linodego/v0.32.2 https://github.com/linode/linodego
BODY   :
***** NO CONTENT *****
------------------------------------------------------------------------------
~~~ RESPONSE ~~~
STATUS       : 200 OK
RECEIVED AT  : 2022-11-30T18:34:22.496016304Z
TIME DURATION: 120.189321ms
HEADERS      :
Access-Control-Allow-Credentials: true
Access-Control-Allow-Headers: Authorization, Origin, X-Requested-With, Content-Type, Accept, X-Filter
Access-Control-Allow-Methods: HEAD, GET, OPTIONS, POST, PUT, DELETE
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Status
Cache-Control: private, max-age=0, s-maxage=0, no-cache, no-store, private, max-age=60, s-maxage=60
Connection: keep-alive
Content-Length: 973
Content-Security-Policy: default-src 'none'
Content-Type: application/json
Date: Wed, 30 Nov 2022 18:34:22 GMT
Retry-After: 27
Server: nginx
Strict-Transport-Security: max-age=31536000
Vary: Authorization, X-Filter, Authorization, X-Filter
X-Accepted-Oauth-Scopes: linodes:read_only
X-Content-Type-Options: nosniff
X-Customer-Uuid: redacted
X-Frame-Options: DENY, DENY
X-Oauth-Scopes: *
X-Ratelimit-Limit: 800
X-Ratelimit-Remaining: 798
X-Ratelimit-Reset: 1669833290
X-Spec-Version: 4.141.0
X-Xss-Protection: 1; mode=block
BODY         :
{
"ipv4": {
"public": [
{
"address": "redacted",
"gateway": "redacted",
"subnet_mask": "255.255.255.0",
"prefix": 24,
"type": "ipv4",
"public": true,
"rdns": "redacted",
"linode_id": redacted,
"region": "us-southeast"
}
],
"private": [
{
"address": "redacted",
"gateway": null,
"subnet_mask": "255.255.128.0",
"prefix": 17,
"type": "ipv4",
"public": false,
"rdns": null,
"linode_id": redacted,
"region": "us-southeast"
}
],
"shared": [],
"reserved": []
},
"ipv6": {
"slaac": {
"address": "redacted",
"gateway": "fe80::1",
"subnet_mask": "ffff:ffff:ffff:ffff::",
"prefix": 64,
"type": "ipv6",
"rdns": null,
"linode_id": redacted,
"region": "us-southeast",
"public": true
},
"link_local": {
"address": "fe80::f03c:93ff:fe09:0573",
"gateway": "fe80::1",
"subnet_mask": "ffff:ffff:ffff:ffff::",
"prefix": 64,
"type": "ipv6",
"rdns": null,
"linode_id": redacted,
"region": "us-southeast",
"public": false
},
"global": []
}
}
==============================================================================

Support tagging resources

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Feature Requests:

  • Have you explained your rationale for why this feature is needed?
  • Have you offered a proposed implementation/solution?

Would like to specify default tags that would be applied to all resources created. This would be useful for identifying and reaping resources in a multi-cluster account.

Stale NBs are left when deleting the underlying K8S cluster

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Bug Reporting

When using the Linode CCM in K8S clusters, deleting the underlying cluster while not deleting the Service leaves a stale NB in the Linode account.

Expected Behavior

Explore having some kind of finalization mechanism to delete the NB when the cluster is being deleted

Actual Behavior

Stale NBs are left in place

Steps to Reproduce the Problem

  1. Create a K8S cluster with the linode-ccm installed on it
  2. Provision a Service of type LoadBalancer
  3. Delete the control plane and nodes of the cluster

Environment Specifications

This happens for KPP clusters but also was reported by externally facing customers who are not using KPP

Additional Notes

I'm not really sure we can address this in the linode-ccm and may need to address it at a platform level but wanted to have this logged and tracked somewhere



limit number of goroutines we create

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

We would like to limit the number of goroutines we create so that we are not hitting the Linode API too many times in a very short period of time. This was feedback in #184 (comment).

It can be achieved by setting g.SetLimit(<some number>). We would like to identify what a good number is to start with and then update linode-ccm to use it.


add route_controller to linode-cloud-controller-manager

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Feature Requests:

  • Have you explained your rationale for why this feature is needed?
  • Have you offered a proposed implementation/solution?

Each node gets a podCIDR allocated by k8s. When k8s clusters are provisioned within VPCs, we would like to configure routes such that the pod traffic flows via the VPC interface. In short, we need to allow those subnets on the linodes in VPC. In cloud-controller-manager, we can leverage route_controller to add/update routes as nodes are added/updated.

We would like to add implementation for route_controller to linode-cloud-controller-manager so that it automatically updates routes when new podCIDRs are assigned to nodes.

Support Firewall attachments to nodebalancers

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Feature Requests:

  • Have you explained your rationale for why this feature is needed?
  • Have you offered a proposed implementation/solution?

Support attaching a firewall to a node-balancer. The preferred approach would be to attach this by setting an annotation on the service. The ccm should attach the firewall specified in the annotation service.beta.kubernetes.io/linode-loadbalancer-firewall-id .

Per-port configs via annotations

The annotations are very useful to configure NodeBalancer options, thank you. It appears the only port-specific options are for protocol and TLS cert, and the rest apply globally to the balancer on every port configuration.

It would be helpful to define options on a per-port basis, the way they're arranged in the Cloud Manager. For example, different services have different compatibility with proxy-protocol, and might have different health check requirements. I'm using Traefik for ingress, and as more non-HTTP services are added as entrypoints, ports are added to the same LoadBalancer resource yet it becomes unreasonable to apply the same NB config to all equally.

One route might be to allow the linode-loadbalancer-port-* JSON entries to contain other annotation key/value pairs, which merge into and override the global values before being passed to the underlying NodeBalancer API for that port-config. Does that make sense?

Thanks!

tcp 0.0.0.0:10260: bind: address already in use

General:

  • [v] Have you removed all sensitive information, including but not limited to access keys and passwords?
  • [v] Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Bug Reporting

Expected Behavior

ccm pod starts on each node

Actual Behavior

ccm pod fails with an error

error: failed to create listener: failed to listen on 0.0.0.0:10260: listen tcp 0.0.0.0:10260: bind: address already in use

Steps to Reproduce the Problem

  1. Launch RKE2 cluster, can be smallest/simplest, one node
  2. Install ccm-linode version 0.4.3
  3. Edit Daemon Set to remove control-plane non-toleration
  4. Observe error mentioned above

Environment Specifications

Screenshots, Code Blocks, and Logs

Additional Notes

I've tested and confirmed it works fine with 0.4.2.
There may be an argument for keeping the CCM disabled on control plane nodes by default, but it's useful to be able to run it there, especially on small/test clusters.



CCM needs agent/worker nodes to create the automatic loadbalancer for a service/ingress

I suppose the easiest way to reproduce, would be to create a single node cluster with K3s and make it a master. Apply the generated ccm-linode.yaml following the howto and then try to LoadBalance a service. It will fail. Add an agent node to the cluster and then it will succeed.

If I'm correct about this, it is limiting our options.
For example, I have set up an HA K3s cluster for Rancher and plan to use it to create more clusters. The 3 control plane nodes should be able to do the whole job; there should be no need to add agent nodes, because you would also need at least 2.

I thought I'd just reach out and see if this is something I'm doing wrong and there is a label, or an annotation I'm missing, that might solve my problem.

Creating a LoadBalancer using the CCM

I created a custom RKE2 cluster on Linode using Rancher. I then had to manually install the CCM driver as explained here: https://www.linode.com/docs/guides/install-the-linode-ccm-on-unmanaged-kubernetes.

Once that was all configured, I tried installing traefik which deploys a LoadBalancer, however, I get the following error in the CCM:

Error syncing load balancer: failed to ensure load balancer: [400]
[configs[0].nodes[0].address] Must be in address:port format;
[configs[0].nodes[1].address] Must be in address:port format;
[configs[0].nodes[2].address] Must be in address:port format;
[configs[1].nodes[0].address] Must be in address:port format;
[configs[1].nodes[1].address] Must be in address:port format;
[configs[1].nodes[2].address] Must be in address:port format

Below is the generated service from the traefik helm chart

apiVersion: v1
kind: Service
metadata:
  name: traefik
  namespace: traefik
  uid: c2e99808-93ff-4fa2-a537-b61cb0abc352
  resourceVersion: '22974'
  creationTimestamp: '2024-01-03T04:09:48Z'
  labels:
    app.kubernetes.io/instance: traefik-traefik
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: traefik
    helm.sh/chart: traefik-25.0.0
  annotations:
    meta.helm.sh/release-name: traefik
    meta.helm.sh/release-namespace: traefik
  finalizers:
    - service.kubernetes.io/load-balancer-cleanup
  managedFields:
    - manager: helm
      operation: Update
      apiVersion: v1
      time: '2024-01-03T04:09:48Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:meta.helm.sh/release-name: {}
            f:meta.helm.sh/release-namespace: {}
          f:labels:
            .: {}
            f:app.kubernetes.io/instance: {}
            f:app.kubernetes.io/managed-by: {}
            f:app.kubernetes.io/name: {}
            f:helm.sh/chart: {}
        f:spec:
          f:allocateLoadBalancerNodePorts: {}
          f:externalTrafficPolicy: {}
          f:internalTrafficPolicy: {}
          f:ports:
            .: {}
            k:{"port":80,"protocol":"TCP"}:
              .: {}
              f:name: {}
              f:port: {}
              f:protocol: {}
              f:targetPort: {}
            k:{"port":443,"protocol":"TCP"}:
              .: {}
              f:name: {}
              f:port: {}
              f:protocol: {}
              f:targetPort: {}
          f:selector: {}
          f:sessionAffinity: {}
          f:type: {}
    - manager: linode-cloud-controller-manager-linux-amd64
      operation: Update
      apiVersion: v1
      time: '2024-01-03T04:09:48Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:finalizers:
            .: {}
            v:"service.kubernetes.io/load-balancer-cleanup": {}
      subresource: status
  selfLink: /api/v1/namespaces/traefik/services/traefik
status:
  loadBalancer: {}
spec:
  ports:
    - name: web
      protocol: TCP
      port: 80
      targetPort: web
      nodePort: 30328
    - name: websecure
      protocol: TCP
      port: 443
      targetPort: websecure
      nodePort: 32498
  selector:
    app.kubernetes.io/instance: traefik-traefik
    app.kubernetes.io/name: traefik
  clusterIP: 10.43.58.100
  clusterIPs:
    - 10.43.58.100
  type: LoadBalancer
  sessionAffinity: None
  externalTrafficPolicy: Cluster
  ipFamilies:
    - IPv4
  ipFamilyPolicy: SingleStack
  allocateLoadBalancerNodePorts: true
  internalTrafficPolicy: Cluster

I am not sure if this is a misconfiguration on my side or in Linode's API. Anything would help.

Replacing Terraform to Linode CAPI provider

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Feature Requests:

  • Have you explained your rationale for why this feature is needed?
  • Have you offered a proposed implementation/solution?

Bug Reporting

The Linode Terraform provider will be deprecated, if it hasn't been already. It would be nice to replace it with an up-to-date solution. The Linode CAPI provider would be a good choice. For more information, please follow the corresponding discussion here: linode/cluster-api-provider-linode#6

Expected Behavior

Replace cluster management scripts with Kubernetes Golang client calls:

return RunScript("create_cluster.sh", ApiToken, cluster, Image, k8s_version, region)

  • Start the CAPI provider with its dependencies.
  • Execute tests
  • Tear down test environment
  • Upload logs to github

Regarding CAPI provisioning, I think speed is the main factor, because we would like to test the project in CI. I think a full-featured Kubernetes environment is a bit overkill for this task, so my suggestion is spinning up a lightweight kcp control plane and the CAPI provider(s) via Docker Compose. Plain, old, boring but fast!



Update CCM deployment manifest to use Kustomize

The Linode blockstorage CSI driver's deployment manifest is generated using Kustomize. Currently, the CCM uses a shell script to generate the CCM manifest. It would be nice to update the CCM manifest to use Kustomize, to be in-line with the Linode CSI.

Here is an example of a kustomize plugin which allows dynamic generation of kustomize manifests, which may be helpful in providing some of the functionality of the current shell script, such as inserting tokens: https://github.com/asauber/kustomize-pubkey

CCM-Managed firewalls for Linodes

Recently, CCM-managed firewall support was added, although this only applies to NodeBalancers. We'd like to have this feature also added for managing firewalls for the Nodes themselves, likely based on Node annotations similar to the existing Service annotation support:

kind: Node
apiVersion: v1
metadata:
  annotations:
    node.alpha.kubernetes.io/linode-firewall-acl: |
      {
        "allowList": {
          "ipv4": ["8.8.8.8/32"],
          "ipv6": ["dead:beef::/64"]
        }
      }

This would be used by CAPL to request that Firewalls be created and configured for workload clusters (see linode/cluster-api-provider-linode#169, which reuses some of the CCM's firewall logic).

NodeBalancer rebuilds don't send node IDs, which causes the API to reload NodeBalancer configs when the nodes have not changed

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Feature Requests:

  • Have you explained your rationale for why this feature is needed?
  • Have you offered a proposed implementation/solution?

Bug Reporting

See title

Expected Behavior

The requests to nodebalancers/%d/configs/%d should contain the pre-existing node IDs; only new nodes should be sent without the id parameter.

Actual Behavior

All nodes are currently sent without an ID, so the API treats them as new nodes and deletes all the existing ones.
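
For illustration, a hedged sketch of the config rebuild request bodies; field names follow the public NodeBalancer API, while the addresses, labels, and IDs are placeholders and the exact payload the CCM builds may differ:

POST /v4/nodebalancers/{nodebalancerId}/configs/{configId}/rebuild

What the CCM currently sends (every backend looks new to the API):

{
  "nodes": [
    { "address": "192.168.0.10:80", "label": "node-a", "mode": "accept" },
    { "address": "192.168.0.11:80", "label": "node-b", "mode": "accept" }
  ]
}

What it should send (existing backends keep their id; only the new one omits it):

{
  "nodes": [
    { "id": 123456, "address": "192.168.0.10:80", "label": "node-a", "mode": "accept" },
    { "address": "192.168.0.11:80", "label": "node-b", "mode": "accept" }
  ]
}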

Steps to Reproduce the Problem

  1. Run the CCM with the LINODE_DEBUG environment variable set to 1 to see the requests made to the API.
  2. Create a NodeBalancer-backed service with one node.
  3. Add one more node to the node pool.
  4. Check the requests in the logs; you will see that the CCM issues a NodeBalancer config rebuild request in which the node IDs are missing.

Environment Specifications

Screenshots, Code Blocks, and Logs

Additional Notes


For general help or discussion, join the Kubernetes Slack team channel #linode. To sign up, use the Kubernetes Slack inviter.

The Linode Community is a great place to get additional support.

Associate NodeBalancers with Clusters Better

Two ideas, looking for more if they're out there...

  1. Add the cluster name to the NodeBalancer's name, e.g. ccm-cluster-name-1174efedab186000
  2. Auto-tag new NodeBalancers with the cluster name

Firewall annotation does not work with long service names

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Feature Requests:

  • Have you explained your rationale for why this feature is needed?
  • Have you offered a proposed implementation/solution?

Bug Reporting

When using the service.beta.kubernetes.io/linode-loadbalancer-firewall-acl annotation, if the service name is too long, firewall creation fails.

Expected Behavior

The service name should be able to be as long as needed in the k8s resource and truncated as needed by the CCM when creating Linode resources.

Actual Behavior

The Linode API rejects the requests coming from the CCM, and the firewall doesn't get created:

I0207 20:43:21.111439       1 loadbalancers.go:735] found NodeBalancer (000000) for service (ingress/banana-banana-banana-banana) via IPv4 (X.X.X.X)
E0207 20:43:21.494953       1 controller.go:307] error processing service ingress/ingress/banana-banana-banana-banana (will retry): failed to ensure load balancer: [400] [rules.inbound[0].label] Length must be 3-32 characters

Steps to Reproduce the Problem

  1. Create a k8s Service with metadata.name: banana-banana-banana-banana and the annotation service.beta.kubernetes.io/linode-loadbalancer-firewall-acl: { "allowList": { "ipv4": ["1.2.3.4/32"] }} (see the manifest sketch below).
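
A minimal manifest sketch of the reproduction; the selector and port are illustrative:

apiVersion: v1
kind: Service
metadata:
  name: banana-banana-banana-banana
  annotations:
    service.beta.kubernetes.io/linode-loadbalancer-firewall-acl: |
      { "allowList": { "ipv4": ["1.2.3.4/32"] } }
spec:
  type: LoadBalancer
  selector:
    app: banana          # illustrative selector
  ports:
    - port: 80
      targetPort: 80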

Environment Specifications

Screenshots, Code Blocks, and Logs

Additional Notes

@schinmai-akamai said it's a bug :)


For general help or discussion, join the Kubernetes Slack team channel #linode. To sign up, use the Kubernetes Slack inviter.

The Linode Community is a great place to get additional support.

Fix Update/Delete Usage of Firewall Association

The APIs to update or remove a firewall on a NodeBalancer do not currently exist. When things like TPT-2535 are completed, we can update the logic in the CCM to catch annotation changes and update or delete the firewall association.

Revoke kubeconfig credentials for an LKE Cluster

Bug Reporting

Currently, it's not possible to revoke and regenerate the kubeconfig.yaml file for an LKE Cluster.

Expected Behavior

If a kubeconfig credential is suspected to have been leaked, you should be able to revoke it to ensure cluster security.

Actual Behavior

There is no easy way to revoke a kubeconfig credential for a default LKE Cluster.

CCM does not properly detect nodes that are powered off

Bug Reporting

CCM does not properly detect nodes that are powered off.

Expected Behavior

On shutdown of a Kubernetes node, the CCM detects that it is powered off and migrates workloads off of the node.

Actual Behavior

The node status becomes NotReady after a few minutes because kubelet stops responding, but pods still have a status of Running and do not get rescheduled.

Steps to Reproduce the Problem

  1. Shut down a Kubernetes node on a cluster running the CCM, wait 5 minutes
  2. Observe node status with kubectl get nodes, note that the down node has a status of NotReady
  3. Observe pod status with kubectl get pods -A, note that pods on the node are not rescheduled

Add --authentication-skip-lookup=true flag

The example deployment manifest for the Linode CCM uses the cluster-admin role. When deploying the CCM with a more limited ClusterRole, the CCM fails to start if it doesn't have permission to get configmaps:

W0222 19:11:17.239825       1 requestheader_controller.go:193] Unable to get configmap/extension-apiserver-authentication in kube-system.  Usually fixed by 'kubectl create rolebinding -n kube-system ROLEBINDING_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'
unable to load configmap based request-header-client-ca-file: configmaps "extension-apiserver-authentication" is forbidden: User "system:serviceaccount:kube-system:default" cannot get resource "configmaps" in API group "" in the namespace "kube-system"

This seems to be related to an upstream issue; see kubernetes/cloud-provider#34.

It looks like DigitalOcean ran into a similar problem after updating their K8s dependencies (digitalocean/digitalocean-cloud-controller-manager#217). Possible short-term solutions could be to add the --authentication-skip-lookup=true flag to the example deployment manifest, or to hardcode this option in the binary.
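
For reference, a hedged sketch of what the flag change could look like in the example manifest; the container name, image, and surrounding args are placeholders:

      containers:
        - name: ccm-linode                                # placeholder name
          image: linode/linode-cloud-controller-manager   # placeholder image
          args:
            # ...existing args...
            - --authentication-skip-lookup=true   # skip the extension-apiserver-authentication ConfigMap lookup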

Nodes lose Linode metadata after a reboot

Moving this Issue from the CSI repo.

linode/linode-blockstorage-csi-driver#34

Report from @wideareashb

After an OOM problem on one of our Linodes, CSI has stopped working and the Linode is not properly part of the cluster anymore.

The CoreOS logs for the node with the problem have a few of these:

systemd-networkd[604]: eth0: Could not set NDisc route or address: Connection timed out

then there is an OOM

and then more messages like the above. There are also a number of problems which seem to be caused by the OOM event.

Since then we have seen a number of problems:

  • the node did not appear in 'kubectl get nodes'
  • after a reboot the node is no longer properly recognised by Kubernetes:

e.g.
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
node-2 Ready 157d v1.13.0 192.168.145.28 213.xxx.xxx.xxx Container Linux by CoreOS 2135.5.0 (Rhyolite) 4.19.50-coreos-r1 docker://18.6.3
node-3 Ready 5h41m v1.13.0 Container Linux by CoreOS 2135.5.0 (Rhyolite) 4.19.50-coreos-r1 docker://18.6.3

Note:

  • the age should be 157d

  • it has no internal or external IP address

  • If you describe the node:

  • the node's properties, like 'providerID', were missing (we have tried adding them back in);

  • the node was descheduled

  • the node had no internal or external IP address

  • after fiddling with annotations, the node did get pods scheduled, but the Linode CSI is upset

Aug 01 09:12:03 node-3 kubelet[687]: , failed to "StartContainer" for "csi-linode-plugin" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=csi-linode-plugin pod=csi
Aug 01 09:12:03 node-3 kubelet[687]: ]
Aug 01 09:12:18 node-3 kubelet[687]: E0801 09:12:18.716293 687 pod_workers.go:190] Error syncing pod c55016b7-b439-11e9-a66e-f23c914badbb ("csi-linode-node-4rqz6_kube-system(c55>
Aug 01 09:12:18 node-3 kubelet[687]: , failed to "StartContainer" for "csi-linode-plugin" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=csi-linode-plugin pod=csi>
Aug 01 09:12:18 node-3 kubelet[687]: ]

  • in the logs of the linode-csi-plugin container in the main CSI csi-linode-controller-0 pod, we are seeing:

BODY:
{
    "errors": [
        {
            "reason": "Invalid OAuth Token"
        }
    ]
}

I think this has little to do with CSI; after I restarted the Linode it lost the traits it needs to work. For example, if I edit this node I get something like the below:

metadata:
  annotations:
    node.alpha.kubernetes.io/ttl: "0"
    projectcalico.org/IPv4Address: 192.168.xxx.yyy/17
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  creationTimestamp: "2019-08-01T15:28:39Z"
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/hostname: node-3
  name: widearea-live-node-3
  resourceVersion: "30261967"
  selfLink: /api/v1/nodes/node-3
  uid: 097dd8b3-xxxxxx
spec:
  podCIDR: 10.244.5.0/24
  taints:
    - effect: NoSchedule
      key: node.cloudprovider.kubernetes.io/uninitialized
      value: "true"

but for a working one I get:

metadata:
  annotations:
    csi.volume.kubernetes.io/nodeid: '{"linodebs.csi.linode.com":"1234567"}'
    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
    node.alpha.kubernetes.io/ttl: "0"
    projectcalico.org/IPv4Address: 192.168.xxx.yyyy/17
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  creationTimestamp: "2019-02-24T19:26:35Z"
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/instance-type: g6-standard-4
    beta.kubernetes.io/os: linux
    failure-domain.beta.kubernetes.io/region: eu-west
    kubernetes.io/hostname: node-2
    topology.linode.com/region: eu-west
  name: node-2
  resourceVersion: "30262732"
  selfLink: /api/v1/nodes/node-2
  uid: 19841cc5-xxxxxxxx
spec:
  podCIDR: 10.244.1.0/24
  providerID: linode://1234567
status:
  addresses:
    - address: live-node-2
      type: Hostname
    - address: 213.x.y.z
      type: ExternalIP
    - address: 192.168.aaa.bbbb
      type: InternalIP
  allocatable:
    attachable-volumes-csi-linodebs.csi.linode.com: "8"

Our problem node has lost all its Linode traits.

I am looking into this and attempting to reproduce.
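
For reference, a hedged illustration of manually restoring a missing provider ID (the node name and Linode ID are placeholders, and spec.providerID can only be set while it is still empty):

kubectl patch node node-3 --type=merge -p '{"spec":{"providerID":"linode://1234567"}}'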

NodeBalancer not updated when adding new nodes to the Kubernetes cluster

General:

  • Have you removed all sensitive information, including but not limited to access keys and passwords?
  • Have you checked to ensure there aren't other open or closed Pull Requests for the same bug/feature/question?

Bug Reporting

Expected Behavior

When adding or removing nodes in a cluster, the existing NodeBalancer's backends should be updated to reflect the change.

Actual Behavior

NodeBalancers will only ever point to the nodes that were present when the k8s LoadBalancer was created.

Steps to Reproduce the Problem

  1. Provision a cluster (we're using terraform-linode-k8s)
  2. Create a Service with the LoadBalancer type. Our specific use case is nginx-ingress (helm install stable/nginx-ingress)
  3. Once the Nodebalancer is provisioned, add or remove some nodes from the k8s cluster.

Environment Specifications

  • Kubernetes environment provisioned with terraform-linode-k8s
  • g6-standard-4 nodes
  • Increasing the node count from 4 to 8 caused the issue this time; in the past I've encountered it while removing nodes.

Screenshots, Code Blocks, and Logs

kubectl describe service nginx-ingress spits out error events:

Events:
  Type     Reason                    Age    From                Message
  ----     ------                    ----   ----                -------
  Warning  LoadBalancerUpdateFailed  74s    service-controller  Error updating load balancer with new hosts map[prod-ap-northeast-node-7:{} prod-ap-northeast-node-8:{} prod-ap-northeast-node-5:{} prod-ap-northeast-node-6:{} prod-ap-northeast-node-4:{} prod-ap-northeast-node-1:{} prod-ap-northeast-node-3:{} prod-ap-northeast-node-2:{}]: [400] [X-Filter] Cannot filter on nodebalancer_id

Additional Notes


For general help or discussion, join the Kubernetes Slack team channel #linode. To sign up, use the Kubernetes Slack inviter.

The Linode Community is a great place to get additional support.
