volume-expander-operator's Introduction

Volume Expander Operator

The purpose of the volume-expander-operator is to expand volumes when they are running out of space. This is achieved by using the Kubernetes volume expansion feature.

The operator periodically checks the kubelet_volume_stats_used_bytes and kubelet_volume_stats_capacity_bytes metrics published by the kubelets to decide when to expand a volume. Note that these metrics are generated only while a volume is mounted to a pod. Also, the kubelet takes a minute or two to start reporting accurate values for these metrics; the operator accounts for that delay.
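
As a rough illustration of the check described above, the following Python sketch compares the two kubelet metrics against a threshold. The function name, the 80 percent default (borrowed from the expand-threshold-percent annotation), and the greater-than-or-equal comparison are assumptions made for illustration; the operator's actual implementation may differ.

```python
def should_expand(used_bytes: int, capacity_bytes: int, threshold_percent: int = 80) -> bool:
    """Return True when usage has reached the threshold (hypothetical helper).

    used_bytes and capacity_bytes stand in for the values of
    kubelet_volume_stats_used_bytes and kubelet_volume_stats_capacity_bytes.
    """
    if capacity_bytes <= 0:
        # No usable capacity reading yet (for example, the volume is not mounted).
        return False
    # Integer arithmetic avoids floating-point rounding at the boundary.
    # Assumption: ">=" is used here; the operator may use a strict comparison.
    return used_bytes * 100 >= threshold_percent * capacity_bytes


print(should_expand(85, 100))  # usage at 85% of capacity
print(should_expand(50, 100))  # usage at 50% of capacity
```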

This operator works based on the following annotations to PersistentVolumeClaim resources:

| Annotation | Default | Description |
| --- | --- | --- |
| volume-expander-operator.redhat-cop.io/autoexpand | N/A | If set to "true", enables the volume-expander-operator to watch this PVC. |
| volume-expander-operator.redhat-cop.io/polling-frequency | "30s" | How frequently to poll the volume metrics. Express this value as a valid golang Duration. |
| volume-expander-operator.redhat-cop.io/expand-threshold-percent | "80" | The percentage of used storage after which the volume will be expanded. This must be a positive integer. |
| volume-expander-operator.redhat-cop.io/expand-by-percent | "25" | The percentage by which the volume will be expanded, relative to the current size. This must be an integer between 0 and 100. |
| volume-expander-operator.redhat-cop.io/expand-up-to | MaxInt64 | The upper bound for this volume to be expanded to. The default value is the largest quantity representable and is intended to be interpreted as infinite. If the default is used, it is recommended to ensure the namespace has a quota on the used storage class. |
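
To make the interaction between expand-by-percent and expand-up-to concrete, here is a minimal Python sketch of the size calculation. The helper name, the byte-based arguments, and the round-down behavior are assumptions; the source only states that growth is relative to the current size and bounded by expand-up-to.

```python
MAX_INT64 = 2**63 - 1  # stands in for the documented "MaxInt64" default
GIB = 2**30


def next_size_bytes(current_bytes: int,
                    expand_by_percent: int = 25,
                    expand_up_to_bytes: int = MAX_INT64) -> int:
    """Grow current_bytes by expand_by_percent, capped at expand_up_to_bytes."""
    if not 0 <= expand_by_percent <= 100:
        raise ValueError("expand_by_percent must be between 0 and 100")
    # Assumption: round down to whole bytes; the operator's rounding may differ.
    grown = current_bytes + current_bytes * expand_by_percent // 100
    return min(grown, expand_up_to_bytes)


# A 2 GiB volume grown by the default 25%.
print(next_size_bytes(2 * GIB) / GIB)
# An 8 GiB volume whose growth is capped by a 9 GiB upper bound.
print(next_size_bytes(8 * GIB, expand_up_to_bytes=9 * GIB) / GIB)
```

Under these assumptions, the upper bound only takes effect once the grown size would exceed it, which is why a namespace quota is still advisable when the default bound is used.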

Note that not all storage driver implementations support volume expansion. It is the responsibility of the user or platform administrator to ensure that the storage class and the persistent volume claim meet all the requirements for the volume expansion feature to work properly.
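
One of those requirements is that the StorageClass sets the Kubernetes field allowVolumeExpansion to true. The sketch below checks that field on a manifest that has already been loaded into a dictionary. The helper name and the example manifest values are hypothetical, and the check covers only this single prerequisite, not driver support.

```python
def allows_expansion(storage_class: dict) -> bool:
    """True only if the StorageClass manifest sets allowVolumeExpansion: true."""
    return storage_class.get("allowVolumeExpansion") is True


# Hypothetical manifest, e.g. parsed from `oc get storageclass <name> -o json`.
example_storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "example-sc"},        # hypothetical name
    "provisioner": "example.com/provisioner",  # hypothetical provisioner
    "allowVolumeExpansion": True,
}

print(allows_expansion(example_storage_class))
```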

This operator was tested with OCS, but should work with any other storage driver that supports volume expansion.

Deploying the Operator

This is a cluster-level operator that you can deploy in any namespace; volume-expander-operator is the recommended one.

It is recommended to deploy this operator via OperatorHub, but you can also deploy it using Helm.

Multiarch Support

Arch Support
amd64
arm64
ppc64le
s390x

Deploying from OperatorHub

Note: This operator supports being installed in disconnected environments.

If you want to utilize the Operator Lifecycle Manager (OLM) to install this operator, you can do so in two ways: from the UI or the CLI.

Deploying from OperatorHub UI

  • If you would like to launch this operator from the UI, you'll need to navigate to the OperatorHub tab in the console. Before starting, make sure you've created the namespace that you want to install this operator into with the following:
oc new-project volume-expander-operator
  • Once there, you can search for this operator by name: volume expander operator. This will then return an item for our operator and you can select it to get started. Once you've arrived here, you'll be presented with an option to install, which will begin the process.
  • After clicking the install button, you can then select the namespace that you would like to install this to as well as the installation strategy you would like to proceed with (Automatic or Manual).
  • Once you've made your selection, you can select Subscribe and the installation will begin. After a few moments you can go ahead and check your namespace and you should see the operator running.

Deploying from OperatorHub using CLI

If you'd like to launch this operator from the command line, you can use the manifests contained in this repository by running the following:

oc new-project volume-expander-operator

oc apply -f config/operatorhub -n volume-expander-operator

This will create the appropriate OperatorGroup and Subscription and will trigger OLM to launch the operator in the specified namespace.

Deploying with Helm

Here are the instructions to install the latest release with Helm.

oc new-project volume-expander-operator
helm repo add volume-expander-operator https://redhat-cop.github.io/volume-expander-operator
helm repo update
helm install volume-expander-operator volume-expander-operator/volume-expander-operator

This can later be updated with the following commands:

helm repo update
helm upgrade volume-expander-operator volume-expander-operator/volume-expander-operator

Metrics

Prometheus compatible metrics are exposed by the Operator and can be integrated into OpenShift's default cluster monitoring. To enable OpenShift cluster monitoring, label the namespace the operator is deployed in with the label openshift.io/cluster-monitoring="true".

oc label namespace <namespace> openshift.io/cluster-monitoring="true"

Testing metrics

export operatorNamespace=volume-expander-operator-local # or volume-expander-operator
oc label namespace ${operatorNamespace} openshift.io/cluster-monitoring="true"
oc rsh -n openshift-monitoring -c prometheus prometheus-k8s-0 /bin/bash
export operatorNamespace=volume-expander-operator-local # or volume-expander-operator
curl -v -s -k -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://volume-expander-operator-controller-manager-metrics.${operatorNamespace}.svc.cluster.local:8443/metrics
exit

Development

Running the operator locally

Note: this operator's build process is tested with podman, but some of the build files (the Makefile specifically) use docker because they are generated automatically by operator-sdk. It is recommended to remap the docker command to the podman command.

export repo=raffaelespazzoli
docker login quay.io/$repo
oc new-project volume-expander-operator
oc project volume-expander-operator
tilt up

Test helm chart locally

Define an image and tag. For example...

export imageRepository="quay.io/redhat-cop/volume-expander-operator"
export imageTag="$(git -c 'versionsort.suffix=-' ls-remote --exit-code --refs --sort='version:refname' --tags https://github.com/redhat-cop/volume-expander-operator.git '*.*.*' | tail --lines=1 | cut --delimiter='/' --fields=3)"

Deploy chart...

make helmchart IMG=${imageRepository} VERSION=${imageTag}
helm upgrade -i volume-expander-operator-local charts/volume-expander-operator -n volume-expander-operator-local --create-namespace

Delete...

helm delete volume-expander-operator-local -n volume-expander-operator-local

Building/Pushing the operator image

export repo=raffaelespazzoli #replace with yours
docker login quay.io/$repo
make docker-build IMG=quay.io/$repo/volume-expander-operator:latest
make docker-push IMG=quay.io/$repo/volume-expander-operator:latest

Deploy to OLM via bundle

make manifests
make bundle IMG=quay.io/$repo/volume-expander-operator:latest
operator-sdk bundle validate ./bundle --select-optional name=operatorhub
make bundle-build BUNDLE_IMG=quay.io/$repo/volume-expander-operator-controller-bundle:latest
docker push quay.io/$repo/volume-expander-operator-controller-bundle:latest
operator-sdk bundle validate quay.io/$repo/volume-expander-operator-controller-bundle:latest --select-optional name=operatorhub
oc new-project volume-expander-operator
oc label namespace volume-expander-operator openshift.io/cluster-monitoring="true"
operator-sdk cleanup volume-expander-operator -n volume-expander-operator
operator-sdk run bundle --install-mode AllNamespaces -n volume-expander-operator quay.io/$repo/volume-expander-operator-controller-bundle:latest

Testing

Manual tests

oc new-project volume-expander-operator-test
oc apply -f ./test/volume.yaml -n volume-expander-operator-test
oc apply -f ./test/deployment.yaml -n volume-expander-operator-test

Releasing

git tag -a "<tagname>" -m "<commit message>"
git push upstream <tagname>

If you need to remove a release:

git tag -d <tagname>
git push upstream --delete <tagname>

If you need to "move" a release to the current main:

git tag -f <tagname>
git push upstream -f <tagname>

Cleaning up

operator-sdk cleanup volume-expander-operator -n volume-expander-operator
oc delete operatorgroup operator-sdk-og
oc delete catalogsource volume-expander-operator-catalog

volume-expander-operator's People

Contributors

cnuland, davgordo, garethahealy, iamtakingiteasy, raffaelespazzoli, renovate[bot], sabre1041, trevorbox

volume-expander-operator's Issues

JSON Logging for volume-expander-operator

We have installed the volume expander operator but the logging format is not in JSON and for troubleshooting purposes logging should be in standard JSON format. Below are the sample logs generated for volume expander operator.

2022-02-07T11:42:51.284Z INFO controller-runtime.manager.controller.persistentvolumeclaim Starting workers {"reconciler group": "", "reconciler kind": "PersistentVolumeClaim", "worker count": 1}

broken upgrade

0.3.5 can't pull its controller image:

Failed to pull image "quay.io/redhat-cop/volume-expander-operator@sha256:13c0ef37e384beb429a43bb814712416f5ab1b5cc70c96c78d44fa1c2c90a321": rpc error: code = Unknown desc = reading manifest sha256:13c0ef37e384beb429a43bb814712416f5ab1b5cc70c96c78d44fa1c2c90a321 in quay.io/redhat-cop/volume-expander-operator: manifest unknown: manifest unknown

because it does not exist:

docker pull quay.io/redhat-cop/volume-expander-operator@sha256:13c0ef37e384beb429a43bb814712416f5ab1b5cc70c96c78d44fa1c2c90a321
Error response from daemon: manifest for quay.io/redhat-cop/volume-expander-operator@sha256:13c0ef37e384beb429a43bb814712416f5ab1b5cc70c96c78d44fa1c2c90a321 not found: manifest unknown: manifest unknown

Add a controller for Ceph storage

Hello,

We're currently using Rook ceph for Kubernetes storage.
I experimented with this controller and it works like magic! Except that, in my case, the Ceph OSD PVCs use the Block volumeMode.
Thus, the kubelet_volume_stats_used_bytes and kubelet_volume_stats_capacity_bytes metrics for OSD PVCs are unavailable in Prometheus.

The idea is to:

  • add specific annotations for Ceph PVCs or the cluster definition manifest (CephCluster CRD)
  • use ceph metrics instead: ceph_cluster_total_bytes and ceph_cluster_total_used_bytes
  • To be decided: update the PVC or CephCluster CRD (safer)

I'd be more than happy to contribute to the implementation of the idea if it is worth experimenting with.

++

many CVEs

The latest release is quite old, and scanning the image with RHACS shows a number of vulnerabilities.
Could Dependabot be added, some dependencies upgraded, and a new release pushed?

Incorrect Server Name in Service Monitor of Volume Expander Operator

I am trying to launch the volume expander operator on OpenShift 4.8.28 from the command line with the commands below.
oc new-project stakater-volume-expander-operator
oc apply -f config/operatorhub -n stakater-volume-expander-operator

The above commands create the CSV, operator group, pods, service, service monitor, etc. However, the created service monitor contains an incorrect server name, so Prometheus is unable to collect metrics and reports the error: x509 certificate is valid for "volume-expander-operator-controller-manager-metrics.stakater-volume-expander-operator.svc" not valid for "volume-expander-operator-controller-manager-metrics.volume-expander-operator.svc". Below is the ServiceMonitor created after launching the operator.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  creationTimestamp: "2022-02-07T09:07:30Z"
  generation: 1
  labels:
    operator: volume-expander-operator
  name: volume-expander-operator-controller-manager-metrics-monitor
  namespace: stakater-volume-expander-operator
  ownerReferences:
  - apiVersion: operators.coreos.com/v1alpha1
    blockOwnerDeletion: false
    controller: false
    kind: ClusterServiceVersion
    name: volume-expander-operator.v0.3.2
    uid: ca3e47f1-807d-4872-9be8-915a1f8e7375
  resourceVersion: "64293388"
  uid: 08891dee-fbeb-48a3-88a6-171962dfeed7
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 30s
    port: https
    scheme: https
    tlsConfig:
      caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
      serverName: volume-expander-operator-controller-manager-metrics.volume-expander-operator.svc
  selector:
    matchLabels:
      operator: volume-expander-operator

Below is the serverName created after installing the operator:
serverName: volume-expander-operator-controller-manager-metrics.volume-expander-operator.svc

However, the serverName in the ServiceMonitor should be volume-expander-operator-controller-manager-metrics.stakater-volume-expander-operator.svc so that Prometheus can collect the metrics.
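
The expected value follows the in-cluster Service DNS pattern <service>.<namespace>.svc, which matches the name the certificate is reported as valid for. A small Python sketch of that derivation is shown here; the helper name is hypothetical and this is not how the operator or OLM renders the ServiceMonitor.

```python
def metrics_server_name(service: str, namespace: str) -> str:
    """Build the TLS server name for a Service in a given namespace (hypothetical helper)."""
    return f"{service}.{namespace}.svc"


service = "volume-expander-operator-controller-manager-metrics"
# The namespace the operator was actually installed into in this report.
print(metrics_server_name(service, "stakater-volume-expander-operator"))
```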

Auto expansion worked earlier then facing error "volume-expander-operator unexpected value"

Hi Team,

We are trying the auto-expansion feature in our OpenShift environment through the volume expander operator, and we are getting an error like the one below for some of the PVCs. Yesterday, expansion worked fine for one of the PVCs we created, but today we are getting this error, and expansion did not take place automatically for quite some time. Later, the expansion happened automatically, but the "unexpected error" event keeps repeating.

We checked the volume expander operator logs but did not find much there to investigate further.

image

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: 'yes'
    pv.kubernetes.io/bound-by-controller: 'yes'
    volume-expander-operator.redhat-cop.io/autoexpand: 'true'
    volume-expander-operator.redhat-cop.io/expand-threshold-percent: '85'
    volume-expander-operator.redhat-cop.io/expand-up-to: 5Gi
    volume-expander-operator.redhat-cop.io/polling-frequency: 5m
    volume.beta.kubernetes.io/storage-provisioner: openshift-storage.cephfs.csi.ceph.com
  selfLink: >-
    /api/v1/namespaces/test13/persistentvolumeclaims/test1360344
  resourceVersion: '1037639301'
  name: test1360344
  uid: 93549f62-1932-48db-97e0-04dfe0a6b081
  creationTimestamp: '2022-03-31T09:09:21Z'
  managedFields:
    - manager: kube-controller-manager
      operation: Update
      apiVersion: v1
      time: '2022-03-31T09:09:21Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            'f:pv.kubernetes.io/bind-completed': {}
            'f:pv.kubernetes.io/bound-by-controller': {}
            'f:volume.beta.kubernetes.io/storage-provisioner': {}
        'f:spec':
          'f:volumeName': {}
        'f:status':
          'f:accessModes': {}
          'f:capacity': {}
          'f:phase': {}
    - manager: okhttp
      operation: Update
      apiVersion: v1
      time: '2022-03-31T09:09:21Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            .: {}
            'f:volume-expander-operator.redhat-cop.io/autoexpand': {}
            'f:volume-expander-operator.redhat-cop.io/expand-up-to': {}
            'f:volume-expander-operator.redhat-cop.io/polling-frequency': {}
        'f:spec':
          'f:accessModes': {}
          'f:resources':
            'f:requests': {}
          'f:storageClassName': {}
          'f:volumeMode': {}
    - manager: csi-resizer
      operation: Update
      apiVersion: v1
      time: '2022-03-31T09:14:21Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:status':
          'f:capacity':
            'f:storage': {}
    - manager: manager
      operation: Update
      apiVersion: v1
      time: '2022-03-31T09:14:21Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:spec':
          'f:resources':
            'f:requests':
              'f:storage': {}
    - manager: Mozilla
      operation: Update
      apiVersion: v1
      time: '2022-04-01T08:18:35Z'
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:annotations':
            'f:volume-expander-operator.redhat-cop.io/expand-threshold-percent': {}
  namespace: test13-intel
  finalizers:
    - kubernetes.io/pvc-protection
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2560Mi
  volumeName: pvc-93549f62-1932-48db-97e0-04dfe0a6b081
  storageClassName: ocs-storagecluster-cephfs
  volumeMode: Filesystem
status:
  phase: Bound
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 3Gi

Thanks,
Fayaz

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Rate-Limited

These updates are currently rate-limited. Click on a checkbox below to force their creation now.

  • Update module github.com/onsi/ginkgo to v2

Edited/Blocked

These updates have been manually edited so Renovate will no longer make changes. To discard all commits and start over, click on a checkbox.

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Detected dependencies

dockerfile
  Dockerfile
    • golang 1.16
  ci.Dockerfile
github-actions
  .github/workflows/pr.yaml
    • redhat-cop/github-workflows-operators v1.0.4
  .github/workflows/push.yaml
    • redhat-cop/github-workflows-operators v1.0.4
gomod
  go.mod
    • go 1.16
    • github.com/go-logr/logr v0.4.0
    • github.com/json-iterator/go v1.1.11
    • github.com/onsi/ginkgo v1.16.4
    • github.com/onsi/gomega v1.13.0
    • github.com/prometheus/client_golang v1.11.0
    • github.com/prometheus/common v0.26.0
    • k8s.io/api v0.21.2
    • k8s.io/apimachinery v0.21.2
    • k8s.io/client-go v0.21.2
    • sigs.k8s.io/controller-runtime v0.9.2
kustomize
  config/manager/kustomization.yaml

  • Check this box to trigger a request for Renovate to run again on this repository
