kvaps / kube-linstor
Containerized LINSTOR SDS for Kubernetes, ready for production use.
License: Apache License 2.0
See kubernetes/kubernetes#60596 (comment) for more details
Hello!
I'm monitoring my LINSTOR deployment with Prometheus. To do so, I need to label my controller service; so far I'm using the following command: kubectl -n linstor label svc linstor-controller app=linstor-controller-metrics.
It would be great if this could be done the same way it is already possible to add annotations (.Values.controller.service.annotations).
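A values fragment along these lines could express the request; the `labels` key under `controller.service` is hypothetical here, simply mirroring the existing `annotations` key:

```yaml
controller:
  service:
    annotations: {}                     # already supported via .Values.controller.service.annotations
    labels:                             # hypothetical new key, applied to the Service metadata
      app: linstor-controller-metrics
```

With something like this, the manual `kubectl label` step would no longer be needed after each deploy.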
Hi!
I've just started playing with k8s and DRBD, so maybe this is only a stupid mistake, but I'm at the end of my knowledge.
I have a 3-node microk8s cluster based on Ubuntu Server 20.04.2 (microk8s is installed via snap, Kubernetes version 1.21.1).
On this cluster I deployed kube-linstor v1.13.0-1 as described in README.md.
Starting out I had some problems because microk8s doesn't assign node roles, so I added "node-role.kubernetes.io/master: """ to the nodes by hand. After that the installation worked fine.
My problem is:
I created a simple test deployment, but when I stop a node the pod gets stuck terminating (only forced termination by hand helps) and the creation of the new pod on the other node hangs with "multiple mount...".
So I searched for the reason, and it seems like the pod mounts a local folder instead of a DRBD device. I opened a bash in the pod and ran "touch /mnt1/harry_dont_understand.txt"; afterwards I mounted the DRBD volume and searched for this file. The file isn't found on /dev/drbd1002; instead it is found at "/var/snap/microk8s/common/var/lib/kubelet/pods/f6eb8152-7452-42b4-bdd5-20d147bd2982/volumes/kubernetes.io~csi/pvc-98b9f72a-eaef-4d5b-b104-f61ded8e3fa0/mount/harry_dont_understand.txt".
I have no idea how to solve this problem.
My YAML file for kubectl:
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-test
parameters:
  autoPlace: "3"
  storagePool: pool_k8s
provisioner: linstor.csi.linbit.com
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-pvc
  namespace: test
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: linstor-test
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: test
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1 # tells deployment to run 1 pod matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
          volumeMounts:
            - name: test-volume
              mountPath: /mnt1
      volumes:
        - name: test-volume
          persistentVolumeClaim:
            claimName: "test-pvc"
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - nginx
              topologyKey: "kubernetes.io/hostname"
linstor says:
LINSTOR ==> v l
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ Resource ┊ StoragePool ┊ VolNr ┊ MinorNr ┊ DeviceName ┊ Allocated ┊ InUse ┊ State ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ k8s1 ┊ pvc-5e76f38c-9791-46e7-b469-ecc9922149c7 ┊ pool_k8s ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 1.00 GiB ┊ InUse ┊ UpToDate ┊
┊ k8s2 ┊ pvc-5e76f38c-9791-46e7-b469-ecc9922149c7 ┊ pool_k8s ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 1.00 GiB ┊ Unused ┊ UpToDate ┊
┊ k8s3 ┊ pvc-5e76f38c-9791-46e7-b469-ecc9922149c7 ┊ pool_k8s ┊ 0 ┊ 1001 ┊ /dev/drbd1001 ┊ 1.00 GiB ┊ Unused ┊ UpToDate ┊
┊ k8s1 ┊ pvc-98b9f72a-eaef-4d5b-b104-f61ded8e3fa0 ┊ pool_k8s ┊ 0 ┊ 1002 ┊ /dev/drbd1002 ┊ 1.00 GiB ┊ InUse ┊ UpToDate ┊
┊ k8s2 ┊ pvc-98b9f72a-eaef-4d5b-b104-f61ded8e3fa0 ┊ pool_k8s ┊ 0 ┊ 1002 ┊ /dev/drbd1002 ┊ 1.00 GiB ┊ Unused ┊ UpToDate ┊
┊ k8s3 ┊ pvc-98b9f72a-eaef-4d5b-b104-f61ded8e3fa0 ┊ pool_k8s ┊ 0 ┊ 1002 ┊ /dev/drbd1002 ┊ 1.00 GiB ┊ InUse ┊ UpToDate ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
LINSTOR ==>
The rest is the same StorageClass, PVC, and Deployment as above (for claimName I have also tried without quotes).
I hope someone can help me.
best regards
Harry
Is it possible to use this Helm chart with an existing external Linstor controller/cluster? So not as a usual k8s hyper-converged configuration, but via diskless attachments. Or maybe even installing just Satellites on k8s nodes.
According to Linbit's Docs you can do that with Piraeus (https://linbit.com/drbd-user-guide/linstor-guide-1_0-en/#s-kubernetes-deploy-external-controller), what about kube-linstor?
I don't know where my problem is. I'm not able to get a snapshot from my LINSTOR storage (Kubernetes 1.16). Installation went well and there are no errors in the logs. In the controller pod a CSI snapshotter is started (1.2.2). No errors at all.
kind: VolumeSnapshotClass
apiVersion: snapshot.storage.k8s.io/v1alpha1
metadata:
  name: linstor-snapshot-class
  namespace: linstor
snapshotter: io.drbd.linstor-csi
---
apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshot
metadata:
  name: linstor-snapshot
  namespace: postgresql
spec:
  snapshotClassName: linstor-snapshot-class
  source:
    name: pvc-995046d5-2308-43ae-a658-11cfea3a0d63
    kind: PersistentVolumeClaim
If I create the above resources, a snapshot should be created, right?
If I log into the LINSTOR console and list the snapshots, there is none.
I tried different namespaces and a lot of other combinations for the kubectl code, but I can't get the snapshot created.
I can still create a snapshot of the PVC from the LINSTOR console, but that does not help me at the moment.
Any ideas? :)
Hello,
Bumping Kubernetes (and the Stork scheduler) to 1.21 makes linstor-stork-scheduler fail to schedule pods (some pods remain in the pending state indefinitely); it complains of missing authorizations:
E0410 22:26:05.985354 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: csistoragecapacities.storage.k8s.io is forbidden: User "system:serviceaccount:linstor:linstor-stork-scheduler" cannot list resource "csistoragecapacities" in API group "storage.k8s.io" at the cluster scope
and
E0410 22:37:57.775468 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.CSIDriver: failed to list *v1.CSIDriver: csidrivers.storage.k8s.io is forbidden: User "system:serviceaccount:linstor:linstor-stork-scheduler" cannot list resource "csidrivers" in API group "storage.k8s.io" at the cluster scope
Merely adding csistoragecapacities and csidrivers to the existing storage.k8s.io rule group of the linstor-stork-scheduler cluster role makes it work again.
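The amended rule might be sketched like this (the resource names are taken from the error messages above; the pre-existing resources and verbs in the rule are assumptions):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: linstor-stork-scheduler
rules:
  - apiGroups: ["storage.k8s.io"]
    resources:
      - storageclasses         # assumed pre-existing entries
      - csinodes
      - csistoragecapacities   # added for Kubernetes 1.21
      - csidrivers             # added for Kubernetes 1.21
    verbs: ["get", "list", "watch"]
```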
Cheers!
Currently when you run
linstor node create node1 xxx.xxx.xxx.xxx
the following warning is shown:
Unsupported storage providers:
ZFS: 'cat /sys/module/zfs/version' returned with exit code 1
ZFS_THIN: 'cat /sys/module/zfs/version' returned with exit code 1
Great project!
I am deploying LINSTOR for the first time and I think I've almost reached the finish line, but now the satellite workload is being deployed onto all of my nodes. I only want to use the storage on my worker nodes, so I only configured those with storage, and now the satellite workload is failing on my master nodes.
Should it be running on the master nodes? And if not, how do I make it not run there?
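One way to keep the satellite DaemonSet off the masters is a node selector in the chart values; whether the chart exposes `satellite.nodeSelector` in exactly this form is an assumption, and the worker-role label must actually exist on your nodes:

```yaml
satellite:
  nodeSelector:
    node-role.kubernetes.io/worker: ""   # schedule satellites on worker nodes only
```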
Upgrading to Stork 2.8.2 by changing the version in the dockerfiles/linstor-stork/Dockerfile
gets things working. Should I provide a PR or do you explicitly not support newer versions?
If I'm on v1.1.2 and want to upgrade to a later version,
a) what are the system requirements for each version?
b) how do I upgrade without losing the current data/LVM configuration?
c) any release notes?
Hi @kvaps,
I'm looking to add the ability to connect to an etcd cluster.
At the moment, we can do:
controller:
  db:
    connectionUrl: "etcd://node1:2379,node2:2379"
But etcd is secured with TLS certs, so currently we can't make a secure connection with your Helm chart.
Before making a PR, I want to know if you would be OK with my solution.
Add 4 config values (and, by the way, allow setting the etcd prefix):
controller:
  db:
    tls: true
    cert: [A base64-encoded PEM format certificate]
    key: [A base64-encoded PEM format private key]
    ca: [A base64-encoded PEM format certificate authority]
    # optional
    etcdPrefix: "/LINSTOR/"
Create a kubernetes.io/tls secret db-tls.yaml:
{{- $fullName := include "linstor.fullname" . -}}
{{- if .Values.controller.enabled }}
{{- if .Values.controller.db.tls }}
---
apiVersion: v1
kind: Secret
metadata:
  name: {{ $fullName }}-db-tls
  annotations:
    "helm.sh/resource-policy": "keep"
    "helm.sh/hook": "pre-install"
    "helm.sh/hook-delete-policy": "before-hook-creation"
    "directives.qbec.io/update-policy": "never"
type: kubernetes.io/tls
data:
  tls.crt: {{ .Values.controller.db.cert }}
  tls.key: {{ .Values.controller.db.key }}
  ca.crt: {{ .Values.controller.db.ca }}
{{- end }}
{{- end }}
Modify _helpers.tpl to add references to the certs:
[...]
connection_url = "{{ .Values.controller.db.connectionUrl }}"
{{- if .Values.controller.db.tls }}
ca_certificate = "/config/ssl/db/ca.crt"
client_certificate = "/config/ssl/db/tls.crt"
client_key_pkcs8_pem = "/config/ssl/db/tls.key"
{{- end }}
{{- if .Values.controller.db.etcdPrefix }}
[db.etcd]
prefix = "{{ .Values.controller.db.etcdPrefix }}"
{{- end }}
[...]
Modify controller-deployment.yaml:
{{- $fullName := include "linstor.fullname" . -}}
{{- if .Values.controller.enabled }}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: {{ $fullName }}-controller
  name: {{ $fullName }}-controller
  namespace: {{ .Release.Namespace }}
[...]
      initContainers:
        - name: load-certs
          [...]
          command:
            [...]
            {{- end }}
            {{- if .Values.controller.db.tls }}
            cp -rf /tls/db /config/ssl/db
            {{- end }}
            rm -f "$tmp"
          volumeMounts:
            {{- if .Values.controller.db.tls }}
            - name: db-tls
              mountPath: /tls/db
            {{- end }}
            [...]
      volumes:
        {{- if .Values.controller.db.tls }}
        - name: db-tls
          secret:
            secretName: {{ $fullName }}-db-tls
        {{- end }}
        [...]
Hello,
I have a small cluster of 3 VMs configured with Stork enabled. I have deployments with only one replica for linstor-linstor-stork and linstor-linstor-stork-scheduler, both running on the master node. I took care to align the version of stork-scheduler with my K8s version (1.18.3; it would be nice to add a comment about this in kube-linstor/examples/linstor.yaml).
However, when running a pod that mounts a LINSTOR PVC, the pod is scheduled on a node that doesn't hold a replica.
In the logs for the stork-scheduler, I have plenty of:
E0612 15:15:38.963174 1 leaderelection.go:320] error retrieving resource lock kube-system/stork-scheduler: leases.coordination.k8s.io "stork-scheduler" is forbidden: User "system:serviceaccount:linstor:linstor-linstor-stork-scheduler" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-system"
I guess there is an authorization missing here...
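A sketch of the grant that appears to be missing, assuming leader election happens in kube-system as the error suggests (the service-account name and namespaces are copied from the error message; the verbs are an assumption):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: stork-scheduler-leases
  namespace: kube-system
rules:
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: stork-scheduler-leases
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: stork-scheduler-leases
subjects:
  - kind: ServiceAccount
    name: linstor-linstor-stork-scheduler
    namespace: linstor
```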
Cheers
Hello !
docker.io/kvaps/linstor-csi:v1.7.1-2 and docker.io/kvaps/linstor-csi:v1.7.1-1 are missing from the Docker repositories.
This breaks the v1.7.1-* releases.
Regards,
Hello, I tried to follow the README. Everything is working except the following:
helm install linstor01 kvaps/kube-linstor --version 1.7.2 \
  --namespace linstor \
  -f linstor.yaml
Error: failed to download "kvaps/kube-linstor" (hint: running helm repo update may help)
linstor-satellite-5b5mz   linstor   192.168.1.143   Waiting: PodInitializing   0   52 minutes
Error: failed to start container "init": Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused "rootfs_linux.go:58: mounting \"/etc/drbd.conf\" to rootfs \"/var/lib/docker/overlay2/8f1475873fe30288ca159ba41aceff5a982cc8e99d0b2362d065c195b53d559b/merged\" at \"/var/lib/docker/overlay2/8f1475873fe30288ca159ba41aceff5a982cc8e99d0b2362d065c195b53d559b/merged/etc/drbd.conf\" caused \"not a directory\""": unknown: Are you trying to mount a directory onto a file (or vice-versa)? Check if the specified host path exists and is the expected type
Back-off restarting failed container
Hi,
I have deployed kube-linstor on k8s following the README, and configured a storage pool and a resource group. How can I use it in k8s now?
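As a sketch, the usual way to consume it from Kubernetes is a StorageClass referencing the resource group, plus a PVC; the group name below is a placeholder for whatever you configured:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-rg
provisioner: linstor.csi.linbit.com
parameters:
  resourceGroup: my-resource-group   # placeholder: your resource group name
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: linstor-rg
  resources:
    requests:
      storage: 1Gi
```

Pods then reference the PVC through a normal persistentVolumeClaim volume.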
Hello,
I do get an error when applying:
# download example values
curl -LO https://github.com/kvaps/kube-linstor/raw/v1.14.0/examples/linstor-db.yaml
# install release
helm install linstor-db kvaps/stolon \
--namespace linstor \
-f linstor-db.yaml
The error given is:
Error: unable to build kubernetes objects from release manifest: [unable to recognize "": no matches for kind "Role" in version "rbac.authorization.k8s.io/v1beta1", unable to recognize "": no matches for kind "RoleBinding" in version "rbac.authorization.k8s.io/v1beta1"]
This probably has to do with the new API versions in Kubernetes. Could you please take a look at it?
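For reference, the fix on the chart side would be moving the RBAC manifests to the stable API; a minimal sketch (the role name and rules here are placeholders, not the chart's actual contents):

```yaml
apiVersion: rbac.authorization.k8s.io/v1   # was rbac.authorization.k8s.io/v1beta1, removed in Kubernetes 1.22
kind: Role
metadata:
  name: example-role        # placeholder
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list", "watch"]
```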
Hi @kvaps,
Is there a way to use the LinstorSatelliteSet custom resource with your charts?
My goal is to create pools automatically.
Is this supposed to work on arm64 as well? I'm trying to install on a few Pi 4 8GB's with Fedora Core OS and k8s 1.21.
linstor   linstor-db-stolon-create-cluster-x7n5m       0/1   CrashLoopBackOff   3   2m54s
linstor   linstor-db-stolon-keeper-0                   0/1   Pending            0   2m54s
linstor   linstor-db-stolon-proxy-7496f59b69-kjm4d     0/1   CrashLoopBackOff   3   2m54s
linstor   linstor-db-stolon-proxy-7496f59b69-nm2x5     0/1   CrashLoopBackOff   3   2m53s
linstor   linstor-db-stolon-proxy-7496f59b69-nrvlt     0/1   CrashLoopBackOff   4   2m54s
linstor   linstor-db-stolon-sentinel-c8875bb99-9wgpw   0/1   CrashLoopBackOff   3   2m54s
linstor   linstor-db-stolon-sentinel-c8875bb99-shdcr   0/1   Error              4   2m54s
linstor   linstor-db-stolon-sentinel-c8875bb99-vbbpg   0/1   Error              4   2m54s
Is thin provisioning supported?
Tried:
linstor storage-pool create lvmthin k8w1 linstor-pool vg/lvmthinpool
and got:
ERROR:
The satellite does not support the device provider LVM_THIN
The Kubernetes node does support it; I tried with the local linstor command.
Hello,
After upgrading to LINSTOR 1.12, I can't create new PVs.
Here are some interesting logs from linstor-controller :
INFO: [HttpServer-1] Started.
11:35:37.611 [Main] INFO LINSTOR/Controller - SYSTEM - Controller initialized
11:35:42.445 [grizzly-http-server-1] ERROR LINSTOR/Controller - SYSTEM - Could not set object '[]' of type String as SQL type: 2005 (CLOB) for column RESOURCE_GROUPS.NODE_NAME_LIST [Report number 609282FD-00000-000000]
11:35:42.640 [grizzly-http-server-0] ERROR LINSTOR/Controller - SYSTEM - Could not set object '[]' of type String as SQL type: 2005 (CLOB) for column RESOURCE_GROUPS.NODE_NAME_LIST [Report number 609282FD-00000-000001]
11:35:42.717 [grizzly-http-server-1] WARN LINSTOR/Controller - SYSTEM - Path '/v1/resource-definitions//resources' not found on server.
11:35:42.756 [grizzly-http-server-0] WARN LINSTOR/Controller - SYSTEM - Path '/v1/resource-definitions//resources' not found on server.
11:35:53.460 [grizzly-http-server-1] ERROR LINSTOR/Controller - SYSTEM - Could not set object '[]' of type String as SQL type: 2005 (CLOB) for column RESOURCE_GROUPS.NODE_NAME_LIST [Report number 609282FD-00000-000002]
11:35:53.545 [grizzly-http-server-0] WARN LINSTOR/Controller - SYSTEM - Path '/v1/resource-definitions//resources' not found on server.
11:35:59.185 [grizzly-http-server-1] ERROR LINSTOR/Controller - SYSTEM - Could not set object '[]' of type String as SQL type: 2005 (CLOB) for column RESOURCE_GROUPS.NODE_NAME_LIST [Report number 609282FD-00000-000003]
11:35:59.252 [grizzly-http-server-0] ERROR LINSTOR/Controller - SYSTEM - Could not set object '[]' of type String as SQL type: 2005 (CLOB) for column RESOURCE_GROUPS.NODE_NAME_LIST [Report number 609282FD-00000-000004]
...
What is weird is that I have no problem exploring existing resources using the linstor CLI, or creating new ones...
However, I can't create a new resource group:
$ linstor rg c test
ERROR:
Description:
Creation of resource group 'test' failed due to an unknown exception.
Details:
Resource group: test
Show reports:
linstor error-reports show 609282FD-00000-000042
command terminated with exit code 10
Any hint?
Cheers
Hello,
Is there a plan to integrate the latest versions of LINSTOR? So far the chart and available images are at 1.14, while 1.17 is already out.
I've got the feeling that Piraeus is more in focus now. In case this chart has no plan to keep up with the latest versions, is there any migration guide to Piraeus (including for those who are using Postgres)?
Anyway, thanks for this piece of work !
Cheers !
Hello, I tried to update / fresh-install 1.11.1-3 and I'm facing this error:
Wait helm template failed. Error: render error in "linstor/templates/configurator-configmap.yaml": template: linstor/templates/configurator-configmap.yaml:12:8: executing "linstor/templates/configurator-configmap.yaml" at <tpl (.Files.Get "scripts/configurator.controller") .>: error calling tpl: Error during tpl function execution for "#!/bin/bash\nset -e\n. $(dirname $0)/functions.sh\n\nload_controller_params\nwait_controller\n\n{{- with .Values.configurator.controller }}\n{{- with .props }}\nconfigure_controller_props {{ mustToJson . | quote }}\n{{- end }}\n\n{{- $selectFilter := dict }}\n{{- range .resourceGroups }}\n{{- range $k, $v := .selectFilter }}\n{{- $_ := set $selectFilter (snakecase $k) $v }}\n{{- end }}\nconfigure_resource_group {{ required \"A valid .Values.configurator.controller.resourceGroups[].name entry required!\" .name | quote }} {{ mustToJson $selectFilter | quote }} {{ mustToJson (.props | default (dict)) | quote }}\n{{- $rg_name := .name }}\n{{- range .volumeGroups }}\nconfigure_volume_group {{ $rg_name | quote }} {{ required \"A valid .Values.configurator.controller.resourceGroups[].volumeGroups[].volumeNumber entry required!\" .volumeNumber | quote }} {{ mustToJson (.props | default (dict)) | quote }}\n{{- end }}\n{{- end }}\n{{- end }}\n\nfinish\n": parse error in "linstor/templates/configurator-configmap.yaml": template: linstor/templates/configurator-configmap.yaml:10: function "mustToJson" not defined : exit status 1; waiting on helm-controller_c-8h99v
Previous versions work fine!
I'm on Kubernetes 1.20.6.
Thanks for this awesome project!!
Hello there!
I'm not sure whether it's intentional or not, but some names changed in the Helm chart, e.g. linstor-linstor-controller became linstor-controller. I upgraded using the command helm -n linstor upgrade linstor kvaps/linstor -f linstor.yaml.
This implies changing some scripts on my side (alias linstor='kubectl exec -n linstor linstor-controller-0 -c linstor-controller -- linstor'), but it also prevents LINSTOR from starting up (linstor-csi-node, linstor-satellite, linstor-controller and linstor-csi-controller cannot find the secret linstor-client-tls). I managed to start my cluster again by changing linstor-client-tls to the existing linstor-linstor-client-tls secret referenced in the above-mentioned daemonsets/statefulset.
Regards!
Hi @kvaps,
I'm wondering how to deploy LINSTOR with packages on a new cluster.
I have 2 master/controller nodes and 3 worker nodes.
In your examples, you add tolerations to deploy on masters. Is this a best practice for production?
Can you describe the best way to do it?
Thanks for your great work.
Trying out this project, as it ought to be simpler in comparison with Piraeus.
It looks to me like the stork version in values.yml should be v1.12.3.
thanks,
I set up according to the README (I also have DRBD installed and the kernel module loaded on all of my nodes). I am currently receiving the following error when I attempt to provision a PVC:
linstor.csi.linbit.com_linstor-linstor-csi-controller-0_10eda589-b1f3-4c8d-b419-c3ec8ff6a76a failed to provision volume with StorageClass "linstor": rpc error: code = Internal desc = CreateVolume failed for pvc-77571758-dd40-41e9-92ed-0d6be92af48f: Message: 'Satellite 'r001n02' does not support the following layers: [DRBD]'; Details: 'Auto-placing resource: pvc-77571758-dd40-41e9-92ed-0d6be92af48f'; Reports: '[5F1B8A25-00000-000010]'
The local node name also matches up in LINSTOR. I'm sure I'm just doing something stupid, but I would appreciate some debug/resolution steps I could take :)
Hi, I deployed a cluster and ran it for a period of time. I often find some of the resources in an 'unknown' state. When I check the satellites, they are in an 'offline' state; it recovers after I restart the satellites.
Are the satellites unstable, or is there something I missed? I use version 1.13.0.
selector:
  matchLabels:
    app: {{ template "linstor.fullname" . }}-csi-controller
    role: linstor-csi
to fix "error validating data: ValidationError(StatefulSet.spec): missing required field "selector" in io.k8s.api.apps.v1.StatefulSetSpec"
Template the kube-linstor chart, and apply it:
cd helm
helm template kube-linstor \
  --namespace linstor \
  --set controller.db.user=linstor \
  --set controller.db.password=hackme \
  --set controller.db.connectionUrl=jdbc:postgresql://linstor-db-stolon-proxy/linstor \
  --set controller.nodeSelector.node-role\\.kubernetes\\.io/master= \
  --set controller.tolerations[0].effect=NoSchedule,controller.tolerations[0].key=node-role.kubernetes.io/master \
  --set satellite.tolerations[0].effect=NoSchedule,satellite.tolerations[0].key=node-role.kubernetes.io/master \
  --set csi.controller.nodeSelector.node-role\\.kubernetes\\.io/master= \
  --set csi.controller.tolerations[0].effect=NoSchedule,csi.controller.tolerations[0].key=node-role.kubernetes.io/master \
  --set csi.node.tolerations[0].effect=NoSchedule,csi.node.tolerations[0].key=node-role.kubernetes.io/master \
  > linstor.yaml
kubectl create -f linstor-db.yaml -n linstor
kubectl create -f linstor.yaml -n linstor
configmap/linstor-satellite created
daemonset.apps/linstor-satellite created
configmap/linstor-stunnel created
secret/linstor-stunnel created
service/linstor-stunnel created
statefulset.apps/linstor-controller created
secret/linstor-controller created
the namespace from the provided object "kube-system" does not match the namespace "linstor". You must pass '--namespace=kube-system' to perform this operation.
the namespace from the provided object "kube-system" does not match the namespace "linstor". You must pass '--namespace=kube-system' to perform this operation.
... etc.
Thanks for all your work on this project!
I've recently been doing some evaluations of the storage-plugin ecosystem for k8s and came across LINSTOR, this project, and the Piraeus plugins that use and extend drbd9. I'd love some documentation on the differences in approach/implementation between these; a small section in the README would be great.
Personally I'm just getting up to speed on how drbd9 works, what the options do, and how well it works with Kubernetes. Up until now I've only looked at Ceph (via Rook), Longhorn (via OpenEBS Jiva), OpenEBS cStor and OpenEBS Mayastor. LINSTOR looks like an excellent alternative, but I'm a bit unclear as to the difference between the three projects (this project, LINBIT/linstor-server and piraeusdatastore/piraeus-operator).
This is my first time working with DRBD and Linstor. Following the README, I can get most everything up and running. However, the CSI plugin fails to register on the nodes. I can provision PVCs from Kubernetes but pods won't attach to them, which is strange.
Here are logs from the csi-node pod:
I0108 15:00:47.982707 1 main.go:113] Version: v2.2.0
I0108 15:00:47.983302 1 main.go:137] Attempting to open a gRPC connection with: "/csi/csi.sock"
I0108 15:00:47.983328 1 connection.go:153] Connecting to unix:///csi/csi.sock
I0108 15:00:47.983742 1 main.go:144] Calling CSI driver to discover driver name
I0108 15:00:47.983775 1 connection.go:182] GRPC call: /csi.v1.Identity/GetPluginInfo
I0108 15:00:47.983787 1 connection.go:183] GRPC request: {}
I0108 15:00:47.987578 1 connection.go:185] GRPC response: {"name":"linstor.csi.linbit.com","vendor_version":"v0.13.1"}
I0108 15:00:47.987648 1 connection.go:186] GRPC error: <nil>
I0108 15:00:47.987657 1 main.go:154] CSI driver name: "linstor.csi.linbit.com"
I0108 15:00:47.987714 1 node_register.go:52] Starting Registration Server at: /registration/linstor.csi.linbit.com-reg.sock
I0108 15:00:47.987884 1 node_register.go:61] Registration Server started at: /registration/linstor.csi.linbit.com-reg.sock
I0108 15:00:47.987938 1 node_register.go:83] Skipping healthz server because HTTP endpoint is set to: ""
I0108 15:00:49.379501 1 main.go:80] Received GetInfo call: &InfoRequest{}
I0108 15:00:49.412651 1 main.go:90] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:false,Error:RegisterPlugin error -- plugin registration failed with err: rpc error: code = Unknown desc = failed to retrieve node topology: failed to get storage pools for node: 404 Not Found,}
E0108 15:00:49.412696 1 main.go:92] Registration process failed with error: RegisterPlugin error -- plugin registration failed with err: rpc error: code = Unknown desc = failed to retrieve node topology: failed to get storage pools for node: 404 Not Found, restarting registration container.
Here is a message from a pod attempting to mount a provisioned PVC:
AttachVolume.Attach failed for volume "pvc-de8fe93c-7431-4e62-981f-49b2541a95de" : CSINode rke1w does not contain driver linstor.csi.linbit.com
However, I can plainly see the storage was provisioned:
What might I be doing wrong? When installing from helm I'm pretty much using your example values.
Thanks.