[QUESTION] longhorn-manager pods in CrashLoopBackOff state and longhorn-uninstall pods in Error state (longhorn/longhorn)

Comments (10)

derekbit commented on September 27, 2024

Do you see error log messages in the longhorn-uninstall pod?

from longhorn.

raphael10-collab commented on September 27, 2024

Hi @derekbit

Yesterday I also had issues during the uninstallation process, and I had to clean everything up by following these steps:
https://longhorn.io/docs/1.6.1/deploy/uninstall/#uninstalling-longhorn-using-kubectl

Do you want me to try again?

derekbit commented on September 27, 2024

Yes, if you are available. Thank you.

raphael10-collab commented on September 27, 2024

root@k8s-eu-1-control-plane-node-1:~# helm repo add longhorn https://charts.longhorn.io
"longhorn" already exists with the same configuration, skipping
root@k8s-eu-1-control-plane-node-1:~# helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "longhorn" chart repository
...Successfully got an update from the "cilium" chart repository
...Successfully got an update from the "prometheus-community" chart repository
Update Complete. ⎈Happy Helming!⎈
root@k8s-eu-1-control-plane-node-1:~# helm repo ls
NAME                	URL                                               
cilium              	https://helm.cilium.io/                           
prometheus-community	https://prometheus-community.github.io/helm-charts

Installed:
root@k8s-eu-1-control-plane-node-1:~# helm install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace --version 1.6.1
NAME: longhorn
LAST DEPLOYED: Fri Apr 12 10:22:14 2024
NAMESPACE: longhorn-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Longhorn is now installed on the cluster!

Please wait a few minutes for other Longhorn components such as CSI deployments, Engine Images, and Instance Managers to be initialized.

Visit our documentation at https://longhorn.io/docs/

Pods in CrashLoopBackOff:

root@k8s-eu-1-control-plane-node-1:~# kubectl -n longhorn-system get pod
NAME                                     READY   STATUS             RESTARTS      AGE
longhorn-driver-deployer-9f4fc86-tmjsf   0/1     Init:0/1           0             10s
longhorn-manager-9tnbl                   0/1     CrashLoopBackOff   1 (7s ago)    10s
longhorn-manager-j68jb                   0/1     CrashLoopBackOff   1 (14h ago)   10s
longhorn-manager-ltqlh                   0/1     CrashLoopBackOff   1 (6s ago)    10s
longhorn-manager-ph4lq                   0/1     CrashLoopBackOff   1 (7s ago)    10s
longhorn-manager-xm94d                   0/1     CrashLoopBackOff   1 (6s ago)    10s
longhorn-ui-5f4b7bbf69-6f9gk             1/1     Running            0             10s
longhorn-ui-5f4b7bbf69-s5xf8             1/1     Running            0             10s

Pod longhorn-manager-9tnbl in CrashLoopBackOff state:

root@k8s-eu-1-control-plane-node-1:~# kubectl -n longhorn-system describe pod longhorn-manager-9tnbl
Name:                 longhorn-manager-9tnbl
Namespace:            longhorn-system
Priority:             1000000000
Priority Class Name:  longhorn-critical
Service Account:      longhorn-service-account
Node:                 k8s-eu-1-worker-1/38.242.249.60
Start Time:           Fri, 12 Apr 2024 10:22:16 +0200
Labels:               app=longhorn-manager
                      app.kubernetes.io/instance=longhorn
                      app.kubernetes.io/managed-by=Helm
                      app.kubernetes.io/name=longhorn
                      app.kubernetes.io/version=v1.6.1
                      controller-revision-hash=758dbbd69b
                      helm.sh/chart=longhorn-1.6.1
                      pod-template-generation=1
Annotations:          <none>
Status:               Running
IP:                   10.0.3.97
IPs:
  IP:           10.0.3.97
Controlled By:  DaemonSet/longhorn-manager
Containers:
  longhorn-manager:
    Container ID:  cri-o://73872dfefe63223a39e895accd5af424839bd9e16b5767409e4520a6c705d2a7
    Image:         longhornio/longhorn-manager:v1.6.1
    Image ID:      docker.io/longhornio/longhorn-manager@sha256:69a4dde599ab9fb2e506ea2eeed4f939b8c282fd7a073f8ba2ddccf3f844089e
    Ports:         9500/TCP, 9501/TCP, 9502/TCP, 9503/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      longhorn-manager
      -d
      daemon
      --engine-image
      longhornio/longhorn-engine:v1.6.1
      --instance-manager-image
      longhornio/longhorn-instance-manager:v1.6.1
      --share-manager-image
      longhornio/longhorn-share-manager:v1.6.1
      --backing-image-manager-image
      longhornio/backing-image-manager:v1.6.1
      --support-bundle-manager-image
      longhornio/support-bundle-kit:v0.0.36
      --manager-image
      longhornio/longhorn-manager:v1.6.1
      --service-account
      longhorn-service-account
      --upgrade-version-check
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 12 Apr 2024 10:22:59 +0200
      Finished:     Fri, 12 Apr 2024 10:22:59 +0200
    Ready:          False
    Restart Count:  3
    Readiness:      http-get https://:9501/v1/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAMESPACE:  longhorn-system (v1:metadata.namespace)
      POD_IP:          (v1:status.podIP)
      NODE_NAME:       (v1:spec.nodeName)
    Mounts:
      /host/dev/ from dev (rw)
      /host/proc/ from proc (rw)
      /tls-files/ from longhorn-grpc-tls (rw)
      /var/lib/longhorn/ from longhorn (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-st66w (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  dev:
    Type:          HostPath (bare host directory volume)
    Path:          /dev/
    HostPathType:  
  proc:
    Type:          HostPath (bare host directory volume)
    Path:          /proc/
    HostPathType:  
  longhorn:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/longhorn/
    HostPathType:  
  longhorn-grpc-tls:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  longhorn-grpc-tls
    Optional:    true
  kube-api-access-st66w:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  52s               default-scheduler  Successfully assigned longhorn-system/longhorn-manager-9tnbl to k8s-eu-1-worker-1
  Warning  Unhealthy  51s               kubelet            Readiness probe failed: Get "https://10.0.3.97:9501/v1/healthz": dial tcp 10.0.3.97:9501: connect: connection refused
  Normal   Pulled     9s (x4 over 51s)  kubelet            Container image "longhornio/longhorn-manager:v1.6.1" already present on machine
  Normal   Created    9s (x4 over 51s)  kubelet            Created container longhorn-manager
  Normal   Started    9s (x4 over 51s)  kubelet            Started container longhorn-manager
  Warning  BackOff    2s (x7 over 49s)  kubelet            Back-off restarting failed container longhorn-manager in pod longhorn-manager-9tnbl_longhorn-system(c6201bc6-ba40-4a28-8e2e-e5d7447eeea7)

Pod longhorn-manager-j68jb in CrashLoopBackOff state:

root@k8s-eu-1-control-plane-node-1:~# kubectl -n longhorn-system describe pod longhorn-manager-j68jb
Name:                 longhorn-manager-j68jb
Namespace:            longhorn-system
Priority:             1000000000
Priority Class Name:  longhorn-critical
Service Account:      longhorn-service-account
Node:                 k8s-eu-1-worker-2/38.242.249.121
Start Time:           Thu, 11 Apr 2024 19:24:11 +0200
Labels:               app=longhorn-manager
                      app.kubernetes.io/instance=longhorn
                      app.kubernetes.io/managed-by=Helm
                      app.kubernetes.io/name=longhorn
                      app.kubernetes.io/version=v1.6.1
                      controller-revision-hash=758dbbd69b
                      helm.sh/chart=longhorn-1.6.1
                      pod-template-generation=1
Annotations:          <none>
Status:               Running
IP:                   10.0.4.100
IPs:
  IP:           10.0.4.100
Controlled By:  DaemonSet/longhorn-manager
Containers:
  longhorn-manager:
    Container ID:  cri-o://e709bebf4c15117fbb9f7e79a06e3a584109dfb0b9c49b445985121973e1de4b
    Image:         longhornio/longhorn-manager:v1.6.1
    Image ID:      docker.io/longhornio/longhorn-manager@sha256:69a4dde599ab9fb2e506ea2eeed4f939b8c282fd7a073f8ba2ddccf3f844089e
    Ports:         9500/TCP, 9501/TCP, 9502/TCP, 9503/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      longhorn-manager
      -d
      daemon
      --engine-image
      longhornio/longhorn-engine:v1.6.1
      --instance-manager-image
      longhornio/longhorn-instance-manager:v1.6.1
      --share-manager-image
      longhornio/longhorn-share-manager:v1.6.1
      --backing-image-manager-image
      longhornio/backing-image-manager:v1.6.1
      --support-bundle-manager-image
      longhornio/support-bundle-kit:v0.0.36
      --manager-image
      longhornio/longhorn-manager:v1.6.1
      --service-account
      longhorn-service-account
      --upgrade-version-check
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 12 Apr 2024 10:24:47 +0200
      Finished:     Fri, 12 Apr 2024 10:24:47 +0200
    Ready:          False
    Restart Count:  5
    Readiness:      http-get https://:9501/v1/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAMESPACE:  longhorn-system (v1:metadata.namespace)
      POD_IP:          (v1:status.podIP)
      NODE_NAME:       (v1:spec.nodeName)
    Mounts:
      /host/dev/ from dev (rw)
      /host/proc/ from proc (rw)
      /tls-files/ from longhorn-grpc-tls (rw)
      /var/lib/longhorn/ from longhorn (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zbfld (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  dev:
    Type:          HostPath (bare host directory volume)
    Path:          /dev/
    HostPathType:  
  proc:
    Type:          HostPath (bare host directory volume)
    Path:          /proc/
    HostPathType:  
  longhorn:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/longhorn/
    HostPathType:  
  longhorn-grpc-tls:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  longhorn-grpc-tls
    Optional:    true
  kube-api-access-zbfld:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Warning  Unhealthy  15h                kubelet            Readiness probe failed: Get "https://10.0.4.100:9501/v1/healthz": dial tcp 10.0.4.100:9501: connect: connection refused
  Warning  BackOff    14h (x9 over 15h)  kubelet            Back-off restarting failed container longhorn-manager in pod longhorn-manager-j68jb_longhorn-system(dd9342ad-d5f9-4308-9b8c-18518e8f6d62)
  Normal   Pulled     14h (x5 over 15h)  kubelet            Container image "longhornio/longhorn-manager:v1.6.1" already present on machine
  Normal   Created    14h (x5 over 15h)  kubelet            Created container longhorn-manager
  Normal   Started    14h (x5 over 15h)  kubelet            Started container longhorn-manager
  Normal   Scheduled  2m41s              default-scheduler  Successfully assigned longhorn-system/longhorn-manager-j68jb to k8s-eu-1-worker-2

Pod longhorn-manager-ltqlh in CrashLoopBackOff state:

root@k8s-eu-1-control-plane-node-1:~# kubectl -n longhorn-system describe pod longhorn-manager-ltqlh
Name:                 longhorn-manager-ltqlh
Namespace:            longhorn-system
Priority:             1000000000
Priority Class Name:  longhorn-critical
Service Account:      longhorn-service-account
Node:                 k8s-eu-1-worker-4/38.242.249.255
Start Time:           Fri, 12 Apr 2024 10:22:16 +0200
Labels:               app=longhorn-manager
                      app.kubernetes.io/instance=longhorn
                      app.kubernetes.io/managed-by=Helm
                      app.kubernetes.io/name=longhorn
                      app.kubernetes.io/version=v1.6.1
                      controller-revision-hash=758dbbd69b
                      helm.sh/chart=longhorn-1.6.1
                      pod-template-generation=1
Annotations:          <none>
Status:               Running
IP:                   10.0.6.161
IPs:
  IP:           10.0.6.161
Controlled By:  DaemonSet/longhorn-manager
Containers:
  longhorn-manager:
    Container ID:  cri-o://6d3d10b942ad1532acba5c5d29a8517f5e40bde07b6e3c95b3015f4a53473ce6
    Image:         longhornio/longhorn-manager:v1.6.1
    Image ID:      docker.io/longhornio/longhorn-manager@sha256:69a4dde599ab9fb2e506ea2eeed4f939b8c282fd7a073f8ba2ddccf3f844089e
    Ports:         9500/TCP, 9501/TCP, 9502/TCP, 9503/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      longhorn-manager
      -d
      daemon
      --engine-image
      longhornio/longhorn-engine:v1.6.1
      --instance-manager-image
      longhornio/longhorn-instance-manager:v1.6.1
      --share-manager-image
      longhornio/longhorn-share-manager:v1.6.1
      --backing-image-manager-image
      longhornio/backing-image-manager:v1.6.1
      --support-bundle-manager-image
      longhornio/support-bundle-kit:v0.0.36
      --manager-image
      longhornio/longhorn-manager:v1.6.1
      --service-account
      longhorn-service-account
      --upgrade-version-check
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 12 Apr 2024 10:25:32 +0200
      Finished:     Fri, 12 Apr 2024 10:25:33 +0200
    Ready:          False
    Restart Count:  5
    Readiness:      http-get https://:9501/v1/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAMESPACE:  longhorn-system (v1:metadata.namespace)
      POD_IP:          (v1:status.podIP)
      NODE_NAME:       (v1:spec.nodeName)
    Mounts:
      /host/dev/ from dev (rw)
      /host/proc/ from proc (rw)
      /tls-files/ from longhorn-grpc-tls (rw)
      /var/lib/longhorn/ from longhorn (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rzg2z (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  dev:
    Type:          HostPath (bare host directory volume)
    Path:          /dev/
    HostPathType:  
  proc:
    Type:          HostPath (bare host directory volume)
    Path:          /proc/
    HostPathType:  
  longhorn:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/longhorn/
    HostPathType:  
  longhorn-grpc-tls:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  longhorn-grpc-tls
    Optional:    true
  kube-api-access-rzg2z:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  5m45s                  default-scheduler  Successfully assigned longhorn-system/longhorn-manager-ltqlh to k8s-eu-1-worker-4
  Warning  Unhealthy  5m42s (x2 over 5m43s)  kubelet            Readiness probe failed: Get "https://10.0.6.161:9501/v1/healthz": dial tcp 10.0.6.161:9501: connect: connection refused
  Normal   Created    4m47s (x4 over 5m44s)  kubelet            Created container longhorn-manager
  Normal   Started    4m47s (x4 over 5m44s)  kubelet            Started container longhorn-manager
  Normal   Pulled     3m53s (x5 over 5m44s)  kubelet            Container image "longhornio/longhorn-manager:v1.6.1" already present on machine
  Warning  BackOff    43s (x27 over 5m40s)   kubelet            Back-off restarting failed container longhorn-manager in pod longhorn-manager-ltqlh_longhorn-system(f449c247-4103-4416-a50a-e6b1a004a10a)

Pod longhorn-manager-ph4lq in CrashLoopBackOff state:

root@k8s-eu-1-control-plane-node-1:~# kubectl -n longhorn-system describe pod longhorn-manager-ph4lq
Name:                 longhorn-manager-ph4lq
Namespace:            longhorn-system
Priority:             1000000000
Priority Class Name:  longhorn-critical
Service Account:      longhorn-service-account
Node:                 k8s-eu-1-control-plane-node-1/38.242.250.38
Start Time:           Fri, 12 Apr 2024 10:22:16 +0200
Labels:               app=longhorn-manager
                      app.kubernetes.io/instance=longhorn
                      app.kubernetes.io/managed-by=Helm
                      app.kubernetes.io/name=longhorn
                      app.kubernetes.io/version=v1.6.1
                      controller-revision-hash=758dbbd69b
                      helm.sh/chart=longhorn-1.6.1
                      pod-template-generation=1
Annotations:          <none>
Status:               Running
IP:                   10.0.0.221
IPs:
  IP:           10.0.0.221
Controlled By:  DaemonSet/longhorn-manager
Containers:
  longhorn-manager:
    Container ID:  cri-o://6b7f8f622c66d1228368a5c3ed9ffee34ca843b6ccd058fdd8c6a3e4ea7a5416
    Image:         longhornio/longhorn-manager:v1.6.1
    Image ID:      docker.io/longhornio/longhorn-manager@sha256:69a4dde599ab9fb2e506ea2eeed4f939b8c282fd7a073f8ba2ddccf3f844089e
    Ports:         9500/TCP, 9501/TCP, 9502/TCP, 9503/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP, 0/TCP
    Command:
      longhorn-manager
      -d
      daemon
      --engine-image
      longhornio/longhorn-engine:v1.6.1
      --instance-manager-image
      longhornio/longhorn-instance-manager:v1.6.1
      --share-manager-image
      longhornio/longhorn-share-manager:v1.6.1
      --backing-image-manager-image
      longhornio/backing-image-manager:v1.6.1
      --support-bundle-manager-image
      longhornio/support-bundle-kit:v0.0.36
      --manager-image
      longhornio/longhorn-manager:v1.6.1
      --service-account
      longhorn-service-account
      --upgrade-version-check
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 12 Apr 2024 10:28:17 +0200
      Finished:     Fri, 12 Apr 2024 10:28:18 +0200
    Ready:          False
    Restart Count:  6
    Readiness:      http-get https://:9501/v1/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      POD_NAMESPACE:  longhorn-system (v1:metadata.namespace)
      POD_IP:          (v1:status.podIP)
      NODE_NAME:       (v1:spec.nodeName)
    Mounts:
      /host/dev/ from dev (rw)
      /host/proc/ from proc (rw)
      /tls-files/ from longhorn-grpc-tls (rw)
      /var/lib/longhorn/ from longhorn (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-wcj4q (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  dev:
    Type:          HostPath (bare host directory volume)
    Path:          /dev/
    HostPathType:  
  proc:
    Type:          HostPath (bare host directory volume)
    Path:          /proc/
    HostPathType:  
  longhorn:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/longhorn/
    HostPathType:  
  longhorn-grpc-tls:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  longhorn-grpc-tls
    Optional:    true
  kube-api-access-wcj4q:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  6m55s                  default-scheduler  Successfully assigned longhorn-system/longhorn-manager-ph4lq to k8s-eu-1-control-plane-node-1
  Warning  Unhealthy  6m53s                  kubelet            Readiness probe failed: Get "https://10.0.0.221:9501/v1/healthz": dial tcp 10.0.0.221:9501: connect: connection refused
  Normal   Started    6m (x4 over 6m54s)     kubelet            Started container longhorn-manager
  Normal   Pulled     5m16s (x5 over 6m54s)  kubelet            Container image "longhornio/longhorn-manager:v1.6.1" already present on machine
  Normal   Created    5m16s (x5 over 6m54s)  kubelet            Created container longhorn-manager
  Warning  BackOff    106s (x29 over 6m51s)  kubelet            Back-off restarting failed container longhorn-manager in pod longhorn-manager-ph4lq_longhorn-system(acca9bba-a5de-4d7d-b20f-e92d5d22acd0)

derekbit commented on September 27, 2024

Can you check the log messages of the longhorn-manager pods?
For example:
kubectl -n longhorn-system logs longhorn-manager-9tnbl -f
kubectl -n longhorn-system logs longhorn-manager-9tnbl --previous
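
To run the same check across every manager pod at once, a loop along these lines may help. It is a minimal sketch: the `app=longhorn-manager` label is taken from the `describe` output earlier in this thread, while the function name `dump_manager_logs` and the 20-line tail are illustrative assumptions.

```shell
# Sketch: print the last lines of the previous (crashed) container for each
# longhorn-manager pod. dump_manager_logs is a hypothetical helper name.
dump_manager_logs() {
  ns="${1:-longhorn-system}"
  # -o name yields entries such as "pod/longhorn-manager-9tnbl"
  for pod in $(kubectl -n "$ns" get pods -l app=longhorn-manager -o name); do
    echo "== ${pod} =="
    # --previous reads the last terminated container; fall back to the
    # current container if no previous instance exists yet.
    kubectl -n "$ns" logs "$pod" --previous --tail=20 2>/dev/null \
      || kubectl -n "$ns" logs "$pod" --tail=20
  done
}
```

Calling `dump_manager_logs` with no argument targets the `longhorn-system` namespace used in this thread; a different namespace can be passed as the first argument.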

raphael10-collab commented on September 27, 2024

Logs of longhorn-manager-9tnbl:

root@k8s-eu-1-control-plane-node-1:~# kubectl -n longhorn-system logs longhorn-manager-9tnbl -f
time="2024-04-12T08:38:20Z" level=fatal msg="Error starting manager: Failed environment check, please make sure you have iscsiadm/open-iscsi installed on the host: failed to execute: /usr/bin/nsenter [nsenter --mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm --version], output , stderr nsenter: failed to execute iscsiadm: No such file or directory\n: exit status 127" func=main.main.DaemonCmd.func3 file="daemon.go:92"

root@k8s-eu-1-control-plane-node-1:~# kubectl -n longhorn-system logs longhorn-manager-9tnbl --previous
time="2024-04-12T08:38:20Z" level=fatal msg="Error starting manager: Failed environment check, please make sure you have iscsiadm/open-iscsi installed on the host: failed to execute: /usr/bin/nsenter [nsenter --mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net iscsiadm --version], output , stderr nsenter: failed to execute iscsiadm: No such file or directory\n: exit status 127" func=main.main.DaemonCmd.func3 file="daemon.go:92"


root@k8s-eu-1-control-plane-node-1:~# kubectl -n longhorn-system logs longhorn-manager-j68jb -f
Error from server: Get "https://38.242.249.121:10250/containerLogs/longhorn-system/longhorn-manager-j68jb/longhorn-manager?follow=true": dial tcp 38.242.249.121:10250: i/o timeout
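
The two failures above have different causes: the first is a missing `iscsiadm` binary on the host, the second is the API server timing out while reaching the kubelet port (10250) on the node. A small filter such as the following could tell them apart when scanning many pods. This is only a sketch; `classify_manager_log` and the labels it prints are hypothetical names, and the match strings are copied from the messages shown above.

```shell
# Sketch: map log or error text (stdin) to a short diagnosis label.
# classify_manager_log and its labels are hypothetical names.
classify_manager_log() {
  log="$(cat)"
  case "$log" in
    *"failed to execute iscsiadm: No such file or directory"*)
      echo "missing-iscsiadm" ;;      # open-iscsi is not installed on the host
    *":10250: i/o timeout"*)
      echo "kubelet-unreachable" ;;   # API server cannot reach the node's kubelet
    *)
      echo "unknown" ;;
  esac
}
```

For example, `kubectl -n longhorn-system logs longhorn-manager-9tnbl --previous 2>&1 | classify_manager_log` would print `missing-iscsiadm` for the log line shown above.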

derekbit commented on September 27, 2024

OK. You need to install the iSCSI utility and some dependencies on every node. Please see https://longhorn.io/docs/1.6.1/deploy/install/.
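
As a rough sketch of what that looks like on a node: the helper below checks whether `iscsiadm` is on the PATH and prints an install command for the distribution family. The distribution of the nodes is not stated in this thread, so the ID values are assumptions, and `iscsi_install_cmd` and `has_iscsiadm` are hypothetical helper names; the linked documentation remains the authoritative list of requirements.

```shell
# Sketch: suggest an install command for the iSCSI initiator tools.
# The ID argument is expected to match the ID field of /etc/os-release.
iscsi_install_cmd() {
  case "$1" in
    ubuntu|debian)
      echo "apt-get install -y open-iscsi" ;;
    rhel|centos|rocky|fedora)
      echo "yum install -y iscsi-initiator-utils" ;;
    *)
      echo "unsupported: $1"
      return 1 ;;
  esac
}

# Succeeds only if iscsiadm can be found on this host.
has_iscsiadm() {
  command -v iscsiadm >/dev/null 2>&1
}
```

On a node, one might source `/etc/os-release`, call `has_iscsiadm`, and run the command printed by `iscsi_install_cmd "$ID"` as root if the check fails, then enable the daemon with `systemctl enable --now iscsid`.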

raphael10-collab commented on September 27, 2024

I've installed the iSCSI tools on all the nodes except one, which I'm not able to connect to today; I've contacted the Contabo Cloud Support Team about it.

So I guess the whole process will take a while today.

derekbit commented on September 27, 2024

OK! Take your time. Hope we can get good news soon! :)

raphael10-collab commented on September 27, 2024

It seems that all the requirements are now fulfilled on all Kubernetes nodes:

root@k8s-eu-1-control-plane-node-1:~# curl -sSfL https://raw.githubusercontent.com/longhorn/longhorn/v1.6.1/scripts/environment_check.sh | bash
[INFO]  Required dependencies 'kubectl jq mktemp sort printf' are installed.
[INFO]  All nodes have unique hostnames.
[INFO]  Waiting for longhorn-environment-check pods to become ready (0/5)...
[INFO]  All longhorn-environment-check pods are ready (5/5).
[INFO]  MountPropagation is enabled
[INFO]  Checking kernel release...
[INFO]  Checking iscsid...
[INFO]  Checking multipathd...
[INFO]  Checking packages...
[INFO]  Checking nfs client...
[INFO]  Cleaning up longhorn-environment-check pods...
[INFO]  Cleanup completed.
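
When the check is run in a script rather than read by eye, its output can be filtered for problems. The sketch below assumes the script marks problems with `[ERROR]` or `[WARN]` prefixes, by analogy with the `[INFO]` prefix; the run above shows only `[INFO]` lines, so that is an assumption. `env_check_clean` is a hypothetical helper name.

```shell
# Sketch: succeed only if the environment-check output (stdin) contains no
# lines starting with [ERROR] or [WARN]; any such lines are printed.
env_check_clean() {
  ! grep -E '^\[(ERROR|WARN)\]'
}
```

Piping the script output through it, for example `... | bash 2>&1 | env_check_clean`, would then return a non-zero status if any node fails a requirement.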

root@k8s-eu-1-control-plane-node-1:~# kubectl version
Client Version: v1.28.8
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.8

I reinstalled Longhorn:

root@k8s-eu-1-control-plane-node-1:~# helm repo add longhorn https://charts.longhorn.io
"longhorn" already exists with the same configuration, skipping
root@k8s-eu-1-control-plane-node-1:~# helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "longhorn" chart repository
...Successfully got an update from the "cilium" chart repository
...Successfully got an update from the "prometheus-community" chart repository
Update Complete. ⎈Happy Helming!⎈
root@k8s-eu-1-control-plane-node-1:~# helm repo ls
NAME                	URL                                               
cilium              	https://helm.cilium.io/                           
prometheus-community	https://prometheus-community.github.io/helm-charts
longhorn            	https://charts.longhorn.io 

root@k8s-eu-1-control-plane-node-1:~# helm install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace --version 1.6.1
NAME: longhorn
LAST DEPLOYED: Mon Apr 15 09:39:06 2024
NAMESPACE: longhorn-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Longhorn is now installed on the cluster!

Please wait a few minutes for other Longhorn components such as CSI deployments, Engine Images, and Instance Managers to be initialized.

Visit our documentation at https://longhorn.io/docs/

And now the pods are all in Running state:

root@k8s-eu-1-control-plane-node-1:~# kubectl -n longhorn-system get pod
NAME                                                READY   STATUS    RESTARTS   AGE
csi-attacher-5c4bfdcf59-bdggn                       1/1     Running   0          4m3s
csi-attacher-5c4bfdcf59-glr9c                       1/1     Running   0          4m3s
csi-attacher-5c4bfdcf59-wbn9d                       1/1     Running   0          4m3s
csi-provisioner-667796df57-bkxgf                    1/1     Running   0          4m3s
csi-provisioner-667796df57-g2d57                    1/1     Running   0          4m3s
csi-provisioner-667796df57-p99tm                    1/1     Running   0          4m3s
csi-resizer-694f8f5f64-48h6q                        1/1     Running   0          4m3s
csi-resizer-694f8f5f64-4wtt2                        1/1     Running   0          4m3s
csi-resizer-694f8f5f64-n8m6w                        1/1     Running   0          4m3s
csi-snapshotter-959b69d4b-8v2nz                     1/1     Running   0          4m3s
csi-snapshotter-959b69d4b-dxvvz                     1/1     Running   0          4m3s
csi-snapshotter-959b69d4b-rmwxg                     1/1     Running   0          4m3s
engine-image-ei-5cefaf2b-2r7td                      1/1     Running   0          4m10s
engine-image-ei-5cefaf2b-7v899                      1/1     Running   0          4m10s
engine-image-ei-5cefaf2b-b5fnm                      1/1     Running   0          4m10s
engine-image-ei-5cefaf2b-jdwd6                      1/1     Running   0          4m10s
engine-image-ei-5cefaf2b-tbwm2                      1/1     Running   0          4m10s
instance-manager-350386a73f401a602f1783938be991c0   1/1     Running   0          4m10s
instance-manager-4f65765c001f2b941bc5d728b65d4a32   1/1     Running   0          4m10s
instance-manager-51492326327041ac5956619303cd25e7   1/1     Running   0          4m6s
instance-manager-58ba023cfd09576814ec35835c1d718f   1/1     Running   0          4m10s
instance-manager-df0e119506168285bcc61fb5bbe710ae   1/1     Running   0          4m10s
longhorn-csi-plugin-8gcz2                           3/3     Running   0          4m3s
longhorn-csi-plugin-8gpgr                           3/3     Running   0          4m3s
longhorn-csi-plugin-jkshn                           3/3     Running   0          4m3s
longhorn-csi-plugin-lnr9q                           3/3     Running   0          4m3s
longhorn-csi-plugin-wshnd                           3/3     Running   0          4m3s
longhorn-driver-deployer-9f4fc86-w2x6f              1/1     Running   0          4m18s
longhorn-manager-9rr2c                              1/1     Running   0          4m18s
longhorn-manager-b4wk6                              1/1     Running   0          4m18s
longhorn-manager-b9b4r                              1/1     Running   0          4m18s
longhorn-manager-r7zgs                              1/1     Running   0          4m18s
longhorn-manager-wbgv2                              1/1     Running   0          4m18s
longhorn-ui-5f4b7bbf69-f6zsz                        1/1     Running   0          4m18s
longhorn-ui-5f4b7bbf69-r7m5w                        1/1     Running   0          4m17s
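
For a longer pod list, the same "everything is Running" check can be done with a small filter over the table. This is a minimal sketch that assumes the default `kubectl get pod` column layout shown above (NAME, READY, STATUS, ...); `not_running_pods` is a hypothetical helper name.

```shell
# Sketch: print the names of pods whose STATUS column is neither Running nor
# Completed, given `kubectl get pod` table output on stdin.
not_running_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1 }'
}
```

`kubectl -n longhorn-system get pod | not_running_pods` prints nothing when the installation is healthy. Alternatively, `kubectl -n longhorn-system wait --for=condition=Ready pod --all --timeout=300s` blocks until every pod reports Ready or the timeout expires.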

Thank you @derekbit for your kind help
