Comments (11)

derekbit commented on September 23, 2024

Could you check the logs of the crashed longhorn-manager pods?

from longhorn.

n00bsi commented on September 23, 2024

@derekbit

# kubectl logs -n longhorn-system longhorn-manager-9w98z
time="2024-03-22T13:01:33Z" level=fatal msg="Error starting manager: Failed environment check, please make sure you have iscsiadm/open-iscsi installed on the host: failed to execute: nsenter [--mount=/host/proc/30419/ns/mnt --net=/host/proc/30419/ns/net iscsiadm --version], output , stderr nsenter: failed to execute iscsiadm: No such file or directory\n: exit status 127" func=main.main.DaemonCmd.func3 file="daemon.go:84"
[root@k8sn22 webdemo]# kubectl logs -n longhorn-system longhorn-ui-7c679b6d4d-d8j6v
nginx: [emerg] host not found in upstream "longhorn-backend" in /var/config/nginx/nginx.conf:32

# kubectl describe -n longhorn-system longhorn-driver-deployer-65f7bb646-5qx4t
error: the server doesn't have a resource type "longhorn-driver-deployer-65f7bb646-5qx4t"
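
The `describe` error above comes from the missing resource type: kubectl reads the first argument as a type rather than as a pod name. A minimal sketch of the intended call, wrapped in a hypothetical `describe_pod` helper (the helper name is an assumption, not part of kubectl):

```shell
# Hypothetical helper: describe a pod with the resource type given explicitly.
# $1 = namespace, $2 = pod name
describe_pod() {
  kubectl describe pod -n "$1" "$2"
}

# Usage (pod name taken from the output above):
#   describe_pod longhorn-system longhorn-driver-deployer-65f7bb646-5qx4t
```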

derekbit commented on September 23, 2024

OK. According to the log messages, you need to install the iSCSI utilities and daemon on the host first. Check your distro's documentation for the package names.
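
The failing environment check in the log runs `iscsiadm --version` in the host's namespaces, so a rough host-side equivalent is to check whether `iscsiadm` is on the `PATH` of each node. A minimal sketch, assuming a POSIX shell; `require_cmd` is a hypothetical helper name:

```shell
# Hypothetical helper: report whether a command is available on this host.
# Prints "<name>: found" or "<name>: missing"; returns non-zero if missing.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s: found\n' "$1"
  else
    printf '%s: missing\n' "$1"
    return 1
  fi
}

# Usage, run on every node:
#   require_cmd iscsiadm
```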

n00bsi commented on September 23, 2024

@derekbit

yum install iscsi-initiator-utils

systemctl enable iscsid.service && systemctl start iscsid.service

Installed and started on all nodes, for example:

[root@k8sn22 ~]# ssh [email protected] systemctl status iscsid.service
[email protected]'s password: 
● iscsid.service - Open-iSCSI
     Loaded: loaded (/usr/lib/systemd/system/iscsid.service; disabled; preset: disabled)
     Active: active (running) since Mon 2024-03-25 13:02:59 CET; 4min 38s ago
TriggeredBy: ○ iscsid.socket
       Docs: man:iscsid(8)
             man:iscsiuio(8)
             man:iscsiadm(8)
   Main PID: 2203548 (iscsid)
     Status: "Ready to process requests"
      Tasks: 1 (limit: 11108)
     Memory: 4.4M
        CPU: 17ms
     CGroup: /system.slice/iscsid.service
             └─2203548 /usr/sbin/iscsid -f

Mar 25 13:02:59 k8sn27 systemd[1]: Starting Open-iSCSI...
Mar 25 13:02:59 k8sn27 systemd[1]: Started Open-iSCSI.
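
Note that the `Loaded:` line above reports the unit as `disabled` even though it is running, so it may not come back after that node reboots. A minimal sketch for checking both states on a node; `unit_state` is a hypothetical helper name, while `systemctl is-enabled` and `systemctl is-active` are standard systemd subcommands:

```shell
# Hypothetical helper: print "<enablement> <runtime state>" for a systemd unit,
# for example "disabled active".
unit_state() {
  # Both subcommands exit non-zero for disabled/inactive units but still print
  # the state, so the exit status is ignored here.
  enabled=$(systemctl is-enabled "$1" 2>/dev/null) || true
  active=$(systemctl is-active "$1" 2>/dev/null) || true
  printf '%s %s\n' "${enabled:-unknown}" "${active:-unknown}"
}

# Usage, run on every node:
#   unit_state iscsid.service
```

If this prints `disabled active`, `systemctl enable --now iscsid.service` should set both states in one step.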

# rpm -qa | grep iscsi
iscsi-initiator-utils-iscsiuio-6.2.1.4-3.git2a8f9d8.el9.x86_64
iscsi-initiator-utils-6.2.1.4-3.git2a8f9d8.el9.x86_64



# kubectl get pods -n longhorn-system
NAME                                       READY   STATUS             RESTARTS      AGE
longhorn-driver-deployer-65f7bb646-j2jlc   0/1     Init:0/1           0             2m7s
longhorn-manager-4t7vl                     0/1     CrashLoopBackOff   4 (11s ago)   2m7s
longhorn-manager-mjlv4                     0/1     CrashLoopBackOff   4 (23s ago)   2m7s
longhorn-manager-w8kw6                     0/1     CrashLoopBackOff   4 (9s ago)    2m7s
longhorn-ui-7c679b6d4d-2qlkn               0/1     Error              3 (52s ago)   2m6s
longhorn-ui-7c679b6d4d-s2l6v               0/1     Error              3 (51s ago)   2m6s


# kubectl logs -n longhorn-system  longhorn-manager-mjlv4
W0325 12:08:38.675416       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0325 12:08:38.679029       1 shared_informer.go:311] Waiting for caches to sync for longhorn datastore
I0325 12:08:38.783207       1 shared_informer.go:318] Caches are synced for longhorn datastore
time="2024-03-25T12:08:38Z" level=info msg="Starting longhorn conversion webhook server" func=webhook.StartWebhook file="webhook.go:23"
time="2024-03-25T12:08:38Z" level=info msg="Waiting for conversion webhook to become ready" func=webhook.StartWebhook file="webhook.go:42"
time="2024-03-25T12:08:38Z" level=info msg="Active TLS secret longhorn-system/longhorn-webhook-tls (ver=619693) (count 2): map[listener.cattle.io/cn-longhorn-admission-webhook.longhor-59584d:longhorn-admission-webhook.longhorn-system.svc listener.cattle.io/cn-longhorn-conversion-webhook.longho-6a0089:longhorn-conversion-webhook.longhorn-system.svc listener.cattle.io/fingerprint:SHA1=1793FD59D7681D7D2D1AD1FFBC4582C1A9CC1941]" func="memory.(*memory).Update" file="memory.go:42"
time="2024-03-25T12:08:38Z" level=info msg="Listening on :9501" func=server.ListenAndServe.func2 file="server.go:77"
time="2024-03-25T12:08:38Z" level=info msg="Webhook conversion is ready" func=webhook.StartWebhook file="webhook.go:56"
time="2024-03-25T12:08:38Z" level=warning msg="Started longhorn conversion webhook server" func=webhook.StartWebhook file="webhook.go:74"
time="2024-03-25T12:08:38Z" level=info msg="Starting longhorn admission webhook server" func=webhook.StartWebhook file="webhook.go:23"
time="2024-03-25T12:08:38Z" level=info msg="Waiting for admission webhook to become ready" func=webhook.StartWebhook file="webhook.go:42"
time="2024-03-25T12:08:38Z" level=info msg="Add validaton handler for nodes.longhorn.io (Node)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add validaton handler for settings.longhorn.io (Setting)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add validaton handler for recurringjobs.longhorn.io (RecurringJob)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add validaton handler for backingimages.longhorn.io (BackingImage)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add validaton handler for volumes.longhorn.io (Volume)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add validaton handler for orphans.longhorn.io (Orphan)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add validaton handler for snapshots.longhorn.io (Snapshot)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add validaton handler for supportbundles.longhorn.io (SupportBundle)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add validaton handler for systembackups.longhorn.io (SystemBackup)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add validaton handler for systemrestores.longhorn.io (SystemRestore)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add validaton handler for volumeattachments.longhorn.io (VolumeAttachment)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add validaton handler for engines.longhorn.io (Engine)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add validaton handler for replicas.longhorn.io (Replica)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for backups.longhorn.io (Backup)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for backingimages.longhorn.io (BackingImage)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for backingimagemanagers.longhorn.io (BackingImageManager)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for backingimagedatasources.longhorn.io (BackingImageDataSource)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for nodes.longhorn.io (Node)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for volumes.longhorn.io (Volume)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for engines.longhorn.io (Engine)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for recurringjobs.longhorn.io (RecurringJob)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for engineimages.longhorn.io (EngineImage)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=warning msg="Failed to get webhook health endpoint https://localhost:9502/v1/healthz" func=webhook.StartWebhook file="webhook.go:54" error="Get \"https://localhost:9502/v1/healthz\": dial tcp [::1]:9502: connect: connection refused"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for orphans.longhorn.io (Orphan)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for sharemanagers.longhorn.io (ShareManager)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for backupvolumes.longhorn.io (BackupVolume)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for snapshots.longhorn.io (Snapshot)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for replicas.longhorn.io (Replica)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for supportbundles.longhorn.io (SupportBundle)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for systembackups.longhorn.io (SystemBackup)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Add mutation handler for volumeattachments.longhorn.io (VolumeAttachment)" func=server.addHandler file="handler.go:17"
time="2024-03-25T12:08:38Z" level=info msg="Listening on :9502" func=server.ListenAndServe.func2 file="server.go:77"
time="2024-03-25T12:08:38Z" level=info msg="Active TLS secret longhorn-system/longhorn-webhook-tls (ver=619693) (count 2): map[listener.cattle.io/cn-longhorn-admission-webhook.longhor-59584d:longhorn-admission-webhook.longhorn-system.svc listener.cattle.io/cn-longhorn-conversion-webhook.longho-6a0089:longhorn-conversion-webhook.longhorn-system.svc listener.cattle.io/fingerprint:SHA1=1793FD59D7681D7D2D1AD1FFBC4582C1A9CC1941]" func="memory.(*memory).Update" file="memory.go:42"
time="2024-03-25T12:08:39Z" level=info msg="Starting apiextensions.k8s.io/v1, Kind=CustomResourceDefinition controller" func="controller.(*controller).run" file="controller.go:144"
time="2024-03-25T12:08:39Z" level=info msg="Starting /v1, Kind=Secret controller" func="controller.(*controller).run" file="controller.go:144"
time="2024-03-25T12:08:39Z" level=info msg="Building conversion rules..." func="server.(*WebhookServer).runConversionWebhookListenAndServe.func1" file="server.go:195"
time="2024-03-25T12:08:39Z" level=info msg="Starting apiregistration.k8s.io/v1, Kind=APIService controller" func="controller.(*controller).run" file="controller.go:144"
time="2024-03-25T12:08:39Z" level=info msg="Updating TLS secret for longhorn-system/longhorn-webhook-tls (count: 2): map[listener.cattle.io/cn-longhorn-admission-webhook.longhor-59584d:longhorn-admission-webhook.longhorn-system.svc listener.cattle.io/cn-longhorn-conversion-webhook.longho-6a0089:longhorn-conversion-webhook.longhorn-system.svc listener.cattle.io/fingerprint:SHA1=1793FD59D7681D7D2D1AD1FFBC4582C1A9CC1941]" func="kubernetes.(*storage).saveInK8s" file="controller.go:225"
time="2024-03-25T12:08:39Z" level=info msg="Updating TLS secret for longhorn-system/longhorn-webhook-tls (count: 2): map[listener.cattle.io/cn-longhorn-admission-webhook.longhor-59584d:longhorn-admission-webhook.longhorn-system.svc listener.cattle.io/cn-longhorn-conversion-webhook.longho-6a0089:longhorn-conversion-webhook.longhorn-system.svc listener.cattle.io/fingerprint:SHA1=1793FD59D7681D7D2D1AD1FFBC4582C1A9CC1941]" func="kubernetes.(*storage).saveInK8s" file="controller.go:225"
time="2024-03-25T12:08:39Z" level=info msg="Building validation rules..." func="server.(*WebhookServer).runAdmissionWebhookListenAndServe.func1" file="server.go:113"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:nodes Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010ba380 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:settings Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc00056d320 OperationTypes:[CREATE UPDATE DELETE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:recurringjobs Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc000ade600 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:backingimages Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc000adea80 OperationTypes:[CREATE DELETE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:volumes Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010bc900 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:orphans Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc000a9edc0 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:snapshots Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010ba8c0 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:supportbundles Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010bae00 OperationTypes:[CREATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:systembackups Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010bb340 OperationTypes:[CREATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:systemrestores Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc000a9f1e0 OperationTypes:[CREATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:volumeattachments Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc001096500 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:engines Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010be600 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:replicas Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010ac840 OperationTypes:[CREATE UPDATE DELETE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=info msg="Building mutation rules..." func="server.(*WebhookServer).runAdmissionWebhookListenAndServe.func1" file="server.go:115"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:backups Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010c0500 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:backingimages Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc000adef00 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:backingimagemanagers Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010bb880 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:backingimagedatasources Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010c2000 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:nodes Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010bbdc0 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:volumes Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010bd680 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:engines Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010bef00 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:recurringjobs Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010c4000 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:engineimages Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc0010c63c0 OperationTypes:[CREATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:orphans Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc000db8840 OperationTypes:[CREATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:sharemanagers Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc000db8dc0 OperationTypes:[CREATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:backupvolumes Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc000b36400 OperationTypes:[CREATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:snapshots Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc000b38380 OperationTypes:[CREATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:replicas Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc000b3a580 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:supportbundles Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc000b388c0 OperationTypes:[CREATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:systembackups Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc000b38e00 OperationTypes:[CREATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="Add rule for {Name:volumeattachments Scope:Namespaced APIGroup:longhorn.io APIVersion:v1beta2 ObjectType:0xc000b1e500 OperationTypes:[CREATE UPDATE]}" func="server.(*WebhookServer).buildRules" file="server.go:249"
time="2024-03-25T12:08:39Z" level=debug msg="DesiredSet - No change(2) admissionregistration.k8s.io/v1, Kind=ValidatingWebhookConfiguration /longhorn-webhook-validator for  longhorn-system/longhorn-webhook-ca" func="apply.(*desiredSet).compareObjects" file="desiredset_compare.go:262"
time="2024-03-25T12:08:39Z" level=debug msg="DesiredSet - No change(2) admissionregistration.k8s.io/v1, Kind=MutatingWebhookConfiguration /longhorn-webhook-mutator for  longhorn-system/longhorn-webhook-ca" func="apply.(*desiredSet).compareObjects" file="desiredset_compare.go:262"
time="2024-03-25T12:08:40Z" level=info msg="Webhook admission is ready" func=webhook.StartWebhook file="webhook.go:56"
time="2024-03-25T12:08:40Z" level=warning msg="Started longhorn admission webhook server" func=webhook.StartWebhook file="webhook.go:74"
time="2024-03-25T12:08:40Z" level=info msg="Starting longhorn recovery-backend server" func=recovery_backend.StartRecoveryBackend file="recovery_backend.go:13"
time="2024-03-25T12:08:40Z" level=info msg="Started longhorn recovery-backend server" func=recovery_backend.StartRecoveryBackend file="recovery_backend.go:22"
W0325 12:08:40.804464       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
time="2024-03-25T12:08:40Z" level=info msg="Recovery-backend server is running at :9503" func="server.(*RecoveryBackendServer).ListenAndServe" file="server.go:36"
time="2024-03-25T12:08:40Z" level=info msg="Waiting for old Longhorn manager pods to be fully removed" func=upgrade.waitForOldLonghornManagersToBeFullyRemoved file="upgrade.go:276"
I0325 12:08:40.835990       1 leaderelection.go:250] attempting to acquire leader lease longhorn-system/longhorn-manager-upgrade-lock...
I0325 12:08:40.852161       1 leaderelection.go:260] successfully acquired lease longhorn-system/longhorn-manager-upgrade-lock
time="2024-03-25T12:08:40Z" level=info msg="Start upgrading" func=upgrade.upgrade.func1 file="upgrade.go:130"
time="2024-03-25T12:08:40Z" level=info msg="setting default-engine-image not found" func=v1beta1.CanUpgrade file="upgrade.go:81"
time="2024-03-25T12:08:40Z" level=error msg="Upgrade failed: upgrade API version failed: cannot create CRDAPIVersionSetting: Internal error occurred: failed calling webhook \"validator.longhorn.io\": failed to call webhook: Post \"https://longhorn-admission-webhook.longhorn-system.svc:9502/v1/webhook/validaton?timeout=10s\": tls: failed to verify certificate: x509: certificate signed by unknown authority" func=upgrade.upgrade.func1.1 file="upgrade.go:125"
time="2024-03-25T12:08:40Z" level=info msg="Upgrade leader lost: k8sn27" func=upgrade.upgrade.func2 file="upgrade.go:139"
time="2024-03-25T12:08:40Z" level=fatal msg="Error starting manager: upgrade API version failed: cannot create CRDAPIVersionSetting: Internal error occurred: failed calling webhook \"validator.longhorn.io\": failed to call webhook: Post \"https://longhorn-admission-webhook.longhorn-system.svc:9502/v1/webhook/validaton?timeout=10s\": tls: failed to verify certificate: x509: certificate signed by unknown authority" func=main.main.DaemonCmd.func3 file="daemon.go:84"
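
Most of the output above is informational. As a rough sketch, the failure can be isolated by keeping only the `level=error` and `level=fatal` lines; `filter_failures` is a hypothetical helper name:

```shell
# Hypothetical helper: keep only log lines whose level is error or fatal.
# Reads log lines on stdin and writes the matching lines to stdout.
filter_failures() {
  grep -E 'level=(error|fatal)'
}

# Usage:
#   kubectl logs -n longhorn-system longhorn-manager-mjlv4 | filter_failures
```

On the log above this should leave just the `Upgrade failed` error and the final fatal line, both pointing at the x509 `certificate signed by unknown authority` failure when calling the admission webhook.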

# kubectl describe -n longhorn-system  longhorn-ui-7c679b6d4d-2qlkn
error: the server doesn't have a resource type "longhorn-ui-7c679b6d4d-2qlkn"


# kubectl describe -n longhorn-system  longhorn-driver-deployer-65f7bb646-j2jlc
error: the server doesn't have a resource type "longhorn-driver-deployer-65f7bb646-j2jlc"


n00bsi commented on September 23, 2024

[root@k8sn22 ~]# kubectl get pod -l app=longhorn-manager -n longhorn-system
NAME                     READY   STATUS             RESTARTS      AGE
longhorn-manager-cpdh7   0/1     CrashLoopBackOff   5 (85s ago)   10m
longhorn-manager-rsk92   0/1     CrashLoopBackOff   5 (72s ago)   10m
longhorn-manager-tx4dk   0/1     CrashLoopBackOff   5 (74s ago)   10m
[root@k8sn22 ~]# kubectl logs longhorn-manager-rsk92 -n longhorn-syste
Error from server (NotFound): namespaces "longhorn-syste" not found
[root@k8sn22 ~]# kubectl logs longhorn-manager-rsk92 -n longhorn-system
W0419 04:44:50.483722       1 client_config.go:618] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0419 04:44:50.486829       1 shared_informer.go:311] Waiting for caches to sync for longhorn datastore
I0419 04:44:50.587331       1 shared_informer.go:318] Caches are synced for longhorn datastore
time="2024-04-19T04:44:50Z" level=info msg="Starting longhorn conversion webhook server" func=webhook.StartWebhook file="webhook.go:24"
time="2024-04-19T04:44:50Z" level=info msg="Waiting for conversion webhook to become ready" func=webhook.StartWebhook file="webhook.go:43"
time="2024-04-19T04:44:50Z" level=warning msg="Failed to check endpoint https://localhost:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://localhost:9501/v1/healthz\": dial tcp [::1]:9501: connect: connection refused"
time="2024-04-19T04:44:50Z" level=info msg="Active TLS secret longhorn-system/longhorn-webhook-tls (ver=182747) (count 1): map[listener.cattle.io/cn-longhorn-conversion-webhook.longho-6a0089:longhorn-conversion-webhook.longhorn-system.svc listener.cattle.io/fingerprint:SHA1=8E3C2417E88D8E36E73BFE5011757F19B5F77044]" func="memory.(*memory).Update" file="memory.go:42"
time="2024-04-19T04:44:50Z" level=info msg="Listening on :9501" func=server.ListenAndServe.func2 file="server.go:77"
time="2024-04-19T04:44:50Z" level=info msg="Starting apiregistration.k8s.io/v1, Kind=APIService controller" func="controller.(*controller).run" file="controller.go:144"
time="2024-04-19T04:44:50Z" level=info msg="Starting apiextensions.k8s.io/v1, Kind=CustomResourceDefinition controller" func="controller.(*controller).run" file="controller.go:144"
time="2024-04-19T04:44:50Z" level=info msg="Starting /v1, Kind=Secret controller" func="controller.(*controller).run" file="controller.go:144"
time="2024-04-19T04:44:50Z" level=info msg="Building conversion rules..." func="server.(*WebhookServer).runConversionWebhookListenAndServe.func1" file="server.go:195"
time="2024-04-19T04:44:50Z" level=info msg="Updating TLS secret for longhorn-system/longhorn-webhook-tls (count: 1): map[listener.cattle.io/cn-longhorn-conversion-webhook.longho-6a0089:longhorn-conversion-webhook.longhorn-system.svc listener.cattle.io/fingerprint:SHA1=8E3C2417E88D8E36E73BFE5011757F19B5F77044]" func="kubernetes.(*storage).saveInK8s" file="controller.go:225"
time="2024-04-19T04:44:52Z" level=info msg="Started longhorn conversion webhook server on localhost" func=webhook.StartWebhook file="webhook.go:47"
time="2024-04-19T04:44:53Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:44:56Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:44:59Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:02Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:05Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:08Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:11Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:14Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:17Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:20Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:23Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:26Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:29Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:32Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:35Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:38Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:41Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:44Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:47Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:50Z" level=warning msg="Failed to check endpoint https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz" func=webhook.isServiceAvailable file="webhook.go:78" error="Get \"https://longhorn-conversion-webhook.longhorn-system.svc:9501/v1/healthz\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
time="2024-04-19T04:45:52Z" level=fatal msg="Error starting manager: conversion webhook service is not accessible after 1m0s sec" func=main.main.DaemonCmd.func3 file="daemon.go:92"

[root@k8sn22 ~]# kubectl get service longhorn-backend -n longhorn-system
NAME               TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
longhorn-backend   ClusterIP   10.99.46.119   <none>        9500/TCP   14m
[root@k8sn22 ~]# kubectl get pod -l app=longhorn-manager -n longhorn-system
NAME                     READY   STATUS             RESTARTS      AGE
longhorn-manager-cpdh7   0/1     CrashLoopBackOff   6 (87s ago)   14m
longhorn-manager-rsk92   0/1     CrashLoopBackOff   6 (77s ago)   14m
longhorn-manager-tx4dk   0/1     CrashLoopBackOff   6 (77s ago)   14m


[root@k8sn22 ~]# kubectl get pod -n kube-system
NAME                                       READY   STATUS    RESTARTS       AGE
calico-kube-controllers-658d97c59c-ttxc4   1/1     Running   12 (19h ago)   36d
calico-node-5qzww                          1/1     Running   12 (19h ago)   36d
calico-node-669nj                          1/1     Running   12 (19h ago)   36d
calico-node-nj7jj                          1/1     Running   12 (19h ago)   36d
calico-node-sjjbc                          1/1     Running   12 (19h ago)   36d
calico-node-vq5jb                          1/1     Running   12 (19h ago)   36d
calico-node-xx9cl                          1/1     Running   12 (19h ago)   36d
coredns-76f75df574-98rm6                   1/1     Running   12 (19h ago)   36d
coredns-76f75df574-k5vgb                   1/1     Running   12 (19h ago)   36d
etcd-k8sn22                                1/1     Running   12 (19h ago)   36d
etcd-k8sn23                                1/1     Running   12 (19h ago)   36d
etcd-k8sn24                                1/1     Running   12 (19h ago)   36d
kube-apiserver-k8sn22                      1/1     Running   12 (19h ago)   36d
kube-apiserver-k8sn23                      1/1     Running   12 (19h ago)   36d
kube-apiserver-k8sn24                      1/1     Running   12 (19h ago)   36d
kube-controller-manager-k8sn22             1/1     Running   12 (19h ago)   36d
kube-controller-manager-k8sn23             1/1     Running   12 (19h ago)   36d
kube-controller-manager-k8sn24             1/1     Running   12 (19h ago)   36d
kube-proxy-277m8                           1/1     Running   12 (19h ago)   36d
kube-proxy-42sck                           1/1     Running   12 (19h ago)   36d
kube-proxy-97gc2                           1/1     Running   12 (19h ago)   36d
kube-proxy-c6h22                           1/1     Running   12 (19h ago)   36d
kube-proxy-fm7vj                           1/1     Running   12 (19h ago)   36d
kube-proxy-v75pg                           1/1     Running   12 (19h ago)   36d
kube-scheduler-k8sn22                      1/1     Running   12 (19h ago)   36d
kube-scheduler-k8sn23                      1/1     Running   12 (19h ago)   36d
kube-scheduler-k8sn24                      1/1     Running   12 (19h ago)   36d


[root@k8sn22 ~]# kubectl logs longhorn-ui-655b65f7f9-mdg5p -n longhorn-system
nginx: [emerg] host not found in upstream "longhorn-backend" in /var/config/nginx/nginx.conf:32

[root@k8sn22 ~]# kubectl -n longhorn-system describe svc longhorn-backend
Name:              longhorn-backend
Namespace:         longhorn-system
Labels:            app=longhorn-manager
                   app.kubernetes.io/instance=longhorn
                   app.kubernetes.io/name=longhorn
                   app.kubernetes.io/version=v1.6.1
Annotations:       <none>
Selector:          app=longhorn-manager
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                10.99.46.119
IPs:               10.99.46.119
Port:              manager  9500/TCP
TargetPort:        manager/TCP
Endpoints:         
Session Affinity:  None
Events:            <none>

[root@k8sn22 ~]# kubectl get svc -n longhorn-system
NAME                          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
longhorn-admission-webhook    ClusterIP   10.98.125.68     <none>        9502/TCP   23m
longhorn-backend              ClusterIP   10.99.46.119     <none>        9500/TCP   23m
longhorn-conversion-webhook   ClusterIP   10.106.104.244   <none>        9501/TCP   23m
longhorn-engine-manager       ClusterIP   None             <none>        <none>     23m
longhorn-frontend             ClusterIP   10.104.190.85    <none>        80/TCP     23m
longhorn-recovery-backend     ClusterIP   10.99.3.87       <none>        9503/TCP   23m
longhorn-replica-manager      ClusterIP   None             <none>        <none>     23m


[root@k8sn22 ~]# curl --noproxy '*' 10.98.125.68:9502
curl: (7) Failed to connect to 10.98.125.68 port 9502: Connection refused
[root@k8sn22 ~]# curl --noproxy '*' 10.106.104.244:9501
curl: (7) Failed to connect to 10.106.104.244 port 9501: Connection refused
[root@k8sn22 ~]# curl --noproxy '*' 10.104.190.85:80
curl: (7) Failed to connect to 10.104.190.85 port 80: No route to host
[root@k8sn22 ~]# curl --noproxy '*' 10.99.3.87:9503
curl: (7) Failed to connect to 10.99.3.87 port 9503: Connection refused
[root@k8sn22 ~]# curl --noproxy '*' 10.99.46.119:9500
curl: (7) Failed to connect to 10.99.46.119 port 9500: Connection refused



kubectl create -f https://k8s.io/examples/admin/dns/busybox.yaml

[root@k8sn22 ~]# kubectl exec -it busybox -n default -- sh
/ # 
/ # 
/ # 
/ # nslookup longhorn-conversion-webhook.longhorn-system.svc
Server:    10.96.0.10
Address 1: 10.96.0.10

nslookup: can't resolve 'longhorn-conversion-webhook.longhorn-system.svc'

from longhorn.

n00bsi avatar n00bsi commented on September 23, 2024

How do I set http_proxy correctly, so that downloading the YAML files works
but the proxy does not block communication between the pods?
Maybe this is why it will not run in my environment.
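
One common approach, sketched here under assumptions, is to keep `http_proxy`/`https_proxy` for external downloads and list the cluster-internal destinations in `no_proxy`. The service CIDR `10.96.0.0/12` (the kubeadm default, which matches the ClusterIPs shown above) and the pod CIDR `192.168.0.0/16` (a common Calico default) are assumptions, not values confirmed in this thread. `join_no_proxy` is a hypothetical helper. Not every client honors CIDR entries in `no_proxy`, so treat this as a starting point:

```shell
# Hypothetical helper: join all arguments into one comma-separated list.
join_no_proxy() {
  local IFS=,
  printf '%s\n' "$*"
}

# Assumed values; replace them with the CIDRs and domain of your cluster.
NO_PROXY_VALUE=$(join_no_proxy localhost 127.0.0.1 10.96.0.0/12 192.168.0.0/16 .svc .cluster.local)

# Set both spellings, because tools differ in which one they read.
export no_proxy="$NO_PROXY_VALUE"
export NO_PROXY="$NO_PROXY_VALUE"
echo "$NO_PROXY"
```

The node IPs usually belong in the list as well. The proxy settings of the container runtime are configured separately from the shell environment.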

from longhorn.

derekbit avatar derekbit commented on September 23, 2024

cc @mantissahz

from longhorn.

mantissahz avatar mantissahz commented on September 23, 2024

Hi @n00bsi,
Did you create the cluster in an air-gapped environment?

This is most likely a DNS issue. Can you follow the KB https://longhorn.io/kb/troubleshooting-dns-resolution-failed/ and check whether it solves the problem?

You might need to log in to a Longhorn pod to run the commands.
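
A minimal sketch of such a check, assuming the default cluster domain `cluster.local` (not confirmed in this thread) and a running pod that has `nslookup` available. `svc_fqdn` is a hypothetical helper and `<pod-name>` is a placeholder:

```shell
# Hypothetical helper: build the fully qualified DNS name of a Service.
# Assumes the default cluster domain "cluster.local".
svc_fqdn() {
  printf '%s.%s.svc.cluster.local\n' "$1" "$2"
}

# Print the names to test for some of the Longhorn services.
for svc in longhorn-backend longhorn-conversion-webhook longhorn-admission-webhook; do
  svc_fqdn "$svc" longhorn-system
done

# Then resolve each printed name from inside a running pod, for example:
#   kubectl -n longhorn-system exec -it <pod-name> -- nslookup longhorn-backend.longhorn-system.svc.cluster.local
```

A pod in CrashLoopBackOff cannot be used for `kubectl exec`, so a test pod in the same namespace may be needed instead.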

[root@k8sn22 ~]# curl --noproxy '*' 10.98.125.68:9502
curl: (7) Failed to connect to 10.98.125.68 port 9502: Connection refused
[root@k8sn22 ~]# curl --noproxy '*' 10.106.104.244:9501
curl: (7) Failed to connect to 10.106.104.244 port 9501: Connection refused
[root@k8sn22 ~]# curl --noproxy '*' 10.104.190.85:80
curl: (7) Failed to connect to 10.104.190.85 port 80: No route to host
[root@k8sn22 ~]# curl --noproxy '*' 10.99.3.87:9503
curl: (7) Failed to connect to 10.99.3.87 port 9503: Connection refused
[root@k8sn22 ~]# curl --noproxy '*' 10.99.46.119:9500
curl: (7) Failed to connect to 10.99.46.119 port 9500: Connection refused

from longhorn.

n00bsi avatar n00bsi commented on September 23, 2024

Did you create the cluster in an air-gapped environment?

It is behind a proxy.
I set the proxy on each node so that
Docker can download from the repos.

I could not fix the DNS issue.

I have a system running libvirt.
On it I installed 6 VMs,
and inside these VMs I installed my k8s cluster, plus VMs for the load balancer.

You might need to log in to a Longhorn pod to run the commands.

I will try tomorrow or on Thursday.

from longhorn.

github-actions avatar github-actions commented on September 23, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

from longhorn.

github-actions avatar github-actions commented on September 23, 2024

This issue was closed because it has been stalled for 5 days with no activity.

from longhorn.
