Since the nfs example in <a href="https://github.com/kubernetes/examples/tree/mast


How to debug timeout after 2 min for trying to attach the volume

kubernetes-csi commented on September 15, 2024

How to debug timeout after 2 min for trying to attach the volume

from csi-driver-nfs.

Comments (9)

msau42 commented on September 15, 2024

If you created the PVC/PV before starting the CSI driver, you may be hitting the issue above.

Can you try recreating the Pod and PVC/PV?


ensonic commented on September 15, 2024

It's indeed the ordering, but now I run into the same problem as with the in-tree NFS:

MountVolume.SetUp failed for volume "data" : kubernetes.io/csi: mounter.SetupAt failed: rpc error: code = Internal desc = mount failed: exit status 32
    Mounting command: mount
    Mounting arguments: -t nfs nfs-server.app-data.svc.cluster.local:/exports /var/lib/kubelet/pods/93e33295-23bb-43a7-96ce-aac12161ca2e/volumes/kubernetes.io~csi/data/mount
    Output: mount.nfs: Failed to resolve server nfs-server.app-data.svc.cluster.local: Name or service not known


ensonic commented on September 15, 2024

I've split my Helm chart into two: one that creates the CSI plugin (everything from https://github.com/kubernetes-csi/csi-driver-nfs/tree/master/deploy/kubernetes plus the NFS server) and a second that deploys a set of clients plus the PV and PVC. That kind of works, except that the CSI plugin also cannot resolve an in-cluster service name (nfs-server.nfs-server.svc.cluster.local).

I will check how the CSI plugin works and how it does the mount. Maybe it could attempt to resolve the name first, before building the mount command line? Like here:
https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/pkg/nfs/nodeserver.go#L61
Inside the k8s mount utils this is probably too late:

csi-driver-nfs/vendor/k8s.io/kubernetes/pkg/util/mount/exec_mount.go, line 46 in 1bb6a86:

    func (m *execMounter) Mount(source string, target string, fstype string, options []string) error {

It does work now, but it has a huge negative impact on pod start/stop. A pod that used to be up and running after 20 sec now takes >100 sec. From looking at the kubelet logs, I would blame large reconcile intervals. I will need to dig through the code for that too.


msau42 commented on September 15, 2024

That's very odd. If you look at csi driver logs, how long does the NodePublishVolume call take?


ensonic commented on September 15, 2024

The daemonset has 2 containers: node-driver-registrar and nfs. Neither logs anything during the mount operation. The node-driver-registrar already runs with `--v=5`, but there is not a lot of logging in the code.

In the kubelet logs (sudo journalctl --follow -u kubelet.service) I see a bunch of

nestedpendingoperations.go:270] Operation for "\"kubernetes.io/csi/nfs.csi.k8s.io^data\" (\"009c7766-24cf-4664-8bc6-3b4ca55c1b76\")" failed. No retries permitted until 2020-07-28 10:53:33.034869159 +0200 CEST m=+584315.996920652 (durationBeforeRetry 2m2s). Error: "UnmountVolume.TearDown failed for volume \"data\" (UniqueName: \"kubernetes.io/csi/nfs.csi.k8s.io^data\") pod \"009c7766-24cf-4664-8bc6-3b4ca55c1b76\" (UID: \"009c7766-24cf-4664-8bc6-3b4ca55c1b76\") : kubernetes.io/csi: mounter.TearDownAt failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded"

nestedpendingoperations.go:270] Operation for "\"kubernetes.io/csi/nfs.csi.k8s.io^data\" (\"9ced9b7f-3fd0-4c08-a10b-72721443afb1\")" failed. No retries permitted until 2020-07-28 10:53:33.035080261 +0200 CEST m=+584315.997131777 (durationBeforeRetry 2m2s). Error: "UnmountVolume.TearDown failed for volume \"data\" (UniqueName: \"kubernetes.io/csi/nfs.csi.k8s.io^data\") pod \"9ced9b7f-3fd0-4c08-a10b-72721443afb1\" (UID: \"9ced9b7f-3fd0-4c08-a10b-72721443afb1\") : kubernetes.io/csi: mounter.TearDownAt failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded"

goroutinemap.go:150] Operation for "/var/lib/kubelet/plugins/csi-nfsplugin/csi.sock" failed. No retries permitted until 2020-07-28 10:54:01.164623738 +0200 CEST m=+584344.126675285 (durationBeforeRetry 2m2s). Error: "RegisterPlugin error -- failed to get plugin info using RPC GetInfo at socket /var/lib/kubelet/plugins/csi-nfsplugin/csi.sock, err: rpc error: code = Unimplemented desc = unknown service pluginregistration.Registration"

No idea if that is expected.


fejta-bot commented on September 15, 2024

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale


fejta-bot commented on September 15, 2024

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten


fejta-bot commented on September 15, 2024

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close


k8s-ci-robot commented on September 15, 2024

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
