Comments (18)

chriscchien commented on July 23, 2024

@chriscchien Does this also happen in v1.6.x and v1.5.x?

This issue cannot be reproduced on v1.5.5 or v1.6.2.

from longhorn.

derekbit commented on July 23, 2024

Uninstallation? If the backup target is valid, it won't trigger the issue.

ah, typo..

Whether the backup target is valid or invalid, the uninstall goes through the UpdateBackupTarget call below. Why is the issue triggered only in the invalid case? Is there anything I missed?

/controller/uninstall_controller.go#L322-L328

		} else if len(backupTargets) > 0 {
			for _, bt := range backupTargets {
				if _, err = c.ds.UpdateBackupTarget(bt); err != nil {
					return errors.Wrap(err, "failed to touch the backup target CR for API version migration")
				}
			}
		}

https://github.com/longhorn/longhorn-manager/blob/master/controller/backup_target_controller.go#L383-L391
It is due to the frequent update of an invalid backup target.
Although the error message is the same, a different timestamp always leads to an update.

innobead commented on July 23, 2024

@chriscchien Does this also happen in v1.6.x and v1.5.x?

This issue cannot be reproduced on v1.5.5 or v1.6.2.

We should backport this, as longhorn/longhorn-manager#2812 was backported to 1.6.2 and 1.5.6 (unreleased) already?

derekbit commented on July 23, 2024

cc @mantissahz

ejweber commented on July 23, 2024

Whether the backup target is valid or invalid, the uninstall goes through the UpdateBackupTarget call below. Why is the issue triggered only in the invalid case? Is there anything I missed?

I think it is very unlikely for there to be a conflict in the case of a valid BackupTarget. But for the invalid case, #8224 causes frequent updates, so a conflict is quite likely.

derekbit commented on July 23, 2024

@chriscchien Does this also happen in v1.6.x and v1.5.x?

mantissahz commented on July 23, 2024

From a quick scan of the longhorn-manager logs in the support bundle, within a single second there are many entries like this:

2024-06-20T04:18:44.848117979Z time="2024-06-20T04:18:44Z" level=error msg="Failed to get info from backup store" func="controller.(*BackupTargetController).reconcile" file="backup_target_controller.go:389" controller=longhorn-backup-target cred= error="failed to list backup volumes in nfs://longhorn-test-nfs-svc.defsdfsfdault:/opt/backupstore: error listing backup volume names: failed to execute: /var/lib/longhorn/engine-binaries/longhornio-longhorn-engine-master-head/longhorn [/var/lib/longhorn/engine-binaries/longhornio-longhorn-engine-master-head/longhorn backup ls --volume-only nfs://longhorn-test-nfs-svc.defsdfsfdault:/opt/backupstore], output cannot mount nfs longhorn-test-nfs-svc.defsdfsfdault:/opt/backupstore, options [nfsvers=4.0 actimeo=1 soft timeo=300 retry=2]: vers=4.0: mount failed: exit status 32\nMounting command: mount\nMounting arguments: -t nfs4 -o nfsvers=4.0,actimeo=1,soft,timeo=300,retry=2 longhorn-test-nfs-svc.defsdfsfdault:/opt/backupstore /var/lib/longhorn-backupstore-mounts/longhorn-test-nfs-svc_defsdfsfdault/opt/backupstore\nOutput: mount.nfs4: Failed to resolve server longhorn-test-nfs-svc.defsdfsfdault: Name or service not known\n: vers=4.1: mount failed: exit status 32\nMounting command: mount\nMounting arguments: -t nfs4 -o nfsvers=4.1,actimeo=1,soft,timeo=300,retry=2 longhorn-test-nfs-svc.defsdfsfdault:/opt/backupstore /var/lib/longhorn-backupstore-mounts/longhorn-test-nfs-svc_defsdfsfdault/opt/backupstore\nOutput: mount.nfs4: Failed to resolve server longhorn-test-nfs-svc.defsdfsfdault: Name or service not known\n: vers=4.2: mount failed: exit status 32\nMounting command: mount\nMounting arguments: -t nfs4 -o nfsvers=4.2,actimeo=1,soft,timeo=300,retry=2 longhorn-test-nfs-svc.defsdfsfdault:/opt/backupstore /var/lib/longhorn-backupstore-mounts/longhorn-test-nfs-svc_defsdfsfdault/opt/backupstore\nOutput: mount.nfs4: Failed to resolve server longhorn-test-nfs-svc.defsdfsfdault: Name or service not known\n: cannot mount using NFSv4\n, 
stderr warning: GOCOVERDIR not set, no coverage data emitted\ntime=\"2024-06-20T04:18:44Z\" level=warning msg=\"Trying reading mount point /var/lib/longhorn-backupstore-mounts/longhorn-test-nfs-svc_defsdfsfdault/opt/backupstore to make sure it is healthy\" func=util.EnsureMountPoint file=\"util.go:309\" pkg=nfs\ntime=\"2024-06-20T04:18:44Z\" l
...

Frequently updating the backup target status blocks the uninstall procedure.
The controller should poll the backup target status only once per pollInterval.

innobead commented on July 23, 2024

What is the poll interval set to that causes this frequent update? The original value should be 300 seconds.

Would the workaround be to disable polling by setting it to 0?

mantissahz commented on July 23, 2024

I think the frequent update happens because the error messages are not identical (they include a timestamp, as in E0620 04:18:45.681625 20213 mount_linux.go:236]).

The workaround would be to empty the backup target URL first.
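
For illustration only, one way to keep a changing timestamp from counting as a status change is to normalize the message before comparing it. This is a hypothetical sketch, not the fix that was merged; normalize, statusChanged, and the regular expression are assumptions made for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// klogHeader matches a klog-style line header such as
// "E0620 04:18:45.681625 20213 mount_linux.go:236]".
var klogHeader = regexp.MustCompile(`[IWEF]\d{4} \d{2}:\d{2}:\d{2}\.\d+\s+\d+ \S+:\d+\]`)

// normalize replaces every klog-style header with a fixed placeholder
// so that two messages differing only in timestamps compare equal.
func normalize(msg string) string {
	return klogHeader.ReplaceAllString(msg, "<log header>")
}

// statusChanged reports whether two error messages differ after normalization.
func statusChanged(oldMsg, newMsg string) bool {
	return normalize(oldMsg) != normalize(newMsg)
}

func main() {
	a := "E0620 04:18:45.681625 20213 mount_linux.go:236] Mount failed: exit status 32"
	b := "E0620 04:18:46.002113 20213 mount_linux.go:236] Mount failed: exit status 32"

	fmt.Println(statusChanged(a, b))                  // prints false
	fmt.Println(statusChanged(a, "a different error")) // prints true
}
```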

derekbit commented on July 23, 2024
" controller=longhorn-uninstall error="failed to touch the backup target CR for API version migration: Operation cannot be fulfilled on backuptargets.longhorn.io \"default\": the object has been modified; please apply your changes to the latest version and try again"

If the update fails with "the object has been modified", the failure can be ignored. The purpose of the touch (update) is to trigger version migration. The error "the object has been modified" indicates that the resource has already been updated, so it should already be migrated.

longhorn-io-github-bot commented on July 23, 2024

Pre Ready-For-Testing Checklist

  • Where are the reproduce steps/test steps documented?
    The reproduce steps/test steps are:
      1. Set up an invalid backup target.
      2. Uninstall Longhorn.
  • Is there a workaround for the issue? If so, where is it documented?
    The workaround is:
      • Empty the backup target URL.
  • Has the backend code been merged (Manager, Engine, Instance Manager, BackupStore, etc.) (including backport-needed/*)?
    The PR is at
    longhorn/longhorn-manager#2897
  • Which areas/issues might this PR have potential impacts on?
    Area
    Issues

ejweber commented on July 23, 2024

I think the frequent update happens because the error messages are not identical (they include a timestamp, as in E0620 04:18:45.681625 20213 mount_linux.go:236]).

Related to #8224.

innobead commented on July 23, 2024
" controller=longhorn-uninstall error="failed to touch the backup target CR for API version migration: Operation cannot be fulfilled on backuptargets.longhorn.io \"default\": the object has been modified; please apply your changes to the latest version and try again"

Does this mean that if Longhorn is installed from the master branch (w/o the fix) with a valid/invalid backup target configured, the installation will always fail?

derekbit commented on July 23, 2024
" controller=longhorn-uninstall error="failed to touch the backup target CR for API version migration: Operation cannot be fulfilled on backuptargets.longhorn.io \"default\": the object has been modified; please apply your changes to the latest version and try again"

Does this mean that if Longhorn is installed from the master branch (w/o the fix) with a valid/invalid backup target configured, the installation will always fail?

Uninstallation? If the backup target is valid, it won't trigger the issue.

innobead commented on July 23, 2024

Uninstallation? If the backup target is valid, it won't trigger the issue.

ah, typo..

Whether the backup target is valid or invalid, the uninstall goes through the UpdateBackupTarget call below. Why is the issue triggered only in the invalid case? Is there anything I missed?

/controller/uninstall_controller.go#L322-L328

		} else if len(backupTargets) > 0 {
			for _, bt := range backupTargets {
				if _, err = c.ds.UpdateBackupTarget(bt); err != nil {
					return errors.Wrap(err, "failed to touch the backup target CR for API version migration")
				}
			}
		}

derekbit commented on July 23, 2024

Uninstallation? If the backup target is valid, it won't trigger the issue.

ah, typo..

Whether the backup target is valid or invalid, the uninstall goes through the UpdateBackupTarget call below. Why is the issue triggered only in the invalid case? Is there anything I missed?

/controller/uninstall_controller.go#L322-L328

		} else if len(backupTargets) > 0 {
			for _, bt := range backupTargets {
				if _, err = c.ds.UpdateBackupTarget(bt); err != nil {
					return errors.Wrap(err, "failed to touch the backup target CR for API version migration")
				}
			}
		}

https://github.com/longhorn/longhorn-manager/blob/master/controller/backup_target_controller.go#L394-L397
It is due to the frequent update of an invalid backup target.
Although the error message is the same, a different timestamp always leads to an update.

innobead commented on July 23, 2024

Well explained @derekbit @ejweber

chriscchien commented on July 23, 2024

Verified as passing on Longhorn master (longhorn-manager b19161) with the test steps.

Uninstallation succeeds when an invalid backup target is set.
