Comments (7)
Hello @roger-ryao, can you please confirm that you also ran longhorn/longhorn-manager#2685 (comment) to test this fix? Sorry for the confusion. I do think longhorn/longhorn-manager#2432 (comment) should be run again (as you have done), but the former are the primary test steps.
Moving this back to `Ready for Testing`. Please feel free to close it again directly if you have already run these steps.
from longhorn.
This is correct. The necessary changes did not make it to longhorn-manager in https://github.com/longhorn/longhorn-manager/commits/v1.6.1-rc1. They are only in https://github.com/longhorn/longhorn-manager/commits/v1.6.x/.
cc @roger-ryao
from longhorn.
ref: #8167
cc @ejweber
from longhorn.
Pre Ready-For-Testing Checklist
- Where is the reproduce steps/test steps documented?
  The reproduce steps/test steps are at: longhorn/longhorn-manager#2685 (comment).
  Run the following test again: longhorn/longhorn-manager#2432 (comment).
- Is there a workaround for the issue? If so, where is it documented?
  @krohrsb managed to mitigate it by increasing the capability of his API server (#8114 (comment)), but there is nothing a user can do to tweak Longhorn to avoid the issue.
- Does the PR include the explanation for the fix or the feature?
- Does the PR include deployment change (YAML/Chart)? If so, where are the PRs for both YAML file and Chart?
  The PR for the YAML change is at: #8167.
  The PR for the chart change is at: #8167.
- Have the backend code been merged (Manager, Engine, Instance Manager, BackupStore etc) (including `backport-needed/*`)?
  The PR is at: longhorn/longhorn-manager#2694.
from longhorn.
Verified on v1.6.1-rc1 20240314
- longhorn `v1.6.1-rc1` 6e96e27
The test steps
longhorn/longhorn-manager#2432 (comment)
#7425 (comment)
Result Passed
- If you wait any length of time, `ReplicaA` is never cleaned up.
- After deleting `ReplicaB` and waiting for the volume to rebuild from `ReplicaA`, the data is consistent.
from longhorn.
Verified on v1.6.1-rc1 20240318
- longhorn `v1.6.1-rc1` 6e96e27
The test steps
longhorn/longhorn-manager#2685 (comment)
There are four cases to verify:
- Ensure that the upgrade correctly populates `replicaTransitionTimeMap`.
- Confirm that the rebuild operation correctly populates `replicaTransitionTimeMap` and `lastHealthyAt` after replica deletion.
- Validate that the rebuild operation correctly populates `replicaTransitionTimeMap` and `lastHealthyAt` after replica failure.
- Ensure that replicas are not updated unnecessarily when the cluster is stable.
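For the first three cases, a quick way to spot a missing entry is to compare the two maps on an engine object. The snippet below is only a sketch: it assumes that `replicaTransitionTimeMap` sits next to `replicaModeMap` under the engine's `status`, which this thread does not state, and the helper name, replica names, and sample object are hypothetical. In practice the dictionary would come from the engine resource fetched as JSON.

```python
from __future__ import annotations

from typing import Any


def missing_transition_times(engine: dict[str, Any]) -> list[str]:
    """Return replicas in replicaModeMap that have no transition time recorded.

    Assumption: both maps live under engine["status"] and are keyed by
    replica name. Adjust the paths if the fields are located elsewhere.
    """
    status = engine.get("status") or {}
    modes = status.get("replicaModeMap") or {}
    times = status.get("replicaTransitionTimeMap") or {}
    # A missing key and an empty timestamp are both treated as "not populated".
    return sorted(name for name in modes if not times.get(name))


# Hypothetical engine object, trimmed to the fields of interest.
engine = {
    "status": {
        "replicaModeMap": {"vol-r-aaa": "RW", "vol-r-bbb": "RW"},
        "replicaTransitionTimeMap": {"vol-r-aaa": "2024-03-19T02:00:00Z"},
    }
}

print(missing_transition_times(engine))  # ['vol-r-bbb']
```

An empty result after an upgrade or a rebuild would suggest that every replica known to the engine has a transition time recorded.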
Result
- We observed that the `replicaTransitionTimeMap` is not visible on `v1.6.1-rc1`, but the test passed on `master-head`.
- As the attached screenshot shows, PUT requests to replica resources persist on `v1.6.1-rc1`. Additionally, after updating Longhorn from `v1.6.1-rc1` to `master-head`, no PUT requests to replica resources were observed.
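The PUT-request check can also be done without a screenshot if API server audit logging is enabled. The sketch below assumes JSON-lines audit output in the `audit.k8s.io/v1` event format, where a PUT is recorded with the verb `update`; how the requests were actually observed in this test is not stated, and the function name and sample events are hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def count_replica_updates(audit_lines: Iterable[str]) -> Counter:
    """Count completed update (PUT) requests per Longhorn replica.

    Only events in the ResponseComplete stage are counted, so a request
    that is logged at several stages is not counted more than once.
    """
    counts: Counter = Counter()
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not audit events
        if not isinstance(event, dict):
            continue
        ref = event.get("objectRef") or {}
        if (
            event.get("verb") == "update"
            and event.get("stage") == "ResponseComplete"
            and ref.get("apiGroup") == "longhorn.io"
            and ref.get("resource") == "replicas"
        ):
            counts[ref.get("name", "")] += 1
    return counts


# Hypothetical audit events for illustration.
sample = [
    '{"verb": "update", "stage": "ResponseComplete", "objectRef": '
    '{"apiGroup": "longhorn.io", "resource": "replicas", "name": "vol-r-aaa"}}',
    '{"verb": "get", "stage": "ResponseComplete", "objectRef": '
    '{"apiGroup": "longhorn.io", "resource": "replicas", "name": "vol-r-aaa"}}',
    "not json",
]

print(count_replica_updates(sample))  # Counter({'vol-r-aaa': 1})
```

On a stable cluster running the fix, the counter would be expected to stay empty over the observation window.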
cc. @ejweber
from longhorn.
Verified on v1.6.x-head 20240319
- longhorn `v1.6.x-head` 6e96e27
- longhorn-manager `v1.6.x-head` https://github.com/longhorn//longhorn-manager/commit/0dd766f7376b13bc6ecc1c94f6b7a311d72e84e8
Result Passed
1. We observed that the `replicaTransitionTimeMap` is visible on `v1.6.x-head`.
2. The rebuild operation correctly populates `replicaTransitionTimeMap` and `lastHealthyAt` after replica deletion.
3. The rebuild operation correctly populates `replicaTransitionTimeMap` and `lastHealthyAt` after replica failure.
4. After updating Longhorn from `v1.6.1-rc1` to `v1.6.x-head`, no PUT requests to replica resources were observed.
from longhorn.