Comments (7)
After rebuilding the image, spdk_tgt starts correctly once the v2 data engine is enabled:
╰─$ kl logs instance-manager-c6c31c104ad17da991e0c9885616d84c -f
Enabling tgtd
Enabling spdk_tgt with options:
tgtd: work_timer_start(146) use timer_fd based scheduler
tgtd: bs_init(387) use signalfd notification
time="2024-02-27T15:20:04Z" level=info msg="Storing process logs at path: /var/log/instances" func=util.SetUpLogger file="log.go:63"
[longhorn-instance-manager] time="2024-02-27T15:20:04Z" level=warning msg="Failed to add TLS key pair from /tls-files/" func=cmd.start file="start.go:120" error="open /tls-files/tls.crt: no such file or directory"
[longhorn-instance-manager] time="2024-02-27T15:20:04Z" level=info msg="Creating gRPC server with no auth" func=cmd.start file="start.go:127"
[longhorn-instance-manager] time="2024-02-27T15:20:04Z" level=info msg="Disk Server: Creating SPDK client since SPDK is enabled" func=disk.NewServer file="disk.go:48"
[2024-02-27 15:20:04.046950] Starting SPDK v23.05-pre git sha1 69f656824 / DPDK 22.11.1 initialization...
[2024-02-27 15:20:04.047008] [ DPDK EAL parameters: spdk_tgt --no-shconf -c 0x1 --huge-unlink --log-level=lib.eal:6 --log-level=lib.cryptodev:5 --log-level=user1:6 --iova-mode=pa --base-virtaddr=0x200000000000 --match-allocations --file-prefix=spdk_pid12 ]
TELEMETRY: No legacy callbacks, legacy socket not created
[2024-02-27 15:20:04.160697] app.c: 738:spdk_app_start: *NOTICE*: Total cores available: 1
[2024-02-27 15:20:04.184774] reactor.c: 937:reactor_run: *NOTICE*: Reactor started on core 0
[2024-02-27 15:20:04.192480] accel_sw.c: 601:sw_accel_module_init: *NOTICE*: Accel framework software module initialized.
[longhorn-instance-manager] time="2024-02-27T15:20:05Z" level=info msg="Instance Manager disk gRPC server listening to 0.0.0.0:8502" func=cmd.start file="start.go:149"
[longhorn-instance-manager] time="2024-02-27T15:20:05Z" level=info msg="Instance Manager instance gRPC server listening to 0.0.0.0:8503" func=cmd.start file="start.go:164"
[longhorn-instance-manager] time="2024-02-27T15:20:05Z" level=info msg="Instance Manager proxy gRPC server listening to 0.0.0.0:8501" func=cmd.start file="start.go:178"
[longhorn-instance-manager] time="2024-02-27T15:20:05Z" level=info msg="Instance Manager process manager gRPC server listening to 0.0.0.0:8500" func=cmd.start file="start.go:193"
[longhorn-instance-manager] time="2024-02-27T15:20:05Z" level=info msg="Instance Manager SPDK gRPC server listening to 0.0.0.0:8504" func=cmd.start file="start.go:208"
from longhorn.
@derekbit, could you please help check this?
from longhorn.
For longhorn-manager, compare the changes between the GA candidate and RC4.
from longhorn.
From @derekbit: suspecting a package issue in the IM image.
2024-02-27T13:59:51.009397364Z Enabling tgtd
2024-02-27T13:59:51.009883651Z Enabling spdk_tgt with options:
2024-02-27T13:59:51.012943786Z spdk_tgt: error while loading shared libraries: libnuma.so.1: cannot open shared object file: No such file or directory
2024-02-27T13:59:51.014689467Z tgtd: work_timer_start(146) use timer_fd based scheduler
2024-02-27T13:59:51.014702672Z tgtd: bs_init(387) use signalfd notification
2024-02-27T13:59:51.039490192Z time="2024-02-27T13:59:51Z" level=info msg="Storing process logs at path: /var/log/instances" func=util.SetUpLogger file="log.go:63"
2024-02-27T13:59:51.039516390Z [longhorn-instance-manager] time="2024-02-27T13:59:51Z" level=warning msg="Failed to add TLS key pair from /tls-files/" func=cmd.start file="start.go:120" error="open /tls-files/tls.crt: no such file or directory"
2024-02-27T13:59:51.039532060Z [longhorn-instance-manager] time="2024-02-27T13:59:51Z" level=info msg="Creating gRPC server with no auth" func=cmd.start file="start.go:127"
2024-02-27T13:59:51.039683908Z [longhorn-instance-manager] time="2024-02-27T13:59:51Z" level=info msg="Disk Server: Creating SPDK client since SPDK is enabled" func=disk.NewServer file="disk.go:48"
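The failure above is easy to miss because the instance manager keeps starting after spdk_tgt exits. A minimal sketch for pulling the missing library names out of such a log follows. The function name `find_missing_libs` is hypothetical, and it only matches the loader message format shown in the log above.

```shell
# Read log text on stdin and print each shared library that the dynamic
# loader reported as missing, one per line, without duplicates.
# Only lines of the form
#   "... error while loading shared libraries: <lib>: cannot open shared object file ..."
# are matched; all other lines are ignored.
find_missing_libs() {
  sed -n 's/.*error while loading shared libraries: \([^:]*\): cannot open shared object file.*/\1/p' | sort -u
}

# Possible usage (the pod name is a placeholder):
#   kubectl logs <instance-manager-pod> | find_missing_libs
```

Fed the log above, this should print only the library from the loader error line.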
from longhorn.
The libnuma packages are missing from the 1.5.4 IM image. Rebuilding the 1.5.4 IM image to see whether that fixes the package issue.
cc @longhorn/qa
$ docker run -it --entrypoint /bin/bash longhornio/longhorn-instance-manager:v1.5.4
77b82b9e10d6:/ # zypper install -y libnuma-devel nasm
Refreshing service 'container-suseconnect-zypp'.
Loading repository data...
Reading installed packages...
Resolving package dependencies...
The following 3 NEW packages are going to be installed:
libnuma-devel libnuma1 nasm
The following 3 packages are not supported by their vendor:
libnuma-devel libnuma1 nasm
3 new packages to install.
Overall download size: 466.4 KiB. Already cached: 0 B. After the operation, additional 3.1 MiB will be used.
Continue? [y/n/v/...? shows all options] (y): y
Retrieving: libnuma1-2.0.14.20.g4ee5e0c-150400.1.24.x86_64 (SLE_BCI) (1/3), 31.8 KiB
Retrieving: libnuma1-2.0.14.20.g4ee5e0c-150400.1.24.x86_64.rpm ..........................................................................................................................................................................[done]
Retrieving: nasm-2.15.05-150500.1.2.x86_64 (SLE_BCI) (2/3), 406.6 KiB
Retrieving: nasm-2.15.05-150500.1.2.x86_64.rpm ..............................................................................................................................................................................[done (8.0 KiB/s)]
Retrieving: libnuma-devel-2.0.14.20.g4ee5e0c-150400.1.24.x86_64 (SLE_BCI) (3/3), 28.0 KiB
Retrieving: libnuma-devel-2.0.14.20.g4ee5e0c-150400.1.24.x86_64.rpm ........................................................................................................................................................[done (28.0 KiB/s)]
Checking for file conflicts: ............................................................................................................................................................................................................[done]
(1/3) Installing: libnuma1-2.0.14.20.g4ee5e0c-150400.1.24.x86_64 ........................................................................................................................................................................[done]
(2/3) Installing: nasm-2.15.05-150500.1.2.x86_64 ........................................................................................................................................................................................[done]
(3/3) Installing: libnuma-devel-2.0.14.20.g4ee5e0c-150400.1.24.x86_64 ...................................................................................................................................................................[done]
77b82b9e10d6:/ #
from longhorn.
@chriscchien, please help test the rebuilt IM 1.5.4 image. It has just become available. (Remember to remove the previously downloaded image first.)
➜ longhorn docker run -it --entrypoint /bin/bash longhornio/longhorn-instance-manager:v1.5.4
Unable to find image 'longhornio/longhorn-instance-manager:v1.5.4' locally
v1.5.4: Pulling from longhornio/longhorn-instance-manager
fc4ecc75782c: Already exists
39125a118324: Pull complete
911face13969: Pull complete
707b82a5bd0c: Pull complete
c48a458a3078: Pull complete
93961f07118a: Pull complete
aef6ae5b6908: Pull complete
d6e3a244660c: Pull complete
a13d0bfd9183: Pull complete
1f27396f6efc: Pull complete
fe556ec02776: Pull complete
Digest: sha256:3d4a1d839941bdd0cffa9b765c74984c1f5e677dce356c1f7eb815fcffaa0697
Status: Downloaded newer image for longhornio/longhorn-instance-manager:v1.5.4
16b70640f78b:/ # zypper se -i | grep libnuma
i+ | libnuma-devel | NUMA Policy Control | package
i | libnuma1
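Besides checking the installed packages, the binary itself can be checked for unresolved libraries. This is a rough sketch, assuming `ldd` is available in the image and `spdk_tgt` is on `PATH`. The helper name `unresolved_libs` is hypothetical.

```shell
# Read `ldd` output on stdin and print every library marked "not found".
# Exit status: 1 if at least one library is unresolved, 0 otherwise.
unresolved_libs() {
  awk '/=> not found/ { print $1; bad = 1 } END { exit bad + 0 }'
}

# Possible usage inside the container (assumes spdk_tgt is on PATH):
#   ldd "$(command -v spdk_tgt)" | unresolved_libs
```

An empty result with exit status 0 would suggest that the loader can resolve every dependency, including libnuma.so.1.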
from longhorn.
Closing this, as @derekbit has already verified the rebuilt image. It appears to be an intermittent build issue with the image's package manager.
@chriscchien, please reopen if this still blocks our final testing of 1.5.4.
from longhorn.