We are experiencing crashes in the collector containers on all nodes of one of our OpenShift clusters.
Collector Version: 3.9.0
OS: Red Hat Enterprise Linux CoreOS 49.84.202205050701-0 (Ootpa)
Kernel Version: 4.18.0-305.45.1.el8_4.x86_64
Starting StackRox Collector...
[I 20220926 112218 HostInfo.cpp:126] Hostname: '<redacted>'
[I 20220926 112218 CollectorConfig.cpp:119] User configured logLevel=debug
[I 20220926 112218 CollectorConfig.cpp:149] User configured collection-method=kernel_module
[I 20220926 112218 CollectorConfig.cpp:206] Afterglow is enabled
[D 20220926 112218 HostInfo.cpp:200] EFI directory exist, UEFI boot mode
[D 20220926 112218 HostInfo.h:100] identified kernel release: '4.18.0-305.45.1.el8_4.x86_64'
[D 20220926 112218 HostInfo.h:101] identified kernel version: '#1 SMP Wed Apr 6 13:48:37 EDT 2022'
[D 20220926 112218 HostInfo.cpp:297] SecureBoot status is 2
[D 20220926 112218 collector.cpp:254] Core dump not enabled
[I 20220926 112218 collector.cpp:302] Module version: 2.0.1
[I 20220926 112218 collector.cpp:329] Attempting to download kernel module - Candidate kernel versions:
[I 20220926 112218 collector.cpp:331] 4.18.0-305.45.1.el8_4.x86_64
[D 20220926 112218 GetKernelObject.cpp:148] Checking for existence of /kernel-modules/collector-4.18.0-305.45.1.el8_4.x86_64.ko.gz and /kernel-modules/collector-4.18.0-305.45.1.el8_4.x86_64.ko
[D 20220926 112218 GetKernelObject.cpp:151] Found existing compressed kernel object.
[I 20220926 112218 collector.cpp:262]
[I 20220926 112218 collector.cpp:263] This product uses kernel module and ebpf subcomponents licensed under the GNU
[I 20220926 112218 collector.cpp:264] GENERAL PURPOSE LICENSE Version 2 outlined in the /kernel-modules/LICENSE file.
[I 20220926 112218 collector.cpp:265] Source code for the kernel module and ebpf subcomponents is available upon
[I 20220926 112218 collector.cpp:266] request by contacting [email protected].
[I 20220926 112218 collector.cpp:267]
[I 20220926 112218 collector.cpp:162] Inserting kernel module /module/collector.ko with indefinite removal and retry if required.
[D 20220926 112218 collector.cpp:109] Kernel module arguments: s_syscallIds=26,27,56,57,246,247,248,249,94,95,14,15,156,157,216,217,222,223,4,5,22,23,12,13,154,155,172,173,214,215,230,231,282,283,288,289,292,293,96,97,182,183,218,219,224,225,16,186,234,194,195,192,193,200,201,198,199,36,37,18,19,184,185,220,221,226,227,-1 verbose=0 exclude_selfns=1 exclude_initns=1
[I 20220926 112218 collector.cpp:183] Done inserting kernel module /module/collector.ko.
[I 20220926 112218 collector.cpp:215] gRPC server=sensor.mcs-security.svc:443
[I 20220926 112218 CollectorService.cpp:50] Config: collection_method:kernel_module, useChiselCache:1, snapLen:0, scrape_interval:30, turn_off_scrape:0, hostname:<redacted>, logLevel:DEBUG
[I 20220926 112218 CollectorService.cpp:79] Network scrape interval set to 30 seconds
[I 20220926 112218 CollectorService.cpp:82] Waiting for GRPC server to become ready ...
[I 20220926 112218 CollectorService.cpp:87] GRPC server connectivity is successful
[D 20220926 112218 ConnTracker.cpp:314] ignored l4 protocol and port pairs
[D 20220926 112218 ConnTracker.cpp:316] udp/9
[I 20220926 112218 NetworkStatusNotifier.cpp:187] Started network status notifier.
[I 20220926 112218 NetworkStatusNotifier.cpp:203] Established network connection info stream.
[D 20220926 112218 SysdigService.cpp:262] Updating chisel and flushing chisel cache
[D 20220926 112218 SysdigService.cpp:263] New chisel:
args = {}
function on_event()
return true
end
function on_init()
filter = "not container.id = 'host'\n"
chisel.set_filter(filter)
return true
end
[I 20220926 112218 SignalServiceClient.cpp:43] Trying to establish GRPC stream for signals ...
[I 20220926 112218 SignalServiceClient.cpp:61] Successfully established GRPC stream for signals.
[D 20220926 112219 ConnScraper.cpp:406] Could not open process directory 1626873: No such file or directory
[D 20220926 112219 ConnScraper.cpp:406] Could not open process directory 1626877: No such file or directory
[W 20220926 112219 ProtoAllocator.h:41] Allocating a memory block on the heap for the arena, this is inefficient and usually avoidable
collector[0x44746d]
/lib64/libc.so.6(+0x4eb20)[0x7f8425ceeb20]
Caught signal 11 (SIGSEGV): Segmentation fault
/bootstrap.sh: line 94: 11 Segmentation fault (core dumped) eval exec "$@"
Collector kernel module has already been loaded.
Removing so that collector can insert it at startup.
I am not sure how to debug this, as every container in the collector DaemonSet hits the same problem.
We are using StackRox 3.71.0. I have tried collector images 3.9.0 and 3.11.0. Please reach out if any information is missing.