Comments (8)
After some digging in the code, I think I might have found the reason for this behaviour.
On every heartbeat/write call, a new socket is created and connect(dest_addr) is called, which performs the TLS handshake.
After that, establish_connection is called to run the hello/ping/pong protocol.
It seems that if the connection between fluentd and the destination server drops after the TLS handshake, by the time we reach establish_connection, fluentd neither detects the drop nor times out the socket. As far as I can tell, it gets stuck in this loop, where the socket is still considered up but will never contain any data.
I would expect the socket to time out if it fails to establish the connection after some time.
Perhaps using IO.select.
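To illustrate the IO.select idea, here is a minimal sketch of a read that gives up after a deadline instead of retrying forever. It is not fluentd's code: `read_with_timeout` and `ReadTimeout` are hypothetical names, and the demo uses a plain local socket pair rather than TLS. With an OpenSSL::SSL::SSLSocket, `read_nonblock` can also ask to wait for writability, which this sketch does not handle.

```ruby
require "socket"

# Hypothetical error class, used only by this sketch.
class ReadTimeout < StandardError; end

# Hypothetical helper (not part of fluentd): read up to max_bytes from io,
# raising ReadTimeout if nothing arrives within timeout_sec seconds.
def read_with_timeout(io, max_bytes, timeout_sec)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout_sec
  loop do
    result = io.read_nonblock(max_bytes, exception: false)
    case result
    when :wait_readable
      remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
      # IO.select returns nil when the timeout expires with nothing readable.
      if remaining <= 0 || IO.select([io], nil, nil, remaining).nil?
        raise ReadTimeout, "no data within #{timeout_sec}s"
      end
    when nil
      raise EOFError, "peer closed the connection"
    else
      return result
    end
  end
end

# Demo: the peer stays silent, so the read times out instead of looping.
reader, writer = UNIXSocket.pair
begin
  read_with_timeout(reader, 512, 0.2)
rescue ReadTimeout => e
  puts "timed out: #{e.message}"
end
```

The deadline is tracked with a monotonic clock so that repeated wake-ups shorten the remaining wait rather than restarting it.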
Thanks for your report. We may need to do some digging.
We need a simple way to reproduce the problem.
from fluentd.
We're encountering a similar issue while distributing 80,000 events per second to two separate systems. Have there been any updates or workarounds identified for this?
> We're encountering a similar issue while distributing 80,000 events per second to two separate systems. Have there been any updates or workarounds identified for this?
No.
I have not made time to look into this issue in detail.
I'll see if I can reproduce it.
I'd be glad to receive any information that could help us reproduce the issue.
Yes...
With the below configurations for forwarder and aggregator:
Both of these are in a Multi Process Workers environment, with 4 workers on each node.
<match udp.input.**>
  @type forward
  require_ack_response true
  heartbeat_type udp
  <buffer>
    @type memory
    flush_interval 1s
    flush_thread_count 10
    chunk_limit_size 50m
    queue_limit_length 500
    chunk_limit_size 100m
    overflow_action drop_oldest_chunk
    retry_max_interval 10m
    retry_forever true
    delayed_commit_timeout 100
  </buffer>
  <server>
    host 10.10.1.2
    port 24224
    weight 60
  </server>
  <server>
    host 10.10.1.3
    port 24224
    weight 60
  </server>
</match>

<source>
  @type forward
  port 24224
  bind 0.0.0.0
  tag udp.forward
</source>
It appears that when distributing events, Node 1 receives events via UDP and then shares them with the other nodes using the forwarder plugin. However, after approximately 10 seconds, fluentd enters a stalled state in which it no longer accepts new incoming events and only sends heartbeats. I haven't seen any error logs indicating this behaviour.
After checking the network calls with strace, this is what we suspect:
[pid 54679] recvfrom(13, 0x7f70df800000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70df800000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70df800000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70df800000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70df800000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70df800000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70df800000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70df800000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70df800000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70df800000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70df800000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, "{\"reportType\": \"sslsession\", \"so"..., 10485760, 0, {sa_family=AF_INET, sin_port=htons(38897), sin_addr=inet_addr("10.10.1.5")}, [2048 => 16]) = 59168
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54676] recvfrom(10, "\201\243ack\271YUgPCpFf8KyAOu0reJa0wg==\n", 512, 0, 0x7f70e6efd790, [2048 => 0]) = 31
[pid 54676] shutdown(10, SHUT_WR) = 0
[pid 54669] socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_TCP) = 10
[pid 54669] connect(10, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.3")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 54669] getsockopt(10, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
[pid 54669] getsockopt(10, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
[pid 54669] getsockname(10, {sa_family=AF_INET, sin_port=htons(51956), sin_addr=inet_addr("10.10.1.1")}, [2048 => 16]) = 0
[pid 54669] setsockopt(10, SOL_SOCKET, SO_LINGER, {l_onoff=1, l_linger=60}, 8) = 0
[pid 54669] setsockopt(10, SOL_SOCKET, SO_RCVTIMEO_OLD, "\276\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
[pid 54669] setsockopt(10, SOL_SOCKET, SO_SNDTIMEO_OLD, "<\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
<PROCESS STALLS HERE FOR A WHILE, DURING WHICH ONLY PERIODIC HEARTBEAT SIGNALS ARE SENT TO THE DEVICE>
[pid 54675] sendto(9, "\0", 1, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.2")}, 16) = 1
[pid 54675] sendto(9, "\0", 1, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.3")}, 16) = 1
[pid 54675] recvfrom(9, "\0", 512, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.3")}, [2048 => 16]) = 1
[pid 54675] recvfrom(9, "\0", 512, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.2")}, [2048 => 16]) = 1
[pid 54676] recvfrom(10, "\201\243ack\271YUgPDAI3vNb9cuJQbyVasg==\n", 512, 0, 0x7f70e6efd790, [2048 => 0]) = 31
[pid 54676] shutdown(10, SHUT_WR) = 0
[pid 54675] sendto(9, "\0", 1, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.2")}, 16) = 1
[pid 54675] sendto(9, "\0", 1, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.3")}, 16) = 1
[pid 54675] recvfrom(9, "\0", 512, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.2")}, [2048 => 16]) = 1
[pid 54675] recvfrom(9, "\0", 512, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.3")}, [2048 => 16]) = 1
[pid 54675] sendto(9, "\0", 1, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.2")}, 16) = 1
[pid 54675] sendto(9, "\0", 1, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.3")}, 16) = 1
[pid 54675] recvfrom(9, "\0", 512, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.2")}, [2048 => 16]) = 1
[pid 54675] recvfrom(9, "\0", 512, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.3")}, [2048 => 16]) = 1
<SKIPPING A FEW BEATS TO SHORTEN THE LOG>
[pid 54675] sendto(9, "\0", 1, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.2")}, 16) = 1
[pid 54675] sendto(9, "\0", 1, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.3")}, 16) = 1
[pid 54675] recvfrom(9, "\0", 512, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.2")}, [2048 => 16]) = 1
[pid 54675] recvfrom(9, "\0", 512, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.3")}, [2048 => 16]) = 1
[pid 54675] sendto(9, "\0", 1, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.2")}, 16) = 1
[pid 54675] sendto(9, "\0", 1, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.3")}, 16) = 1
[pid 54675] recvfrom(9, "\0", 512, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.2")}, [2048 => 16]) = 1
[pid 54675] recvfrom(9, "\0", 512, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.3")}, [2048 => 16]) = 1
[pid 54675] sendto(9, "\0", 1, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.2")}, 16) = 1
[pid 54675] sendto(9, "\0", 1, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.3")}, 16) = 1
[pid 54675] recvfrom(9, "\0", 512, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.2")}, [2048 => 16]) = 1
[pid 54675] recvfrom(9, "\0", 512, 0, {sa_family=AF_INET, sin_port=htons(24224), sin_addr=inet_addr("10.10.1.3")}, [2048 => 16]) = 1
<AGAIN PROCESS RESUMES HERE>
[pid 54679] recvfrom(13, 0x7f70dec00000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, "{\"reportType\": \"sslsession\", \"so"..., 10485760, 0, {sa_family=AF_INET, sin_port=htons(42461), sin_addr=inet_addr("10.10.1.5")}, [2048 => 16]) = 58968
[pid 54679] recvfrom(13, 0x7f70dd000000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, "{\"reportType\": \"sslsession\", \"so"..., 10485760, 0, {sa_family=AF_INET, sin_port=htons(60693), sin_addr=inet_addr("10.10.1.5")}, [2048 => 16]) = 59165
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e0400000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, "{\"reportType\": \"sslsession\", \"so"..., 10485760, 0, {sa_family=AF_INET, sin_port=htons(60693), sin_addr=inet_addr("10.10.1.5")}, [2048 => 16]) = 59165
[pid 54679] recvfrom(13, 0x7f70e1000000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e1000000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e1000000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e1000000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e1000000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e1000000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e1000000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e1000000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e1000000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e1000000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, 0x7f70e1000000, 10485760, 0, 0x7f70e5afdaf0, [2048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 54679] recvfrom(13, "{\"reportType\": \"sslsession\", \"so"..., 10485760, 0, {sa_family=AF_INET, sin_port=htons(46832), sin_addr=inet_addr("10.10.1.5")}, [2048 => 16]) = 59071
If there's any other information needed on this issue, please let me know.
Did you manage to replicate the issue, @daipom?
Sorry, I haven't made time for this.
Thanks for your information!
> With the below configurations for forwarder and aggregator:

Does this mean that this issue reproduces within the same machine (setting in_forward and out_forward in the same config)?
If so, I think I could try to reproduce it.
Or, does this reproduce only under the following infrastructure (where in_forward and out_forward don't connect directly)?

> It's important to note that in our infrastructure, the out_forward and in_forward servers don't connect directly to each other; they have a few components in between them, so if the overall connection drops, it doesn't necessarily mean that the socket will drop, and we have to rely on options like timeouts.
This setup is currently deployed on an ESXi host. We have three nodes deployed independently, and we have been able to reproduce the issue by continuously streaming across the nodes.
The actual flow is like this:
UDP -> Log Forwarder -> Forward aggregator -> OpenSearch
I have three questions:
- If the forwarder gets stuck in the middle, why does the UDP plugin also stop receiving data? Only heartbeats are sent to the other nodes.
- Are there any other steps to produce dumps for the Ruby code to check what is happening in real-time?
- Are there any limitations regarding the amount of data that can be continuously transmitted by the forwarder?
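On the question about dumps: one generic way to see where a Ruby process is stuck is to print the backtrace of every live thread. The sketch below is plain Ruby, not a fluentd API, and `dump_thread_backtraces` is a hypothetical helper name. Fluentd's troubleshooting documentation also describes a signal-triggered dump (SIGCONT, via the sigdump gem); check the documentation for your version before relying on it.

```ruby
# Hypothetical helper: write the current backtrace of every live thread to io.
def dump_thread_backtraces(io = $stderr)
  Thread.list.each do |thread|
    io.puts "Thread #{thread.object_id} status=#{thread.status.inspect}"
    # backtrace is nil for a thread that has already finished.
    (thread.backtrace || []).each { |line| io.puts "  #{line}" }
  end
end

# Demo: a worker parked in sleep is listed with status "sleep".
worker = Thread.new { sleep }
sleep 0.05 until worker.status == "sleep"
dump_thread_backtraces
worker.kill
```

A thread spinning in a retry loop would show the same frames on every dump, which helps distinguish a busy loop from a blocked read.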
I'm observing the same issue in our setup on macOS; it usually occurs when the MacBook lid is closed. Based on the logs, I noticed that the MacBook wakes up from time to time to do certain things, and during these wake-ups fluentd is more likely to enter the infinite loop in establish_connection.
I'm using TLS with heartbeat disabled.
I've added some extra error logging to the rescue IO::WaitReadable => e clause, and during normal operation the rescue statement triggers from time to time (usually no more than 10 retries):
2024-04-22 16:31:31 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=2 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-22 16:31:31 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=4 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-22 16:31:31 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=5 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-22 16:31:33 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=1 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-22 16:31:33 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=2 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-22 16:31:33 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=4 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-22 16:31:33 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=5 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-22 16:31:33 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=6 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
When it enters the infinite loop, the error stays the same, but the retry count increases forever:
2024-04-23 09:47:03 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=30437 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-23 09:47:03 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=30438 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-23 09:47:03 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=30439 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-23 09:47:03 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=30440 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-23 09:47:03 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=30441 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-23 09:47:03 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=30442 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-23 09:47:03 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=30443 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
2024-04-23 09:47:03 +0200 [warn]: #0 IO::WaitReadable host="..." port=12345 retry_count=30444 error_class=OpenSSL::SSL::SSLErrorWaitReadable error="read would block"
My current workaround probably doesn't address the actual issue (why read_nonblock keeps returning a retriable error indefinitely), but it is good enough to prevent the infinite loop:
612a613
> retry_count = 0
627c628,636
< rescue IO::WaitReadable
---
> rescue IO::WaitReadable => e
> # On MacOS under certain circumstances read_nonblock will infinitely raise OpenSSL::SSL::SSLErrorWaitReadable
> # (error="read would block"). During normal operation retry_count usually does not exceed 10, thus we set the
> # limit to 25 to be on the safe side.
> if retry_count > 25
> @log.warn "retry count over 25 times", host: @host, port: @port, "last error": e
> disable!
> break
> end
630a640
> retry_count += 1
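The retry cap in the patch above can be tried in isolation with a small standalone sketch. Everything here is an assumption made for illustration: `AlwaysBlockingSocket` is a stub that imitates a socket which never becomes readable, and `read_with_retry_limit` is a hypothetical helper, not fluentd code.

```ruby
# Stub socket (hypothetical) whose read_nonblock always reports "would block".
class AlwaysBlockingSocket
  class WouldBlock < StandardError
    include IO::WaitReadable
  end

  attr_reader :calls

  def initialize
    @calls = 0
  end

  def read_nonblock(_maxlen)
    @calls += 1
    raise WouldBlock, "read would block"
  end
end

# Returns the data read, or nil once retry_count exceeds max_retries.
# The limit of 25 mirrors the patch above.
def read_with_retry_limit(sock, max_retries: 25, wait: 0.01)
  retry_count = 0
  begin
    sock.read_nonblock(512)
  rescue IO::WaitReadable
    return nil if retry_count > max_retries
    retry_count += 1
    sleep wait # short pause between attempts so the demo does not busy-spin
    retry
  end
end

sock = AlwaysBlockingSocket.new
result = read_with_retry_limit(sock)
puts result.nil? ? "gave up after #{sock.calls} attempts" : "read #{result.bytesize} bytes"
```

A count-based limit is simple but depends on how long each attempt waits; a time-based deadline would bound the stall more directly.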