Comments (3)
For 5bnZHzXqdRwun6NkzMgksUirAdspUnUwLBFYG91QC55p, the last keepalive ping responses are:
[2023-05-06 20:46:43.015978 +08:00] INFO [ThreadId(19)] [component\cyfs-bdt\src\sn\client\ping\clients.rs:461] PingClients{local:5bnZHzXqdRwun6NkzMgksUirAdspUnUwLBFYG91QC55p} ping-resp, sn: 5bnZVFY5EYo6LXxrUKahLTEYqSExZZ7tkFvEDwfyojMt/L4udp120.24.55.87:8070, seq: 4211558552.
[2023-05-06 20:47:08.043060 +08:00] INFO [ThreadId(19)] [component\cyfs-bdt\src\sn\client\ping\clients.rs:461] PingClients{local:5bnZHzXqdRwun6NkzMgksUirAdspUnUwLBFYG91QC55p} ping-resp, sn: 5bnZVFY5EYo6LXxrUKahLTEYqSExZZ7tkFvEDwfyojMt/L4udp120.24.55.87:8070, seq: 4211558553.
And connection attempts from 5bnZHzZXgN4JsbKaFsqZ75HczHHh5dsrcAh9Ake4wKwA to 5bnZHzXqdRwun6NkzMgksUirAdspUnUwLBFYG91QC55p start failing at:
[2023-05-06 20:57:30.856635 +08:00] ERROR [ThreadId(7)] [component\cyfs-bdt\src\tunnel\builder\connect_stream\builder.rs:328] ConnectStreamBuilder{stream:StreamContainer {sequence:TempSeq(4211602500), local:5bnZHzZXgN4JsbKaFsqZ75HczHHh5dsrcAh9Ake4wKwA, remote:5bnZHzXqdRwun6NkzMgksUirAdspUnUwLBFYG91QC55p, port:84, id:1638150600 }} call sn session failed, sn=5bnZVFY5EYo6LXxrUKahLTEYqSExZZ7tkFvEDwfyojMt, err=err: (NotFound, sn response error, None)
The sn-miner uses client_ping_timeout (default is 5 min) to purge its client cache. Because 5bn..55p had not sent a keepalive ping for longer than that, the SN considered it offline and returned NotFound when 5bn..KwA called it, so the connection failed.
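The purge behaviour described above can be illustrated with a minimal sketch. This is not the sn-miner's actual code: `ClientCache`, `CallError`, `on_ping`, `purge` and `call` are hypothetical names chosen for illustration, and only the idea of a `client_ping_timeout` comes from the comment above.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the SN's per-client cache (not the real sn-miner types).
pub struct ClientCache {
    client_ping_timeout: Duration,
    last_ping: HashMap<String, Instant>,
}

/// Hypothetical error type mirroring the NotFound seen in the log.
#[derive(Debug, PartialEq)]
pub enum CallError {
    NotFound,
}

impl ClientCache {
    pub fn new(client_ping_timeout: Duration) -> Self {
        Self {
            client_ping_timeout,
            last_ping: HashMap::new(),
        }
    }

    /// Record a keepalive ping from `device_id` received at `now`.
    pub fn on_ping(&mut self, device_id: &str, now: Instant) {
        self.last_ping.insert(device_id.to_owned(), now);
    }

    /// Drop every client whose last ping is older than the timeout.
    pub fn purge(&mut self, now: Instant) {
        let timeout = self.client_ping_timeout;
        self.last_ping
            .retain(|_, last| now.saturating_duration_since(*last) <= timeout);
    }

    /// Resolve a call to `remote`; a purged or never-seen client yields NotFound.
    pub fn call(&mut self, remote: &str, now: Instant) -> Result<(), CallError> {
        self.purge(now);
        if self.last_ping.contains_key(remote) {
            Ok(())
        } else {
            Err(CallError::NotFound)
        }
    }
}
```

With a 300 s timeout, a call made 25 s after the last ping succeeds in this sketch, while a call made 622 s after it (the gap between 20:47:08 and 20:57:30 in the logs) returns `NotFound`.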
from cyfs.
For 5bnZHzXqdRwun6NkzMgksUirAdspUnUwLBFYG91QC55p, the final ping for keepalive is:
[2023-05-06 20:46:43.015978 +08:00] INFO [ThreadId(19)] [component\cyfs-bdt\src\sn\client\ping\clients.rs:461] PingClients{local:5bnZHzXqdRwun6NkzMgksUirAdspUnUwLBFYG91QC55p} ping-resp, sn: 5bnZVFY5EYo6LXxrUKahLTEYqSExZZ7tkFvEDwfyojMt/L4udp120.24.55.87:8070, seq: 4211558552.
[2023-05-06 20:47:08.043060 +08:00] INFO [ThreadId(19)] [component\cyfs-bdt\src\sn\client\ping\clients.rs:461] PingClients{local:5bnZHzXqdRwun6NkzMgksUirAdspUnUwLBFYG91QC55p} ping-resp, sn: 5bnZVFY5EYo6LXxrUKahLTEYqSExZZ7tkFvEDwfyojMt/L4udp120.24.55.87:8070, seq: 4211558553.
And 5bnZHzZXgN4JsbKaFsqZ75HczHHh5dsrcAh9Ake4wKwA connect 5bnZHzXqdRwun6NkzMgksUirAdspUnUwLBFYG91QC55p failure starts from:
[2023-05-06 20:57:30.856635 +08:00] ERROR [ThreadId(7)] [component\cyfs-bdt\src\tunnel\builder\connect_stream\builder.rs:328] ConnectStreamBuilder{stream:StreamContainer {sequence:TempSeq(4211602500), local:5bnZHzZXgN4JsbKaFsqZ75HczHHh5dsrcAh9Ake4wKwA, remote:5bnZHzXqdRwun6NkzMgksUirAdspUnUwLBFYG91QC55p, port:84, id:1638150600 }} call sn session failed, sn=5bnZVFY5EYo6LXxrUKahLTEYqSExZZ7tkFvEDwfyojMt, err=err: (NotFound, sn response error, None)
The sn-miner uses client_ping_timeout (default is 5 min) to purge its client cache. Because 5bn..55p had not sent a keepalive ping for longer than that, the SN considered it offline and returned NotFound when 5bn..KwA called it, so the connection failed.
So this problem is also caused by bdt's SN ping stopping unexpectedly, which makes the SN server consider the device offline?
It should be the same as, or similar to, the issue below:
#250
So we need to review the SN ping logic inside the bdt stack to see what could cause the ping loop to be aborted.
All tasks are blocked, leaving the bdt-stack unable to communicate properly:
[2023-05-06 20:47:19.246450 +08:00] WARN [ThreadId(7)] [component\cyfs-stack\src\interface\http_bdt_listener.rs:198] bdt http request complete with error! status=501, seq=TempSeq(4211563355), during=341ms
[2023-05-06 20:48:00.881778 +08:00] INFO [ThreadId(12)] [component\cyfs-debug\src\check\dead.rs:145] process still alive ThreadId(12), 1.1.1.83-beta (23-05-04)
[2023-05-06 20:49:00.882825 +08:00] INFO [ThreadId(12)] [component\cyfs-debug\src\check\dead.rs:145] process still alive ThreadId(12), 1.1.1.83-beta (23-05-04)
[2023-05-06 20:50:00.883364 +08:00] INFO [ThreadId(12)] [component\cyfs-debug\src\check\dead.rs:145] process still alive ThreadId(12), 1.1.1.83-beta (23-05-04)
[2023-05-06 20:51:00.885583 +08:00] INFO [ThreadId(12)] [component\cyfs-debug\src\check\dead.rs:145] process still alive ThreadId(12), 1.1.1.83-beta (23-05-04)
[2023-05-06 20:52:00.887406 +08:00] INFO [ThreadId(12)] [component\cyfs-debug\src\check\dead.rs:145] process still alive ThreadId(12), 1.1.1.83-beta (23-05-04)
[2023-05-06 20:52:00.887488 +08:00] ERROR [ThreadId(12)] [component\cyfs-debug\src\check\dead.rs:122] task system dead timeout, now will exit process! last_active=13327850820839775
After 5 minutes and 30 seconds, a query for 5bnZHzXqdRwun6NkzMgksUirAdspUnUwLBFYG91QC55p through the SN is certain to return NotFound, and no communication with it is possible during this period.
The SN has removed the cache item for 5bnZHzXqdRwun6NkzMgksUirAdspUnUwLBFYG91QC55p, so it is not found:
[2023-05-06 20:57:30.856635 +08:00] ERROR [ThreadId(7)] [component\cyfs-bdt\src\tunnel\builder\connect_stream\builder.rs:328] ConnectStreamBuilder{stream:StreamContainer {sequence:TempSeq(4211602500), local:5bnZHzZXgN4JsbKaFsqZ75HczHHh5dsrcAh9Ake4wKwA, remote:5bnZHzXqdRwun6NkzMgksUirAdspUnUwLBFYG91QC55p, port:84, id:1638150600 }} call sn session failed, sn=5bnZVFY5EYo6LXxrUKahLTEYqSExZZ7tkFvEDwfyojMt, err=err: (NotFound, sn response error, None)
The tasks are blocked, so no packets can be handled:
[2023-05-06 20:49:24.056021 +08:00] INFO [ThreadId(6)] [component\cyfs-bdt\src\tunnel\udp.rs:148] UdpTunnel{local:L4udp192.168.100.75:8051,remote:L4udp192.168.100.75:8050} dead for connecting timeout
When check\dead.rs detects this situation, will the gateway be restarted? If so, the device can be connected again after 5 min.
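As a rough illustration of the kind of check the dead.rs log lines suggest, here is a minimal sketch of a liveness watchdog. It is an assumption about the general technique, not the cyfs-debug implementation: `DeadCheck`, `Liveness`, `mark_active`, `check` and the `dead_timeout` parameter are all hypothetical names. The idea is that a task on the async executor refreshes a timestamp, while an independent thread compares that timestamp with the current time.

```rust
use std::time::{Duration, Instant};

/// Hypothetical liveness tracker; names are illustrative, not taken from cyfs-debug.
pub struct DeadCheck {
    last_active: Instant,
    dead_timeout: Duration,
}

#[derive(Debug, PartialEq)]
pub enum Liveness {
    Alive,
    Dead,
}

impl DeadCheck {
    pub fn new(now: Instant, dead_timeout: Duration) -> Self {
        Self {
            last_active: now,
            dead_timeout,
        }
    }

    /// Meant to be called by a task running on the async executor.
    /// If every task is blocked, this is never called and `last_active` goes stale.
    pub fn mark_active(&mut self, now: Instant) {
        self.last_active = now;
    }

    /// Meant to be called periodically from a separate OS thread, which keeps
    /// running even when the task system is stuck.
    pub fn check(&self, now: Instant) -> Liveness {
        if now.saturating_duration_since(self.last_active) >= self.dead_timeout {
            Liveness::Dead
        } else {
            Liveness::Alive
        }
    }
}
```

In this sketch the checking thread would exit the process when `check` returns `Liveness::Dead`, leaving the restart to whatever supervises the process. Whether the gateway is actually restarted that way is the open question above.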