
Comments (4)

LordDarkHelmet commented on July 18, 2024

Here is an example of a successful killing and restarting of a Dynode with an unresponsive CLI. In this case the Dynode state is set to ACTIVE_DYNODE_STARTED and accordingly a Dynode ping is sent.

... (Last known Dynode Ping before restart)
2019-12-17 12:27:35 CActiveDynode::SendDynodePing -- Relaying ping, collateral=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-1
... (Killed then restarted due to non-responsive CLI)
2019-12-17 12:39:12 


2019-12-17 12:39:12 Dynamic version v2.4.3.0-8873e60b9

2019-12-17 12:39:12 Bound to xxx.xxx.xxx.xxx:33300
2019-12-17 12:39:12 AddLocal(xxx.xxx.xxx.xxx:33300,4)

2019-12-17 12:39:15 msghand thread start
2019-12-17 12:39:15 StartDHTNetwork -- starting
2019-12-17 12:39:15 Imported mempool transactions from disk: 0 successes, 0 failed, 3 expired
2019-12-17 12:39:15 AcceptConnection -- dynode is not synced yet, skipping inbound connection attempt
2019-12-17 12:39:26 Loading addresses from DNS seeds (could take a while)
2019-12-17 12:39:33 AcceptConnection -- dynode is not synced yet, skipping inbound connection attempt
2019-12-17 12:39:36 AcceptConnection -- dynode is not synced yet, skipping inbound connection attempt
2019-12-17 12:39:36 16 addresses found from DNS seeds
2019-12-17 12:39:36 dnsseed thread exit
2019-12-17 12:39:45 CDynode::Check -- Dynode 557713aae29fc0dff460f8cbed9712f8eaa6073ba7dcfaa302273c7969030328-3 is unbanned and back in list now
2019-12-17 12:39:45 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=9c8536bf91b4ef43a70f0f3ebdf7488b9589b7898a161a86e0eebe97d17c0f00-0  blockHash=000000000620bed8c8b522a7e69cf1a88f4b04f8f21047116df6f52810351414
2019-12-17 12:39:45 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=107.145.181.3:33300
2019-12-17 12:39:45 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=9d673858b5bfa82e59f13374aba3312f00577a6b7d14fbf35671670533733200-0  blockHash=000000034d0df9641f28da991a82c64c3e004d69d8ff5a12e84695b639e4b81a
2019-12-17 12:39:45 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=45.86.68.200:33300
2019-12-17 12:39:45 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=95.217.91.94:33300
2019-12-17 12:39:45 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=95.217.91.56:33300
2019-12-17 12:39:45 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=107.145.181.5:33300
2019-12-17 12:39:45 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=112.194.104.101:33300
2019-12-17 12:39:45 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=457e8b4116e227be15429145b5308554c3ff65bf2de3f4ea7a1b46c77d9e5d01-1  blockHash=0000000219e41a8b321c6a3da82a15bbc49699db3036c65cbc7a499812cbc23d
2019-12-17 12:39:45 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=95.217.91.89:33300
2019-12-17 12:39:45 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=223.186.188.131:33300
2019-12-17 12:39:45 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=c59d892eb0298977a45125fcc51b0dadc3ea4b7cb7ef7e73d0061e79d09d5e02-1  blockHash=000000002c82f162c245b0f8bba665c978294f120ac587f7c909b3a1597da10f
2019-12-17 12:39:45 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=c59d892eb0298977a45125fcc51b0dadc3ea4b7cb7ef7e73d0061e79d09d5e02-1  blockHash=000000002c82f162c245b0f8bba665c978294f120ac587f7c909b3a1597da10f
2019-12-17 12:39:45 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=223.186.188.82:33300
2019-12-17 12:39:45 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=37d81a5517f5a9abb1c6683404ce482791ab0572329426a4e795507956d7b502-0  blockHash=000000002c82f162c245b0f8bba665c978294f120ac587f7c909b3a1597da10f
2019-12-17 12:39:45 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=37d81a5517f5a9abb1c6683404ce482791ab0572329426a4e795507956d7b502-0  blockHash=000000002c82f162c245b0f8bba665c978294f120ac587f7c909b3a1597da10f
2019-12-17 12:39:45 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=223.186.188.126:33300
2019-12-17 12:39:45 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=ad6aa8a09fff639da78338a769f481de0b78925da21b28ef155af0a22223df03-0  blockHash=000000002c82f162c245b0f8bba665c978294f120ac587f7c909b3a1597da10f
2019-12-17 12:39:45 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=ad6aa8a09fff639da78338a769f481de0b78925da21b28ef155af0a22223df03-0  blockHash=000000002c82f162c245b0f8bba665c978294f120ac587f7c909b3a1597da10f
2019-12-17 12:39:45 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=95.217.91.109:33300
...(removed 1060 lines of the same pattern for simplicity.)
2019-12-17 12:39:46 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=212.24.103.6:33300
2019-12-17 12:39:46 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=8c0ce22ab2b2f5ac2733d2955163ad5fdf6aac8901895b44165cf2648aa0f3e2-0  blockHash=00000000271f196a2dede805597f15ab519faa17bfac2a09fa158494a8e56607
2019-12-17 12:39:46 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=8c0ce22ab2b2f5ac2733d2955163ad5fdf6aac8901895b44165cf2648aa0f3e2-0  blockHash=00000000fbe92855803d8c54d2c31e3a94c8925d0f01c61b10f9bb63efb2062f
2019-12-17 12:39:46 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=80.211.17.55:33300
2019-12-17 12:39:46 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=bbc782cb6b8bff35303ae59c588b9e046d2fd9ef160dc6de59a912af01627ae3-1  blockHash=00000000301d39b3247c157a26bbd4b1f93fb16799081aea1f5f9096dd385585
2019-12-17 12:39:46 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=112.194.101.132:33300
2019-12-17 12:39:46 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=4edd434433905f4f6877a2c0a685ae45636a01c64c765e5efea07b05def3a5e3-0  blockHash=000000011f48ddda1d1404189013c16ff48dab97bdb099bba1700beef8ba87e3
2019-12-17 12:39:46 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=112.194.101.133:33300
2019-12-17 12:39:46 CDynodePing::CheckAndUpdate -- Dynode ping is invalid, block hash is too old: dynode=35cdce0837edc9d593825cb49a7f2bdfa88552105fa4ce9c346e042140c82ae4-0  blockHash=000000005cca54917df42737cf14cc08ff7265b9d4c8d5537bf3c67cb99a20f3
2019-12-17 12:39:46 CDynodeBroadcast::Update -- Got UPDATED Dynode entry: addr=xxx.xxx.xxx.xxx:33300
2019-12-17 12:39:46 CActiveDynode::ManageStateInitial -- Checking inbound connection to 'xxx.xxx.xxx.xxx:33300'
2019-12-17 12:39:46 AcceptConnection -- dynode is not synced yet, skipping inbound connection attempt
2019-12-17 12:39:46 CActiveDynode::ManageStateRemote -- STARTED!
2019-12-17 12:39:46 CActiveDynode::SendDynodePing -- Relaying ping, collateral=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-1
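
As a rough check on the log above, the interval between the last ping before the restart (12:27:35) and the first ping after it (12:39:46) can be computed with a small helper. This is only an illustrative sketch: `SecondsOfDay` and `GapSeconds` are hypothetical names that are not part of Dynamic, and the code assumes both timestamps fall on the same day.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Parses "HH:MM:SS" into seconds since midnight; returns -1 on malformed input.
// Hypothetical helper for illustration only.
int SecondsOfDay(const std::string& hms) {
    int h = 0, m = 0, s = 0;
    if (std::sscanf(hms.c_str(), "%d:%d:%d", &h, &m, &s) != 3) {
        return -1;
    }
    return h * 3600 + m * 60 + s;
}

// Gap in seconds between two timestamps taken on the same day.
int GapSeconds(const std::string& earlier, const std::string& later) {
    const int a = SecondsOfDay(earlier);
    const int b = SecondsOfDay(later);
    assert(a >= 0 && b >= 0);  // both inputs must parse
    return b - a;
}
```

`GapSeconds("12:27:35", "12:39:46")` gives 731 seconds, so this Dynode went a little over 12 minutes without relaying a ping.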

from dynamic.

LordDarkHelmet commented on July 18, 2024

I am getting more and more reports of Dynodes crashing and then, on immediate restart, going into the NEW_START_REQUIRED state. By all accounts this should not be happening, but it is, even with a single Dynode running on a dedicated VPS. I rewrote a script for Edgemaster that detects stalled Dynodes, stops them, deletes the Dynode cache file, then immediately restarts the Dynode, in the hope that getting rid of old Dynode cache data would help. It definitely improved the situation, but has not solved it.

Then I was thinking: if the cache file is gone, the Dynode rebuilds its own state from the network. We know that the state of a Dynode as seen from any control wallet is not guaranteed to be correct. Any node's data about other nodes could be out of date, and we should only trust data from the Dynode itself. But those nodes with out-of-date information are sending out reports. If our Dynode has incorrect data in its own cache file, or if that cache file is deleted and the first report about our Dynode contains incorrect state data, then our Dynode could slip into the DYNODE_NEW_START_REQUIRED state.

The only place in the code where you can get into the DYNODE_NEW_START_REQUIRED state is in dynode.cpp on line 209.

What are the consequences of ignoring external data and not going into the DYNODE_NEW_START_REQUIRED state? At this point I am trusting others to give me correct information about myself (even when just a single node is giving me incorrect or out-of-date information).

Perhaps a solution is to only change the state when the sync is finished. This will allow time for more than one report to arrive, and hopefully the data in those reports are accurate.

    // don't expire if we are still in "waiting for ping" mode unless it's our own dynode
    if (!fWaitForPing || fOurDynode) {
        if (!IsPingedWithin(DYNODE_NEW_START_REQUIRED_SECONDS)) {
            // proposed change: only switch state once the dynode list has finished syncing
            if (dynodeSync.IsDynodeListSynced()) {
                nActiveState = DYNODE_NEW_START_REQUIRED;
                if (nActiveStatePrev != nActiveState) {
                    LogPrint("dynode", "CDynode::Check -- Dynode %s is in %s state now\n", outpoint.ToStringShort(), GetStateString());
                }
            }
            return;
        }
...

This still requires trust, so it remains possible for malicious nodes to send stale or incorrect data in an attempt to bring down nodes that are restarting.

The core of the issue seems to be corrupted or out-of-date information. This change does not fix that, but it should make a Dynode more resilient to its presence.
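
To make the effect of the sync gate easier to discuss, here is a minimal, self-contained model of the check. It is a sketch under stated assumptions, not the code in dynode.cpp: `DynodeState`, `DynodeEntry`, `CheckEntry`, and the value of `kNewStartRequiredSeconds` are all hypothetical stand-ins, and `listSynced` only models the result of `dynodeSync.IsDynodeListSynced()`.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the Dynode states discussed above.
enum class DynodeState { Enabled, NewStartRequired };

// Placeholder threshold for illustration; the real constant is defined in the Dynamic sources.
constexpr std::int64_t kNewStartRequiredSeconds = 3 * 60 * 60;

struct DynodeEntry {
    DynodeState state = DynodeState::Enabled;
    std::int64_t lastPingTime = 0;  // after a restart this may come from a peer's stale report
};

// Returns the state of `entry` after one check at time `now`.
// gateOnSync = false models the current behaviour, true models the proposed one.
DynodeState CheckEntry(DynodeEntry entry, std::int64_t now, bool listSynced, bool gateOnSync) {
    assert(now >= entry.lastPingTime);
    const bool pingedRecently = (now - entry.lastPingTime) <= kNewStartRequiredSeconds;
    if (!pingedRecently) {
        // With the gate on, the state is left untouched until the list has synced.
        if (!gateOnSync || listSynced) {
            entry.state = DynodeState::NewStartRequired;
        }
    }
    return entry.state;
}
```

In this model, an entry with a stale ping and an unsynced list becomes `NewStartRequired` without the gate but stays `Enabled` with it. Once the list is synced, both variants agree, so the gate only delays the transition and does not remove the need to trust the reports that arrive in the meantime.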

I can create a pull request for the above, but wanted some feedback before I do that. Unfortunately, the issue is very hard to reproduce for debugging, so discussing this change and potential side effects is probably the best thing to do right now.


Duality-CDOO commented on July 18, 2024

We could attempt to rectify this issue now. However, with the hard fork to Proof of Stake happening in 2.5.0.0, it might simply be worth waiting until the network is up on that version and revisiting this then.


LordDarkHelmet commented on July 18, 2024

With @AmirAbrams's recent changes, it may be a good opportunity to look at what is happening with Dynode invalidation here.

Perhaps the pseudocode I suggested above might help the situation. Right now it is possible for another Dynode that has out-of-date information, or is acting maliciously, to move a just-rebooted Dynode into the DYNODE_NEW_START_REQUIRED state by sending old information.
