
Comments (9)

tony-iqlusion commented on August 10, 2024

Is it safe?

It's received some degree of testing and some validators run in this configuration.

Have any further developments or improvements been made?

Not yet. We'll be switching to gRPC for validator <-> TMKMS connections soon (#73), after which TMKMS will track an "active" validator node and the others will be passive until the active validator fails.

After this migration is completed, we'll look into HA for TMKMS itself.

from tmkms.

tony-iqlusion commented on August 10, 2024

@pratikbin allowing multiple concurrent validators is intended and semi-supported. We don't recommend it, but it's been tested and no one has reported problems yet.

In that case they're signing the same commit hashes. Re-signing the exact same hash at the exact same height/round/step (h/r/s) is deliberately supported for fault-tolerance purposes. The signature process is deterministic, so this results in the same signature on the same proposal, which doesn't count as double signing.

In the event that multiple validators send conflicting proposals, the first validator will "win" and the other validator will receive a double-signing error.
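
The rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not TMKMS's actual implementation: the names `SignGuard`, `HRS`, and `DoubleSignError` are hypothetical, and the block ID is reduced to a plain string.

```python
from dataclasses import dataclass
from typing import Optional


class DoubleSignError(Exception):
    """Raised when a request conflicts with what was already signed."""


@dataclass(frozen=True, order=True)
class HRS:
    """Height/round/step; ordering compares the fields in that order."""
    height: int
    round: int
    step: int


class SignGuard:
    """Hypothetical last-signed-state check shared by all validator connections."""

    def __init__(self) -> None:
        self.last_hrs: Optional[HRS] = None
        self.last_block_id: Optional[str] = None

    def check(self, hrs: HRS, block_id: str) -> str:
        # First request ever, or a strictly later h/r/s: record it and sign.
        if self.last_hrs is None or hrs > self.last_hrs:
            self.last_hrs, self.last_block_id = hrs, block_id
            return "sign"
        # Same h/r/s and same hash: re-signing is allowed. With a deterministic
        # signature scheme this yields the same signature as before.
        if hrs == self.last_hrs and block_id == self.last_block_id:
            return "re-sign"
        # Same h/r/s with a different hash, or an older h/r/s: refuse.
        raise DoubleSignError(f"refusing to sign {block_id} at {hrs}")


guard = SignGuard()
print(guard.check(HRS(100, 0, 1), "AAAA"))  # first validator's request
print(guard.check(HRS(100, 0, 1), "AAAA"))  # second validator, same hash
try:
    guard.check(HRS(100, 0, 1), "BBBB")     # conflicting proposal
except DoubleSignError as err:
    print("double-sign error:", err)
```

The point of the sketch is that a single guard sees requests from every connected validator, so whichever conflicting request arrives first is the one that gets signed.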


pratikbin commented on August 10, 2024

@albttx AFAIK, it won't join p2p with the same node_key, since that's the Tendermint p2p key.


tony-iqlusion commented on August 10, 2024

We've largely been waiting for a migration to gRPC, which will reverse the client/server relationship between the KMS and validator nodes. Instead of having to explicitly configure several validators for the KMS to connect to, multiple validators can connect to the KMS.

That's tracked here: cometbft/cometbft#476


ctosae commented on August 10, 2024

I was also thinking about this solution; it seems safer to me, even in case of bugs.

TMKMS01+HSM --+> VALIDATOR +--> SENTRY1
TMKMS02+HSM --|            |--> SENTRY2
                           |--> SENTRY3

(The TMKMS01 and TMKMS02 "state_file"s are NOT in sync)

Since a VALIDATOR node accepts only one Tendermint connection (from an external PrivValidator process),
this could be a way to create redundancy for TMKMS.

For disaster recovery in case of a VALIDATOR fault, one of the two TMKMS instances could be connected to a SENTRY.

What do you think about it?


tony-iqlusion commented on August 10, 2024

The migration to gRPC reverses the direction of the connection, so validators connect to TMKMS rather than the other way around.

We'll likely want to deprecate/phase out the current "secret connection"-based approach.


pratikbin commented on August 10, 2024

It's received some degree of testing and some validators run in this configuration.

I've set up 2 validators and 1 tmkms, and from the logs I can see tmkms responding to both validators with the same key. So it looks like it is risky for now?!


testnet2  | 12:47PM INF committed state app_hash=E02F03026505D00DBAC2502BE700EC4955BFFD5BDBCD25401C989AB5A566B4B4 height=3413176 module=state num_txs=1
tmkms     | 2022-12-07T12:47:44.943746Z DEBUG tmkms::session: [mantle-1@tcp://testnet2:1234] received request: ShowPublicKey(PubKeyRequest)
tmkms     | 2022-12-07T12:47:44.943764Z DEBUG tmkms::session: [mantle-1@tcp://testnet2:1234] sending response: PublicKey(PubKeyResponse { pub_key_ed25519: [74, 21, 224, 140, 241, 58, 2, 66, 174, 235, 92, 12, 46, 136, 122, 138, 1, 185, 116, 106, 248, 39, 144, 141, 43, 121, 23, 2, 181, 84, 236, 248] })
testnet3  | 12:47PM INF commit synced commit=436F6D6D697449447B5B323234203437203320322031303120352032303820313320313836203139342038302034332032333120302032333620373320383520313931203235332039312032313920323035203337203634203238203135322031353420313831203136352031303220313830203138305D3A3334313442387D
testnet3  | 12:47PM INF committed state app_hash=E02F03026505D00DBAC2502BE700EC4955BFFD5BDBCD25401C989AB5A566B4B4 height=3413176 module=state num_txs=1
testnet2  | 12:47PM INF indexed block height=3413176 module=txindex
tmkms     | 2022-12-07T12:47:44.951079Z DEBUG tmkms::session: [mantle-1@tcp://testnet3:1234] received request: ShowPublicKey(PubKeyRequest)
tmkms     | 2022-12-07T12:47:44.951103Z DEBUG tmkms::session: [mantle-1@tcp://testnet3:1234] sending response: PublicKey(PubKeyResponse { pub_key_ed25519: [74, 21, 224, 140, 241, 58, 2, 66, 174, 235, 92, 12, 46, 136, 122, 138, 1, 185, 116, 106, 248, 39, 144, 141, 43, 121, 23, 2, 181, 84, 236, 248] })
testnet3  | 12:47PM INF indexed block height=3413176 module=txindex

tmkms     | 2022-12-07T12:47:48.277496Z DEBUG tmkms::session: [mantle-1@tcp://testnet2:1234] received request: ReplyPing(PingRequest)
tmkms     | 2022-12-07T12:47:48.277545Z DEBUG tmkms::session: [mantle-1@tcp://testnet2:1234] sending response: Ping(PingResponse)
tmkms     | 2022-12-07T12:47:48.284968Z DEBUG tmkms::session: [mantle-1@tcp://testnet3:1234] received request: ReplyPing(PingRequest)
tmkms     | 2022-12-07T12:47:48.285011Z DEBUG tmkms::session: [mantle-1@tcp://testnet3:1234] sending response: Ping(PingResponse)
testnet2  | 12:47PM INF Timed out dur=4912.38743 height=3413177 module=consensus round=0 step=1
testnet3  | 12:47PM INF Timed out dur=4917.999917 height=3413177 module=consensus round=0 step=1
testnet3  | 12:47PM INF received proposal module=consensus proposal={"Type":32,"block_id":{"hash":"D968C5B0F77373DF954F631565CF620471050CD5A8CCFFA80EE51984CD2D063E","parts":{"hash":"8F16936C7F74ECFD600C14CDC7A2277812FF5CBEA4580A11F84E7C017757A60C","total":1}},"height":3413177,"pol_round":-1,"round":0,"signature":"NgrA5gmuOw812cIAGc2Ef4HGGmr8I4iZeZyRrft8HOl6DWgYa/SkSFnq+v6pp6j3196KdgLkHrScj7hd17M4BQ==","timestamp":"2022-12-07T12:47:49.860448507Z"}
testnet3  | 12:47PM INF received complete proposal block hash=D968C5B0F77373DF954F631565CF620471050CD5A8CCFFA80EE51984CD2D063E height=3413177 module=consensus
testnet2  | 12:47PM INF received proposal module=consensus proposal={"Type":32,"block_id":{"hash":"D968C5B0F77373DF954F631565CF620471050CD5A8CCFFA80EE51984CD2D063E","parts":{"hash":"8F16936C7F74ECFD600C14CDC7A2277812FF5CBEA4580A11F84E7C017757A60C","total":1}},"height":3413177,"pol_round":-1,"round":0,"signature":"NgrA5gmuOw812cIAGc2Ef4HGGmr8I4iZeZyRrft8HOl6DWgYa/SkSFnq+v6pp6j3196KdgLkHrScj7hd17M4BQ==","timestamp":"2022-12-07T12:47:49.860448507Z"}
testnet2  | 12:47PM INF received complete proposal block hash=D968C5B0F77373DF954F631565CF620471050CD5A8CCFFA80EE51984CD2D063E height=3413177 module=consensus
testnet2  | 12:47PM INF finalizing commit of block hash=D968C5B0F77373DF954F631565CF620471050CD5A8CCFFA80EE51984CD2D063E height=3413177 module=consensus num_txs=0 root=E02F03026505D00DBAC2502BE700EC4955BFFD5BDBCD25401C989AB5A566B4B4
testnet3  | 12:47PM INF finalizing commit of block hash=D968C5B0F77373DF954F631565CF620471050CD5A8CCFFA80EE51984CD2D063E height=3413177 module=consensus num_txs=0 root=E02F03026505D00DBAC2502BE700EC4955BFFD5BDBCD25401C989AB5A566B4B4
testnet2  | 12:47PM INF minted coins from module account amount=72399106umntl from=mint module=x/bank
testnet2  | 12:47PM INF executed block height=3413177 module=state num_invalid_txs=0 num_valid_txs=0
testnet2  | 12:47PM INF commit synced commit=436F6D6D697449447B5B3237203231342031333120323232203239203137392031333020313036203133362039372038392031383820313833203638203131372031352031343420323033203133362031383220313139203133352034203138332032333320313934203539203133302031393920313434203735203234345D3A3334313442397D
testnet2  | 12:47PM INF committed state app_hash=1BD683DE1DB3826A886159BCB744750F90CB88B6778704B7E9C23B82C7904BF4 height=3413177 module=state num_txs=0
testnet3  | 12:47PM INF minted coins from module account amount=72399106umntl from=mint module=x/bank
testnet3  | 12:47PM INF executed block height=3413177 module=state num_invalid_txs=0 num_valid_txs=0
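
To compare the two `PubKeyResponse` entries in the log above without eyeballing the byte arrays, a short script along these lines may help. It is a minimal sketch: the helper names are hypothetical, the byte lists are copied from the log lines for testnet2 and testnet3, and the address derivation assumes the usual Tendermint rule for Ed25519 keys (first 20 bytes of the SHA-256 of the public key).

```python
import hashlib

# Byte arrays copied from the two PubKeyResponse log lines above.
KEY_SENT_TO_TESTNET2 = [
    74, 21, 224, 140, 241, 58, 2, 66, 174, 235, 92, 12, 46, 136, 122, 138,
    1, 185, 116, 106, 248, 39, 144, 141, 43, 121, 23, 2, 181, 84, 236, 248,
]
KEY_SENT_TO_TESTNET3 = [
    74, 21, 224, 140, 241, 58, 2, 66, 174, 235, 92, 12, 46, 136, 122, 138,
    1, 185, 116, 106, 248, 39, 144, 141, 43, 121, 23, 2, 181, 84, 236, 248,
]


def pubkey_hex(key_bytes):
    """Render the logged byte list as an uppercase hex string."""
    return bytes(key_bytes).hex().upper()


def validator_address(key_bytes):
    """Assumed Tendermint rule for Ed25519: first 20 bytes of SHA-256(pubkey)."""
    return hashlib.sha256(bytes(key_bytes)).digest()[:20].hex().upper()


same_key = KEY_SENT_TO_TESTNET2 == KEY_SENT_TO_TESTNET3
print("same consensus key served to both nodes:", same_key)
print("pubkey (hex):", pubkey_hex(KEY_SENT_TO_TESTNET2))
print("validator address:", validator_address(KEY_SENT_TO_TESTNET2))
```

The printed address can then be compared with the validator address each node reports for itself.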


albttx commented on August 10, 2024

Hello,

I'm interested in running this kind of setup:

  • 1 tmkms run by an orchestrator for redundancy
  • multiple validator nodes connected to tmkms for HA.

But I have a question about node_key.json in this architecture.

Should I set up 2 nodes with the same node_key

and have a config like:

[[validator]]
chain_id = "cosmoshub-3"
addr = "tcp://[email protected]:26658"
secret_key = "/root/config/secrets/kms-identity.key"
protocol_version = "legacy"
reconnect = true

[[validator]]
chain_id = "cosmoshub-3"
addr = "tcp://[email protected]:26658"
secret_key = "/root/config/secrets/kms-identity.key"
protocol_version = "legacy"
reconnect = true

or set a different node_key for each:

[[validator]]
chain_id = "cosmoshub-3"
addr = "tcp://[email protected]:26658"
secret_key = "/root/config/secrets/kms-identity.key"
protocol_version = "legacy"
reconnect = true

[[validator]]
chain_id = "cosmoshub-3"
addr = "tcp://[email protected]:26658"
secret_key = "/root/config/secrets/kms-identity.key"
protocol_version = "legacy"
reconnect = true

Any ETA on an HA status update?


activenodes commented on August 10, 2024

@tony-iqlusion is there any news about HA?
Or could you review and support configurations like the previous ones (from @albttx)?
I'm testing Horcrux for the first time and it does that. TMKMS keeps closing the connection (to prevent double signing).
Thanks

