Comments (9)
Is it safe?
It's received some degree of testing and some validators run in this configuration.
Have any further developments or improvements been made?
Not yet. We'll be switching to gRPC for validator <-> TMKMS connections soon (#73), after which TMKMS will track an "active" validator node and the others will be passive until the active validator fails.
After this migration is completed, we'll look into HA for TMKMS itself.
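The planned active/passive behaviour can be pictured with a small sketch. This is not TMKMS code: the `ActiveValidatorTracker` class, its method names, and the "promote the longest-connected passive node" rule are all hypothetical, chosen only to illustrate one validator being served while the others wait for it to fail.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ActiveValidatorTracker:
    """Hypothetical tracker: one active validator, all others passive."""

    connected: List[str] = field(default_factory=list)
    active: Optional[str] = None

    def connect(self, node: str) -> None:
        # The first node to connect becomes active; later ones stay passive.
        if node not in self.connected:
            self.connected.append(node)
        if self.active is None:
            self.active = node

    def disconnect(self, node: str) -> None:
        # Drop the node; if it was active, promote the longest-connected
        # passive node (an assumed policy, not one stated in the thread).
        if node in self.connected:
            self.connected.remove(node)
        if self.active == node:
            self.active = self.connected[0] if self.connected else None

    def may_sign_for(self, node: str) -> bool:
        # Only the active validator's signing requests would be served.
        return node == self.active


tracker = ActiveValidatorTracker()
tracker.connect("validator-a")
tracker.connect("validator-b")
print(tracker.active)  # validator-a is active, validator-b is passive
```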
from tmkms.
@pratikbin it's intended and semi-supported to allow multiple concurrent validators. We don't recommend that but it's been tested and no one has reported problems yet.
In that case they're signing the same commit hashes. It's deliberately supported to be able to re-sign the exact same hash at the exact same height/round/step (h/r/s) for fault tolerance purposes. The signature process is deterministic, so this results in the same signature on the same proposal, which doesn't count as double signing.
In the event multiple validators send conflicting proposals, the first validator will "win" and the other validator will receive a double signing error.
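The rule described above can be sketched roughly as follows. This is a simplified illustration, not the actual TMKMS implementation: the `SignState` class, the `DoubleSignError` name, and the return strings are hypothetical, and the real state tracking handles more cases than this.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


class DoubleSignError(Exception):
    """Raised when a request conflicts with what was already signed."""


@dataclass
class SignState:
    """Hypothetical last-signed state: (height, round, step) plus block hash."""

    hrs: Tuple[int, int, int] = (0, 0, 0)
    block_hash: Optional[str] = None

    def check_and_update(self, height: int, round_: int, step: int,
                         block_hash: str) -> str:
        new_hrs = (height, round_, step)
        if new_hrs > self.hrs:
            # Strictly newer h/r/s: record it and sign.
            self.hrs = new_hrs
            self.block_hash = block_hash
            return "signed"
        if new_hrs == self.hrs and block_hash == self.block_hash:
            # Same h/r/s and same hash: re-signing is allowed and, with a
            # deterministic signature scheme, yields the same signature.
            return "re-signed"
        # Same h/r/s with a different hash, or an older h/r/s: refuse.
        raise DoubleSignError(f"conflicting request at h/r/s {new_hrs}")


state = SignState()
block = "D968C5B0F77373DF954F631565CF620471050CD5A8CCFFA80EE51984CD2D063E"
first = state.check_and_update(3413177, 0, 1, block)   # first validator
second = state.check_and_update(3413177, 0, 1, block)  # second validator, same hash
```

A second validator asking for a different hash at the same h/r/s would hit the `DoubleSignError` branch, which corresponds to the "first validator wins" behaviour.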
from tmkms.
@albttx AFAIK, it won't join the p2p network with the same node_key, since it's the Tendermint p2p key.
from tmkms.
We've largely been waiting for a migration to gRPC, which will reverse the client/server relationship between the KMS and validator nodes. Instead of having to explicitly configure several validators for the KMS to connect to, multiple validators can connect to the KMS.
That's tracked here: cometbft/cometbft#476
from tmkms.
I was also thinking about this solution, it seems safer to me even in case of any bugs.
TMKMS01+HSM --+> VALIDATOR +--> SENTRY1
TMKMS02+HSM --|            |--> SENTRY2
                           |--> SENTRY3
(The TMKMS01 and TMKMS02 "state_file"s are NOT in sync)
Since a VALIDATOR node accepts only one Tendermint connection (from an external PrivValidator process),
this could be a way to create redundancy for TMKMS.
Disaster recovery in case of a VALIDATOR fault could consist of connecting one of the two TMKMS instances to a SENTRY.
What do you think about it?
from tmkms.
The migration to gRPC reverses the direction of the connection, so validators connect to TMKMS rather than the other way around.
We'll likely want to deprecate/phase out the current "secret connection"-based approach.
from tmkms.
It's received some degree of testing and some validators run in this configuration.
I've set up 2 validators and 1 tmkms, and from the logs I can see tmkms responding to both validators with the same key. So it looks like this is risky for now?!
testnet2 | 12:47PM INF committed state app_hash=E02F03026505D00DBAC2502BE700EC4955BFFD5BDBCD25401C989AB5A566B4B4 height=3413176 module=state num_txs=1
tmkms | 2022-12-07T12:47:44.943746Z DEBUG tmkms::session: [mantle-1@tcp://testnet2:1234] received request: ShowPublicKey(PubKeyRequest)
tmkms | 2022-12-07T12:47:44.943764Z DEBUG tmkms::session: [mantle-1@tcp://testnet2:1234] sending response: PublicKey(PubKeyResponse { pub_key_ed25519: [74, 21, 224, 140, 241, 58, 2, 66, 174, 235, 92, 12, 46, 136, 122, 138, 1, 185, 116, 106, 248, 39, 144, 141, 43, 121, 23, 2, 181, 84, 236, 248] })
testnet3 | 12:47PM INF commit synced commit=436F6D6D697449447B5B323234203437203320322031303120352032303820313320313836203139342038302034332032333120302032333620373320383520313931203235332039312032313920323035203337203634203238203135322031353420313831203136352031303220313830203138305D3A3334313442387D
testnet3 | 12:47PM INF committed state app_hash=E02F03026505D00DBAC2502BE700EC4955BFFD5BDBCD25401C989AB5A566B4B4 height=3413176 module=state num_txs=1
testnet2 | 12:47PM INF indexed block height=3413176 module=txindex
tmkms | 2022-12-07T12:47:44.951079Z DEBUG tmkms::session: [mantle-1@tcp://testnet3:1234] received request: ShowPublicKey(PubKeyRequest)
tmkms | 2022-12-07T12:47:44.951103Z DEBUG tmkms::session: [mantle-1@tcp://testnet3:1234] sending response: PublicKey(PubKeyResponse { pub_key_ed25519: [74, 21, 224, 140, 241, 58, 2, 66, 174, 235, 92, 12, 46, 136, 122, 138, 1, 185, 116, 106, 248, 39, 144, 141, 43, 121, 23, 2, 181, 84, 236, 248] })
testnet3 | 12:47PM INF indexed block height=3413176 module=txindex
tmkms | 2022-12-07T12:47:48.277496Z DEBUG tmkms::session: [mantle-1@tcp://testnet2:1234] received request: ReplyPing(PingRequest)
tmkms | 2022-12-07T12:47:48.277545Z DEBUG tmkms::session: [mantle-1@tcp://testnet2:1234] sending response: Ping(PingResponse)
tmkms | 2022-12-07T12:47:48.284968Z DEBUG tmkms::session: [mantle-1@tcp://testnet3:1234] received request: ReplyPing(PingRequest)
tmkms | 2022-12-07T12:47:48.285011Z DEBUG tmkms::session: [mantle-1@tcp://testnet3:1234] sending response: Ping(PingResponse)
testnet2 | 12:47PM INF Timed out dur=4912.38743 height=3413177 module=consensus round=0 step=1
testnet3 | 12:47PM INF Timed out dur=4917.999917 height=3413177 module=consensus round=0 step=1
testnet3 | 12:47PM INF received proposal module=consensus proposal={"Type":32,"block_id":{"hash":"D968C5B0F77373DF954F631565CF620471050CD5A8CCFFA80EE51984CD2D063E","parts":{"hash":"8F16936C7F74ECFD600C14CDC7A2277812FF5CBEA4580A11F84E7C017757A60C","total":1}},"height":3413177,"pol_round":-1,"round":0,"signature":"NgrA5gmuOw812cIAGc2Ef4HGGmr8I4iZeZyRrft8HOl6DWgYa/SkSFnq+v6pp6j3196KdgLkHrScj7hd17M4BQ==","timestamp":"2022-12-07T12:47:49.860448507Z"}
testnet3 | 12:47PM INF received complete proposal block hash=D968C5B0F77373DF954F631565CF620471050CD5A8CCFFA80EE51984CD2D063E height=3413177 module=consensus
testnet2 | 12:47PM INF received proposal module=consensus proposal={"Type":32,"block_id":{"hash":"D968C5B0F77373DF954F631565CF620471050CD5A8CCFFA80EE51984CD2D063E","parts":{"hash":"8F16936C7F74ECFD600C14CDC7A2277812FF5CBEA4580A11F84E7C017757A60C","total":1}},"height":3413177,"pol_round":-1,"round":0,"signature":"NgrA5gmuOw812cIAGc2Ef4HGGmr8I4iZeZyRrft8HOl6DWgYa/SkSFnq+v6pp6j3196KdgLkHrScj7hd17M4BQ==","timestamp":"2022-12-07T12:47:49.860448507Z"}
testnet2 | 12:47PM INF received complete proposal block hash=D968C5B0F77373DF954F631565CF620471050CD5A8CCFFA80EE51984CD2D063E height=3413177 module=consensus
testnet2 | 12:47PM INF finalizing commit of block hash=D968C5B0F77373DF954F631565CF620471050CD5A8CCFFA80EE51984CD2D063E height=3413177 module=consensus num_txs=0 root=E02F03026505D00DBAC2502BE700EC4955BFFD5BDBCD25401C989AB5A566B4B4
testnet3 | 12:47PM INF finalizing commit of block hash=D968C5B0F77373DF954F631565CF620471050CD5A8CCFFA80EE51984CD2D063E height=3413177 module=consensus num_txs=0 root=E02F03026505D00DBAC2502BE700EC4955BFFD5BDBCD25401C989AB5A566B4B4
testnet2 | 12:47PM INF minted coins from module account amount=72399106umntl from=mint module=x/bank
testnet2 | 12:47PM INF executed block height=3413177 module=state num_invalid_txs=0 num_valid_txs=0
testnet2 | 12:47PM INF commit synced commit=436F6D6D697449447B5B3237203231342031333120323232203239203137392031333020313036203133362039372038392031383820313833203638203131372031352031343420323033203133362031383220313139203133352034203138332032333320313934203539203133302031393920313434203735203234345D3A3334313442397D
testnet2 | 12:47PM INF committed state app_hash=1BD683DE1DB3826A886159BCB744750F90CB88B6778704B7E9C23B82C7904BF4 height=3413177 module=state num_txs=0
testnet3 | 12:47PM INF minted coins from module account amount=72399106umntl from=mint module=x/bank
testnet3 | 12:47PM INF executed block height=3413177 module=state num_invalid_txs=0 num_valid_txs=0
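One way to check what the excerpt above actually shows is to pull the public key out of each `PubKeyResponse` line and compare it per peer. The sketch below does that for the two response lines from the log; the regular expression and variable names are my own, and the pattern assumes the debug format stays exactly as printed here. Note that the tmkms lines in this excerpt contain only `ShowPublicKey` and `Ping` requests, so on their own they show both validators being given the same consensus public key rather than any conflicting signature.

```python
import re

KEY = ("[74, 21, 224, 140, 241, 58, 2, 66, 174, 235, 92, 12, 46, 136, 122, "
       "138, 1, 185, 116, 106, 248, 39, 144, 141, 43, 121, 23, 2, 181, 84, "
       "236, 248]")

# Response lines copied from the log above, plus one Ping line to show
# that unrelated messages are skipped.
LOG_LINES = [
    "tmkms | 2022-12-07T12:47:44.943764Z DEBUG tmkms::session: "
    "[mantle-1@tcp://testnet2:1234] sending response: "
    "PublicKey(PubKeyResponse { pub_key_ed25519: " + KEY + " })",
    "tmkms | 2022-12-07T12:47:44.951103Z DEBUG tmkms::session: "
    "[mantle-1@tcp://testnet3:1234] sending response: "
    "PublicKey(PubKeyResponse { pub_key_ed25519: " + KEY + " })",
    "tmkms | 2022-12-07T12:47:48.277545Z DEBUG tmkms::session: "
    "[mantle-1@tcp://testnet2:1234] sending response: Ping(PingResponse)",
]

PATTERN = re.compile(
    r"\[(?P<peer>[^\]]+)\] sending response: "
    r"PublicKey\(PubKeyResponse \{ pub_key_ed25519: \[(?P<key>[\d, ]+)\] \}\)"
)

keys_by_peer = {}
for line in LOG_LINES:
    match = PATTERN.search(line)
    if match:
        # Convert the decimal byte list into a hex string for easy comparison.
        raw = bytes(int(b) for b in match["key"].split(","))
        keys_by_peer[match["peer"]] = raw.hex()

for peer, key_hex in keys_by_peer.items():
    print(peer, key_hex)
```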
from tmkms.
Hello,
I'm interested in running this kind of setup:
- 1 tmkms run by an orchestrator for redundancy
- multiple validator nodes connected to tmkms for HA.
But I have a question about node_key.json in this architecture.
Should I set up 2 nodes with the same node_key and have a config like:
[[validator]]
chain_id = "cosmoshub-3"
addr = "tcp://[email protected]:26658"
secret_key = "/root/config/secrets/kms-identity.key"
protocol_version = "legacy"
reconnect = true
[[validator]]
chain_id = "cosmoshub-3"
addr = "tcp://[email protected]:26658"
secret_key = "/root/config/secrets/kms-identity.key"
protocol_version = "legacy"
reconnect = true
Or should I set different node_keys:
[[validator]]
chain_id = "cosmoshub-3"
addr = "tcp://[email protected]:26658"
secret_key = "/root/config/secrets/kms-identity.key"
protocol_version = "legacy"
reconnect = true
[[validator]]
chain_id = "cosmoshub-3"
addr = "tcp://[email protected]:26658"
secret_key = "/root/config/secrets/kms-identity.key"
protocol_version = "legacy"
reconnect = true
Any ETA on an HA status update?
from tmkms.
@tony-iqlusion, is there any news about HA?
Or could you review and support configurations like the previous ones (from @albttx)?
I'm testing Horcrux for the first time and it does that. TMKMS keeps closing the connection (to prevent double-signing).
Thanks
from tmkms.