status-im / nim-eth
Common utilities for Ethereum
Home Page: https://nimbus.status.im
License: Apache License 2.0
As discussed on Gitter, there are currently some transport leaks. Creating this issue so that there is some traceability and so that you know I'm looking into it.
Basically, on every `disconnect()`, unless it is specifically followed by a `peer.transport.close()` (which it never is?), we leak transports.
A dropped connection (https://github.com/status-im/nim-eth/blob/master/eth/p2p/rlpx.nim#L518) will also not be closed currently.
The transport does get closed when a disconnect message is received: https://github.com/status-im/nim-eth/blob/master/eth/p2p/rlpx.nim#L526
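To make the intended fix concrete, here is a minimal sketch (the `sendDisconnectMsg` name and the exact shape of `disconnect` are illustrative assumptions, not the actual rlpx code):

```nim
# Illustrative sketch, not the actual rlpx patch: ensure the transport is
# always closed when a peer disconnects, regardless of who initiated it.
proc disconnect(peer: Peer, reason: DisconnectionReason) {.async.} =
  try:
    await peer.sendDisconnectMsg(reason)  # best-effort notification (hypothetical helper)
  finally:
    if not peer.transport.closed:
      peer.transport.close()              # plug the leak on every code path
```

The key point is the `finally` block: the close must happen on the dropped-connection path too, not only when a disconnect message is received.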
Or otherwise ameliorate these:
```
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_backends_helpers.nim(33, 6) Hint: 'initProtocolStates' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_backends_helpers.nim(41, 6) Hint: 'resolveFuture' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_backends_helpers.nim(46, 6) Hint: 'requestResolver' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_backends_helpers.nim(78, 6) Hint: 'linkSendFailureToReqFuture' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_backends_helpers.nim(93, 6) Hint: 'handshakeImpl' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_protocol_dsl.nim(288, 15) Hint: 'CurrentProtocol' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_protocol_dsl.nim(292, 17) Hint: 'perProtocolMsgId' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_protocol_dsl.nim(299, 16) Hint: 'state' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_protocol_dsl.nim(304, 16) Hint: 'networkState' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_protocol_dsl.nim(593, 20) template/generic instantiation of `makeEth2Request` from here
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_tracing.nim(87, 12) Hint: 'initTracing' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_tracing.nim(89, 12) Hint: 'logSentMsg' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_tracing.nim(90, 12) Hint: 'logReceivedMsg' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_tracing.nim(91, 12) Hint: 'logConnectedPeer' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_tracing.nim(92, 12) Hint: 'logAcceptedPeer' is declared but not used [XDeclaredButNotUsed]
nim-beacon-chain/vendor/nim-eth/eth/p2p/p2p_tracing.nim(93, 12) Hint: 'logDisconnectedPeer' is declared but not used [XDeclaredButNotUsed]
```
Embarking on my first nim-devp2p hello-world adventure, I was stumped for a while by the following situation:
```nim
import
  eth/p2p, chronos, chronicles, chronos/timer

const
  bzzVersion = 42

p2pProtocol Bzz(version = bzzVersion):
  proc hellooo(peer: Peer, foo: string) =
    echo foo

  onPeerConnected do (peer: Peer):
    warn "conn"
    if peer.supports(Bzz):
      warn "bzz"
      waitFor peer.hellooo("bar")
      waitFor sleepAsync(1000)
```
Compiling this returned the following error:
```
/home/lash/src/status/nim-eth/eth/rlp.nim(377, 18) Error: type mismatch: got <Rlp, type string>
but expected one of:
proc read[T](future: Future[T] | FutureVar[T]): T
  first type mismatch at position: 1
  required type: Future[read.T] or FutureVar[read.T]
  but expression 'rlp' is of type: Rlp
proc read(rstream: AsyncStreamReader; n = 0): Future[seq[byte]]
  first type mismatch at position: 1
  required type: AsyncStreamReader
  but expression 'rlp' is of type: Rlp
proc read(transp: StreamTransport; n = -1): Future[seq[byte]]
  first type mismatch at position: 1
  required type: StreamTransport
  but expression 'rlp' is of type: Rlp
expression: read(rlp, type(result.foo))
```
It turns out that `p2pProtocol` is a macro, and it inserts code that refers to the `rlp` namespace. Adding `eth/rlp` to the imports solved the error.
It was, however, not easy to understand the error in this context, so maybe the message could be improved.
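For reference, the corrected import list would look like this (only the `eth/rlp` line is new, per the fix described above; the rest of the snippet is unchanged):

```nim
import
  eth/p2p, eth/rlp,        # eth/rlp added: the p2pProtocol macro emits
  chronos, chronicles,     # code that calls into the rlp namespace
  chronos/timer
```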
All implementations of `TrieDatabaseRef` raise exceptions but have exception-safety issues: in particular, resources will not be released / aborted / committed correctly, leading to deadlocks or data loss.
Examples where a raise will leak or break (the code in general needs a full review; these are just samples):
nim-eth/eth/trie/backends/lmdb_backend.nim (line 113, at commit 3ee5651)
Split the current Whisper code into message-structure code and purely protocol code.
This avoids duplicated code & tests for other versions/forks of the protocol, such as the one Status uses (soon Waku/0).
Does this sound familiar to anyone? The Nim VM fails to zero an object field in a `const` assignment involving a proc call.
Nim code:
```nim
proc initNibbleRange*(bytes: ByteRange): NibblesRange =
  result.bytes = bytes
  result.ibegin = 0
  result.iend = bytes.len * 2

const zeroNibblesRange* = initNibbleRange(zeroBytesRange)
```
Now this constant is imported and used in another file. This is how the generated C looks for the latter:
```C
NIM_CONST tyObject_NibblesRange_NTijGP6lMbjliuMIEehBLw TM_9aIpn29c4nFUFIFQM9b2UMu5w_4 = {{NIM_NIL, NIM_NIL, ((NI) 0)}, ((NI) 0), ((NI) IL64(279856062208720))};
```
We suddenly have uninitialised memory in the `iend` field instead of 0.
This is a hard bug to replicate. I tried copying the relevant Nim code into the same file and into separate files (in case it's an import issue) to make a stand-alone test case, without any luck.
The best test case I can come up with needs nim-eth:
```nim
import eth/trie/[trie_defs, nibbles]

echo zeroBytesRange.len
echo zeroNibblesRange.len
let nr = initNibbleRange(ByteRange())
echo nr.len
```
Example output (the non-zero value changes with each compilation):
```
0
280895550172144
0
```
The obvious workaround is to replace the `const` with a `let`, but that won't solve the underlying VM bug.
It's possible to have multiple connections with the same `Peer`.
Easy scenario: a `disconnect` is done due to a duplicate connection, and the removal of the peer from `peer_pool.connectedNodes` is done.
I think the same is possible with first an incoming connection for Peer A and then an outgoing one. It is just more difficult to achieve, as there is an initial check here: https://github.com/status-im/nim-eth/blob/master/eth/p2p/peer_pool.nim#L62
However, due to the async behaviour of `rlpxConnect` and `rlpxAccept`, I think this could still occur the other way around.
To resolve this, the following could be done:
Do not let the `disconnect` call remove the peer from `connectedNodes` in the peer pool in case the `DisconnectionReason` is `AlreadyConnected`.
Other options are to check within `rlpxAccept` and `rlpxConnect` themselves whether the node is already in `connectedNodes` or `connectingNodes`. By the way, in the case of an incoming connection, `connectingNodes` is never set; it is true, however, that in this case the actual node (pub key) is known much later. Also, even when moving the check to `rlpxAccept` and `rlpxConnect`, there is still a chance that the problem will occur (due to the last await) unless the `connectedNodes` add is moved right after the check.
So I guess the question basically is, in the case of a duplicate connection: do we allow the full connection (including all sub-protocol connection handlers) to happen and then disconnect afterwards (including running all sub-protocol disconnection handlers)? Or do we just stop `rlpxAccept` or `rlpxConnect` the moment we know it is a duplicate peer (e.g. before `postHelloSteps`)? The latter is the more intrusive change.
Thoughts?
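A sketch of the second option above (an early check in the connect/accept path; `connectedNodes`/`connectingNodes` follow the names in `peer_pool.nim`, but the proc and the call site are illustrative, not the actual code):

```nim
# Illustrative sketch: detect a duplicate peer as soon as the remote
# identity is known, before running postHelloSteps.
proc isDuplicate(pool: PeerPool, node: Node): bool =
  node in pool.connectedNodes or node in pool.connectingNodes

# In rlpxAccept/rlpxConnect, right after the hello exchange:
#   if pool.isDuplicate(remoteNode):
#     await peer.disconnect(AlreadyConnected)
#     return
#   # register immediately, before any further await, to close the race:
#   pool.registerConnected(remoteNode, peer)   # hypothetical helper
```

The comment about registering before any further `await` is the crucial part: otherwise the async gap re-opens the window for a second connection to slip through.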
Testing of the simulation done in status-im/nimbus-eth1#437 revealed that the traffic-sending node has more `valid_envelopes` than are actually being sent. Investigate how these duplicates got added.
This ticket describes issues with the current code when running behind a NAT:
A and B connect to each other on the local network, so they will have each other's 'X.0' addresses in the DHT. Now C connects to A because it had a bootstrap address with `X.1:50000`, and it is unable to connect to node B, because there is no DHT entry with `X.1`: A only sees B as `X.0:50001`.
FIREWALL
+----+
| |
| | +-----------------+
| | | |
| | | |
| | | A |
| +-------+ |
| | | |
| | | |
| | | X.0:50000 |
+-----------+ | | +-----------------+
| | | | |
| | | | |
| | | | |
| C +------------+X.1 | |
| | | | |
| | | | |
| | | | |
+-----------+ | | |
| | |
| | +-----------------+
| | | X.0:50001 |
| | | |
| | | B |
| +-------+ |
| | | |
| | | |
| | | |
| | | |
| | +-----------------+
| |
+----+
When having a whisper node it could be useful to have some metrics such as:
Perhaps also add metrics to be able to assess all incoming messages versus messages that are specifically requested through filters.
Right now, a default serializer will swallow any type that's passed to RLP - this makes the serialization easy to break by simply forgetting an import, as can be seen in status-im/nimbus-eth2#144
Instead of having an overload that silently swallows mistakes, make enabling the default serializer explicit. Consider both intrusive (pragma) and non-intrusive (for 3rd-party types) options.
Use the size of `short_rlp(envelope)` as the length, as is done in the latest versions of geth and the Status Whisper fork.
See also #71
Somewhat related to #63
Basically, the check/disconnect could also be moved to `dispatchMessages`. But then the peer will still briefly be added to the peer pool.
The disconnect should not be fully done; no `removePeer` is needed.
In general, `removePeer` could be moved to `dispatchMessages` (or out of rlpx).
Currently in our implementation, the maximum connected peers limit is only applied on the creation of new outgoing connections.
The listener for incoming connections remains active. This poses a serious risk and should be handled by either:
a `disconnect` message with `TooManyPeers` as the `DisconnectionReason`
I'm guessing that when going from 1. down, the possibility of being seen as a "bad" peer goes up. However, practically, I'm not sure if any client actually cares.
At https://github.com/status-im/nim-eth/blob/master/eth/p2p/rlpx.nim#L283 the exception already gets caught but is re-raised, so a crash can still occur if it is not properly handled higher up.
E.g.:
```
Async traceback:
nimbus.nim(147) nimbus
nimbus.nim(126) process
asyncloop.nim(266) poll
asyncmacro2.nim(36) sendMsg_continue
rlpx.nim(283) sendMsgIter
asyncfutures2.nim(377) read
#[
nimbus.nim(147) nimbus
nimbus.nim(126) process
asyncloop.nim(266) poll
asyncmacro2.nim(39) sendMsg_continue
excpt.nim(138) sendMsgIter
]#
#[
nimbus.nim(147) nimbus
nimbus.nim(126) process
asyncloop.nim(266) poll
asyncmacro2.nim(36) sendMsg_continue
excpt.nim(138) sendMsgIter
]#
Exception message: (32) Broken pipe
Exception type: [Exception]
```
We could probably just not re-raise there, but perhaps that has some side effects (possibly other error cases?).
Decide on whether we want to change our Whisper implementation to allow this.
More information here: vacp2p/rfc#28 (comment)
For Waku/0 we will most likely have to change this.
I found that `payloadBytesCount` in eth/rlp.nim behaves differently when compiled for 32-bit. The spec doesn't say anything about decoding to a platform-dependent `int`.
It's a bit confusing whether it should fail on 32-bit or 64-bit; this needs deeper investigation.
@zah, now with better stack traces available:
```
DBG 2019-04-17 15:02:32+02:00 GossipSub message not delivered to Peer tid=23109 peer=Node[<hidden ip address>:9100]
Error: unhandled exception: An attempt was made to complete a Future more than once.
Details: Future ID: 129240
Creation location:
../../nim-eth/eth/p2p/p2p_backends_helpers.nim(65) [unspecified]
First completion location:
../../nim-eth/eth/p2p/rlpx.nim(183) [unspecified]
Second completion location:
../../nim-eth/eth/p2p/rlpx.nim(270) [unspecified]
Stack trace to moment of creation:
No stack traceback available
Stack trace to moment of secondary completion:
No stack traceback available
[FutureError]
```
If I set `wakuVersion` (or `whisperVersion`) to 0 like this: https://github.com/oskarth/nim-eth/blob/waku-zero/eth/p2p/rlpx_protocols/waku_protocol.nim#L56-L57
I get the following crash:
```
[Suite] Waku connections
INF 2019-11-15 17:01:52+08:00 RLPx listener up tid=10758 self=enode://23cb2baaab4231bc85271e540c6161549199b78c7f66c2997ca7f0722650a8fcf5e005ff8b34043959bd5e8e0c27a1b868c8a870282a0f8bc18d5c1617084225@127.0.0.1:30304
Traceback (most recent call last)
/home/oskarth/git/nim-eth/tests/p2p/test_waku_connect.nim(26) test_waku_connect
/home/oskarth/.nimble/pkgs/chronos-2.3.2/chronos/asyncloop.nim(933) waitFor
/home/oskarth/.nimble/pkgs/chronos-2.3.2/chronos/asyncloop.nim(274) poll
/home/oskarth/git/nim-eth/eth/p2p/rlpx.nim(1029) rlpxConnect
/home/oskarth/git/nim-eth/eth/p2p/rlpx.nim(899) postHelloSteps
/home/oskarth/git/nim-eth/eth/p2p/p2p_protocol_dsl.nim(305) WakuPeerConnected
/home/oskarth/.choosenim/toolchains/nim-0.20.0/lib/system/gc.nim(238) asgnRef
/home/oskarth/.choosenim/toolchains/nim-0.20.0/lib/system/gc.nim(183) incRef
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
Error: execution of an external program failed: '/home/oskarth/git/nim-eth/tests/p2p/test_waku_connect -vvv'
```
I can't find anything in docs/p2p.md or https://github.com/ethereum/devp2p/blob/master/rlpx.md that suggests this should be invalid. Either:
Make version 0 valid, or:
Give a better validation/compile error, and add to the docs that this is a requirement.
I didn't debug or try to fix this further, but it seems like a bug in the P2P DSL.
Additionally, and separately, I noticed that p2p/docs.md says sub-protocol names should be 3 letters. In devp2p/rlpx.md it just says "short ASCII name" as far as I can tell. I can't tell if this is a spec constraint and I'm looking in the wrong place, or what's going on there. I didn't look into it too much. Some validation there might be useful too.
There is the possibility that the `onPeerDisconnected` call of a `PeerObserver` is called when it shouldn't be, that is, when `onPeerConnected` never got called.
This is because `removePeer` is run in `disconnect`, and disconnect can already get called in `rlpxConnect` or `rlpxAccept`, e.g. https://github.com/status-im/nim-eth/blob/master/eth/p2p/rlpx.nim#L1347
This issue gets quick-fixed here: 2174a7b
But this is already the second time I've seen a problem because `removePeer` should not be run. So perhaps an architectural change would be better.
The `eth_protocol` message `newBlock` is currently not used, but I noticed several different RLP deserialisation errors from incoming messages, so it looks like something is wrong there. To be investigated...
e.g. errors:
```
DBG 2019-06-14 13:07:29+02:00 Failed rlp.read tid=30804 file=../../../.nimble/pkgs/eth-1.0.0/eth/p2p/rlpx.nim:470 exception="Int expected, but found a List" msg=NewBlockAnnounce peer=Node[159.100.254.92:30303] rlpData="{\n byte 88\n byte 0\n blob(3) [01D4C0]\n blob(20) [20E94867794DBA030EE287F1406E100D03C84CD3]\n \"\"\n blob(68) [A9059CBB0000000000000000000000007B74C19124A9CA92C6141A2ED5F92130FC2791F200000000000000000000000000000000000000000000007E9A2F86853F440000]\n}"
DBG 2019-06-14 13:07:29+02:00 RlpError, ending dispatchMessages loop topics="rlpx" tid=30804 file=../../../.nimble/pkgs/eth-1.0.0/eth/p2p/rlpx.nim:558 clientId=Geth/v1.8.23-stable-c9427004/linux-amd64/go1.10.4 msg=newBlock peer=Node[159.100.254.92:30303]
DBG 2019-06-14 13:29:37+02:00 Failed rlp.read tid=30804 file=../../../.nimble/pkgs/eth-1.0.0/eth/p2p/rlpx.nim:470 exception="The RLP contains a larger than expected Int value" msg=NewBlockAnnounce peer=Node[195.201.192.23:30304] rlpData="blob(20) [8E71B195D9CC953F46F41AD013BA1147464B621D]"
DBG 2019-06-14 13:29:37+02:00 RlpError, ending dispatchMessages loop topics="rlpx" tid=30804 file=../../../.nimble/pkgs/eth-1.0.0/eth/p2p/rlpx.nim:558 clientId=Geth/v1.8.27-stable-noku-ae734200/linux-amd64/go1.11.4 msg=newBlock peer=Node[195.201.192.23:30304]
DBG 2019-06-14 13:30:36+02:00 Failed rlp.read tid=30804 file=../../../.nimble/pkgs/eth-1.0.0/eth/p2p/rlpx.nim:470 exception="Sequence expected, but the source RLP is not a list." msg=NewBlockAnnounce peer=Node[212.32.246.39:30314] rlpData="blob(9) [23BCCD842D5D4D0C2B]"
DBG 2019-06-14 13:30:36+02:00 RlpError, ending dispatchMessages loop topics="rlpx" tid=30804 file=../../../.nimble/pkgs/eth-1.0.0/eth/p2p/rlpx.nim:558 clientId=Geth/v5.4.3/linux/go1.9.7 msg=newBlock peer=Node[212.32.246.39:30314]
DBG 2019-06-14 12:46:09+02:00 Failed rlp.read tid=30804 file=../../../.nimble/pkgs/eth-1.0.0/eth/p2p/rlpx.nim:470 exception="Fixed-size array expected, but the source RLP contains a blob of different length" msg=NewBlockAnnounce peer=Node[77.223.66.87:30303] rlpData="blob(2) [5208]"
DBG 2019-06-14 12:46:09+02:00 RlpError, ending dispatchMessages loop topics="rlpx" tid=30804 file=../../../.nimble/pkgs/eth-1.0.0/eth/p2p/rlpx.nim:558 clientId=Geth/v1.8.25-stable/linux-amd64/go1.11.5 msg=newBlock peer=Node[77.223.66.87:30303]
DBG 2019-06-14 15:13:52+02:00 Failed rlp.read tid=1332 file=../../../.nimble/pkgs/eth-1.0.0/eth/p2p/rlpx.nim:469 dataType=NewBlockAnnounce exception="Small number encoded in a non-canonical way" peer=Node[35.158.27.128:30303]
DBG 2019-06-14 15:13:52+02:00 RlpError, ending dispatchMessages loop topics="rlpx" tid=1332 file=../../../.nimble/pkgs/eth-1.0.0/eth/p2p/rlpx.nim:556 clientId=Geth/v1.8.23-stable-c9427004/linux-amd64/go1.10.4 msg=newBlock peer=Node[35.158.27.128:30303]
```
Transferring the issue description from status-im/nimbus-eth1#229 (comment):
Digging deeper, I finally understand the problem. And we can leave the trie as the last suspect in this case.
The problem involves two or more accounts with the same `storageRoot`. To simplify the illustration, I will use only two accounts.
During transaction `U` execution, account `A` was modified and it has `storageRoot=0xABCDEF`.
During transaction `V` execution, account `B` was modified and also arrives at `storageRoot=0xABCDEF`.
Later on, during transaction `X`, either account `A` or `B` is modified again and it has `storageRoot=0xDEADBEEF`.
Here the problem arises: during transaction `Y` execution, the account with `storageRoot=0xABCDEF` will have a corrupted/incomplete trie. The other account that previously had the same `storageRoot` has already done damage to the trie if pruning is turned on.
Perhaps this is more of a design issue.
@zah, you know the hexary trie better than I do, do you have any idea how to approach this situation? Although my little fix works, conceptually it doesn't remove the problem.
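One commonly used mitigation for this class of problem (not from the report, just a sketch of the idea) is to reference-count storage roots shared by multiple accounts, so a prune triggered by one account cannot delete nodes still reachable from another:

```nim
# Illustrative sketch: count how many accounts reference each storageRoot
# and only prune a sub-trie when the last referencing account moves away.
import tables

var rootRefs = initCountTable[string]()  # storageRoot -> referencing accounts

proc accountUpdated(oldRoot, newRoot: string) =
  if oldRoot.len > 0:
    rootRefs.inc(oldRoot, -1)
    if rootRefs[oldRoot] <= 0:
      discard  # now safe to prune the nodes under oldRoot
  rootRefs.inc(newRoot)
```

In the scenario above, transaction `X` moving one account from `0xABCDEF` to `0xDEADBEEF` would decrement the count on `0xABCDEF` to 1, not 0, so the shared sub-trie would survive for the other account.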
When monitoring Nimbus I often noticed a disconnection during `rlpxConnect`.
It turns out this is due to a timeout on the `eth_protocol` handshake.
Further debugging has shown that this occurs almost always with Parity clients.
In fact, so far, I've never seen a successful connection with a Parity client, and I've also tested this locally by starting one myself and trying to connect to it.
The devp2p `hello` packet passes the node's capabilities to its peer. A capability is a protocol short name + version.
We should only start a sub-protocol (capability) between peers for the highest common version of that protocol.
Currently, all common versions will be started.
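A sketch of the intended negotiation (the `Capability` shape mirrors the devp2p hello packet; the proc itself is illustrative, not the rlpx implementation):

```nim
# Illustrative: from the two capability lists exchanged in hello, start
# only the highest shared version of each protocol name.
import tables

type Capability = tuple[name: string, version: int]

proc highestCommon(ours, theirs: seq[Capability]): seq[Capability] =
  var best = initTable[string, int]()
  for a in ours:
    for b in theirs:
      if a.name == b.name and a.version == b.version:
        # keep only the maximum shared version per protocol name
        if a.version > best.getOrDefault(a.name, -1):
          best[a.name] = a.version
  for name, v in best:
    result.add((name, v))

# e.g. if both sides advertise ("eth", 62) and ("eth", 63), this yields
# only ("eth", 63), instead of starting both versions.
```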
As Waku is a different protocol from Whisper, nodes using the Waku protocol will not be able to talk directly with nodes still speaking the Whisper/6 protocol (unless we do some tricks in the handshake... but that would be hacky).
This will need to be resolved if the Status app requires compatibility with older versions (an app from before Waku must be able to send messages to an app using the Waku protocol).
The most likely solution here is to have enough full nodes in the network that speak both the Waku and Whisper/v6 protocols, and to have those nodes relay messages between both protocols.
So the possible solution here would be to implement such a relay/bridging node.
Acceptance:
Working PoC bridge solution between Whisper/6 and Waku/0
See #27 (comment)
Currently, if one of the sub-protocol `onPeerConnected` calls fails (exception raised), this will be caught in `rlpxConnect`/`rlpxAccept` and the peer will get disconnected (`UselessPeer`).
Before this happens, all other sub-protocol `onPeerConnected` calls will still be run.
However, the `onPeerDisconnected` calls are never run, possibly creating problems if they should have cleaned up certain resources/loops.
This could be added easily, but then the `onPeerDisconnected` call of the failing sub-protocol would also be run, potentially causing other problems.
In short, it would be best to run `onPeerDisconnected` only for those sub-protocols whose `onPeerConnected` ran successfully.
This used to be a potential problem for `whisper_protocol` but got solved like this: #62
But I think it could still be a problem for `les_protocol`: https://github.com/status-im/nim-eth/blob/master/eth/p2p/rlpx_protocols/les_protocol.nim#L249
Currently, the check of whether to try to connect to more peers happens before the call that tries to connect to x peers.
A seq of candidate nodes is returned by discovery and then connected to in parallel. This might result in a higher number of peers. I guess this is also why it is called `minPeers`, so technically there is no problem.
However, there might be use cases where you do not want to go over a defined maximum limit.
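A sketch of the kind of clamp this would imply (`maxPeers` and the proc are hypothetical; only `minPeers` appears in the current code):

```nim
# Illustrative: cap the number of parallel connection attempts so the
# total connected count can never exceed a hypothetical maxPeers limit.
proc candidatesToTry(candidates: seq[Node], connected, maxPeers: int): seq[Node] =
  let room = max(0, maxPeers - connected)
  candidates[0 ..< min(room, candidates.len)]
```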
The discovery interface is pretty simple, as it mostly consists of getters, like `getRandomPeers` and `getPeerCount`, and with v5 something like `getPeersForTopic`. These getters look like an easy boundary for a separate thread. Even though we aim for the beacon node to be non-blocking, I still think it's a good idea to make discovery independent, thread-wise, of the rest of the code. After all, it is a library designed to be used not only by Nimbus.
Things to keep in mind:
Here it is informally explained that in Whisper the origin of a message can be found out by a powerful adversary.
Perhaps the next issue is obvious, known, and not a goal of Whisper, but I assumed differently, so I'll post it here anyhow. Also, I've not actually tested it.
A Whisper envelope contains the `expiry` and `ttl` fields. Typically, in implementations, `expiry - ttl = creation timestamp`.
Now I question whether a peer cannot simply abuse this timestamp to figure out the origin of the envelope.
A peer's direct connections could have a fairly stable round-trip time (RTT). Just looking at the timestamp could perhaps identify a peer as the sender or not, especially over multiple messages.
This is, however, somewhat more complicated:
In the worst case (e.g. messages with less delay added than RTT/2), it is likely that some messages can be linked to a peer you are connected with.
Combine this, for example, with a node monitoring the messages of the most-used public channels on Status, and you could perhaps link IP addresses to the messages & Status names of a user that this node is connected with.
Of course these are public channels, but I still believe pseudonymity should be a goal even on those channels.
Possible ideas of improving this:
TBI
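The timing observation above can be made concrete with a small numeric sketch (all values are purely illustrative):

```nim
# Illustrative: an observer derives a plausible creation time from the
# envelope fields and compares it with the observed arrival time.
let
  expiry  = 1_574_000_100  # unix seconds, example value from an envelope
  ttl     = 100
  created = expiry - ttl   # typical implementations: the creation timestamp
  arrived = 1_574_000_000  # when this node received the envelope
  delta   = arrived - created
# If delta is consistently below RTT/2 for a given direct connection,
# that peer is likely the originator of these envelopes.
echo delta  # 0 here: suspiciously fast, pointing at a direct neighbour
```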
https://ci.appveyor.com/project/nimbus/nim-eth/builds/22193468/job/9gukl8y540xc2whc#L4691
```
Hint: C:\projects\nim-eth\tests\p2p\les\test_flow_control.exe [Exec]
[Suite] les flow control
[Suite] running averages
[OK] consistent costs
[OK] randomized averages
[Suite] buffer value calculations
test_flow_control.nim(4) test_flow_control
flow_control.nim(421) tests
flow_control.nim(266) initFlowControl
system.nim(2830) sysFatal
Unhandled exception: over- or underflow [OverflowError]
```
Unfortunately, before #4, the Windows tests were not compiling at all (even in nim-eth-p2p, before the merge).
Perhaps we should add a check for x connected peers before we start sending Whisper messages at the specified interval.
If a node is only connected to 1 peer, and that peer is malicious, it could decide not to forward the node's messages.
In this case, the malicious peer could monitor, via all of its connected peers, whether any duplicates of the node's messages arrive. If they do not, it could mean that the messages originate from that node. It could of course also just mean that the TTL is not long enough for the messages to get forwarded all the way to this peer. However, with a low number of nodes (which Status kind of has), and over several messages, the malicious peer could probably deduce this.
This might be a non-issue, especially if a node is only connected to a single node for a very limited time (like at startup).
If we had #140 working, we could get along by providing different protocol names for different versions.
E.g.:
```nim
p2pProtocol Wakuv1(version = wakuVersion,
  ...
p2pProtocol Wakuv2(version = differentWakuVersion,
  ...

node.addCapability(Wakuv1)
node.addCapability(Wakuv2)
```
However, it would be a nice-to-have if we could arrange that through the provided `version` somehow. This is mostly a cosmetic thing, though.
A nicer addition would be if we could also tag certain `proc`s of a protocol with certain versions, so that if they remain the same, they don't have to be duplicated. For example, a newer version of the protocol could by default just "inherit" all the procs of the older versions, unless it implements its own.
These are just some quick thoughts right now; this is not a priority, and perhaps it is not worth the effort. One could always try to just isolate as much of the code as possible and share it.
Implement a draft of the Waku mode. This could be something similar to #114, but the exact spec still has to be decided/written. Draft spec writing might go on in parallel with this implementation as a PoC (testing ground).
The draft version would probably still fall under waku/0.
The finalized spec + implementation could be waku/1 or waku/0. This still has to be decided and is not part of this issue.
Requirements:
Acceptance:
Working PoC of Waku mode.
Running:
```
nimble install
nimble test
```
I expect the tests to pass without any additional special action.
The actual result, seen below, appears to involve some ambiguous function calls between nim-0.20.0 and nimcrypto-0.3.9.
```
> nimble test
Executing task test in /home/oskarth/git/nim-eth/eth.nimble
Running: tests/test_bloom
[Suite] Simple Bloom Filter tests
[OK] incl proc
[OK] int value
Running: tests/test_common
[Suite] rlp encoding
[OK] receipt roundtrip
Running: tests/keyfile/test_keyfile
/home/oskarth/git/nim-eth/eth/keyfile/keyfile.nim(233, 23) Error: ambiguous call; both utils.fromHex(a: string) [declared in /home/oskarth/.nimble/pkgs/nimcrypto-0.3.9/nimcrypto/utils.nim(158, 6)] and strutils.fromHex(s: string) [declared in /home/oskarth/.choosenim/toolchains/nim-0.20.0/lib/pure/strutils.nim(1049, 6)] match for: (string)
stack trace: (most recent call last)
/tmp/nimblecache/nimscriptapi.nim(164, 16)
/home/oskarth/git/nim-eth/eth_31398.nims(103, 18) testTask
/home/oskarth/git/nim-eth/eth_31398.nims(35, 12) runKeyfileTests
/home/oskarth/git/nim-eth/eth_31398.nims(27, 8) runTest
/home/oskarth/.choosenim/toolchains/nim-0.20.0/lib/system/nimscript.nim(242, 7) exec
/home/oskarth/.choosenim/toolchains/nim-0.20.0/lib/system/nimscript.nim(242, 7) Error: unhandled exception: FAILED: nim c -r -d:release -d:chronicles_log_level=ERROR --verbosity:0 --hints:off --warnings:off tests/keyfile/test_keyfile
Error: Exception raised during nimble script execution
```
Downgrading to 0.9.16 with `choosenim 0.9.16` appears to work.
An existing use case for this is described here:
status-im/specs#60 (comment)
When topic-interest is used, we currently do not allow switching back to using a bloom filter, and vice versa.
See also the comment here: https://github.com/status-im/nim-eth/blob/master/eth/p2p/rlpx_protocols/waku_protocol.nim#L368
This issue has been raised, and a new packet type with a `StatusOptions` parameter will be added to the specification.
As soon as this is done, it needs to be added to waku_protocol here to be compliant.
This might include the removal of some of the current packet types.
See also here for the status-go version: status-im/status-go#1853
Not always, but starting the eth2_network_simulation sometimes triggers this:
```
FAT 2019-06-14 17:25:43+02:00 Fatal exception reached tid=11000 err="/home/deme/repos/nimbus-nbc/vendor/nim-eth/eth/p2p/kademlia.nim(291, 11) `not contains(k.neighboursCallbacks, remote)`
```
(kademlia.nim, line 291, at commit ee27111)
Check if rlpx complies with EIP-8 and create unit tests with these test vectors:
https://github.com/ethereum/EIPs/blob/master/EIPS/eip-8.md#test-vectors
This was already done for the discovery part.
I've seen the test_sparse_binary_trie test hang sporadically in our CI runs.
Specifically on Travis, macOS AMD64. I can't recall, but it might also occur on Linux.
```
Running: tests/trie/test_sparse_binary_trie
[Suite] sparse binary trie
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
The build has been terminated
```
eth_types.nim has a bunch of unimplemented methods: https://github.com/status-im/nim-eth/blob/master/eth/common/eth_types.nim#L344
We need to go over the list, see what is necessary, and implement those.
`getBlockBody` and `getReceipt` are needed for `eth_protocol`.
Others might also be needed for the eth RPC.
First off: I'm not sure if the application is the right place to do this. Perhaps this should be arranged system-wide, as this is typically the kind of restriction that can vary per system. However, some advice on it should then be formulated.
The concern is that with a limit on the number of connected peers (see #107 & #108), combined with certain high timeouts on the handshakes in the rlpx protocol and sub-protocols, one node could easily launch many connects and let them time out, hogging the connections.
This is not completely accurate, as currently the limit is enforced not by the number of open connections but by the number of actually completed handshakes (and thus fully connected peers, which would be more intensive for a node to pull off). Just mentioning this part in case we change that.
However, there is still a higher limit that remains valid, which is the open-files limit on Unix systems.
The first step should probably be to investigate whether there are locations with no timeout at all (other than the TCP timeout): are these OK?
A basic solution could also be to just lower the timeouts by a decent amount. (For Whisper I actually never used the 10-second timeout.)
Implement the waku/0 protocol, a Status fork of Whisper implementing the additions/changes done to it.
This will mostly be adding message packet ids, changes to the light client (not sure?), and notification on send (how does this work now? and why?). Questions should be asked regarding the validity of the changes (is everything still used and needed?).
Mailserver-related changes are targeted in a separate issue: https://github.com/status-im/nim-eth/issues/102
Requirements:
Acceptance: implement a status-go-compatible waku/0.
Is it normal that the following is passing even though there are errors?
https://travis-ci.org/status-im/nim-eth/jobs/489580420#L3422-L3431
cc @zah
DBG 2019-02-06 15:54:11+00:00 Received neighbours topics="discovery" thread=15118 neighbours="@[Node[127.0.0.1:30303], Node[127.0.0.1:30304], Node[127.0.0.1:30305]]" remote=Node[127.0.0.1:30301]
DBG 2019-02-06 15:54:11+00:00 onPeerConnected Whisper thread=15118
DBG 2019-02-06 15:54:11+00:00 onPeerConnected Whisper thread=15118
DBG 2019-02-06 15:54:11+00:00 Whisper peer thread=15118 peer=Node[127.0.0.1:51484] whisperVersion=6
DBG 2019-02-06 15:54:11+00:00 Sending envelopes thread=15118 amount=0
DBG 2019-02-06 15:54:11+00:00 Whisper peer initialized thread=15118
DBG 2019-02-06 15:54:11+00:00 Whisper peer thread=15118 peer=Node[127.0.0.1:30304] whisperVersion=6
DBG 2019-02-06 15:54:11+00:00 Whisper peer initialized thread=15118
DBG 2019-02-06 15:54:11+00:00 Not enough nodes topics="discovery" thread=15118 present=3 requested=10
DBG 2019-02-06 15:54:11+00:00 Not enough nodes topics="discovery" thread=15118 present=3 requested=10
DBG 2019-02-06 15:54:11+00:00 Received neighbours topics="discovery" thread=15118 neighbours="@[Node[127.0.0.1:30303], Node[127.0.0.1:30301], Node[127.0.0.1:30305]]" remote=Node[127.0.0.1:30304]
DBG 2019-02-06 15:54:11+00:00 Received neighbours topics="discovery" thread=15118 neighbours="@[Node[127.0.0.1:30303], Node[127.0.0.1:30301], Node[127.0.0.1:30304]]" remote=Node[127.0.0.1:30305]
DBG 2019-02-06 15:54:11+00:00 Received neighbours topics="discovery" thread=15118 neighbours="@[Node[127.0.0.1:30303], Node[127.0.0.1:30301], Node[127.0.0.1:30304]]" remote=Node[127.0.0.1:30305]
DBG 2019-02-06 15:54:11+00:00 Received neighbours topics="discovery" thread=15118 neighbours="@[Node[127.0.0.1:30303], Node[127.0.0.1:30304], Node[127.0.0.1:30305]]" remote=Node[127.0.0.1:30301]
DBG 2019-02-06 15:54:11+00:00 Received neighbours topics="discovery" thread=15118 neighbours="@[Node[127.0.0.1:30303], Node[127.0.0.1:30304], Node[127.0.0.1:30305]]" remote=Node[127.0.0.1:30301]
DBG 2019-02-06 15:54:11+00:00 Received neighbours topics="discovery" thread=15118 neighbours="@[Node[127.0.0.1:30301], Node[127.0.0.1:30304], Node[127.0.0.1:30305]]" remote=Node[127.0.0.1:30303]
DBG 2019-02-06 15:54:11+00:00 Sending envelopes thread=15118 amount=0
DBG 2019-02-06 15:54:11+00:00 Sending envelopes thread=15118 amount=0
DBG 2019-02-06 15:54:11+00:00 Sending envelopes thread=15118 amount=0
ERR 2019-02-06 15:54:11+00:00 Light node not allowed to post messages thread=15118
[OK] Light node posting
INF 2019-02-06 15:54:11+00:00 RLPx listener up thread=15118 self=enode://5115f6a21cd791f92a684991e984afd7dd7634d00ee7e1166483d74814d06fa457e29112f69671454b2d9974d5cecfe34538354d64b08168bc19c75757bf5a92@127.0.0.1:30307
DBG 2019-02-06 15:54:11+00:00 onPeerConnected Whisper thread=15118
DBG 2019-02-06 15:54:11+00:00 onPeerConnected Whisper thread=15118
DBG 2019-02-06 15:54:11+00:00 Whisper peer thread=15118 peer=Node[127.0.0.1:51491] whisperVersion=6
DBG 2019-02-06 15:54:11+00:00 Exception in rlpxAccept topics="rlpx" thread=15118 err="Two light nodes connected\nAsync traceback:\n test_shh_connect.nim(383) test_shh_connect\n asyncloop.nim(761) waitFor\n asyncloop.nim(238) poll\n asyncmacro2.nim(36) WhisperHandshake_continue\n whisper_protocol.nim(718) WhisperHandshakeIter\n #[\n test_shh_connect.nim(383) test_shh_connect\n asyncloop.nim(761) waitFor\n asyncloop.nim(238) poll\n asyncmacro2.nim(36) postHelloSteps_continue\n rlpx.nim(1243) postHelloStepsIter\n asyncfutures2.nim(325) read\n ]#\nException message: Two light nodes connected\nException type:" stackTrace="test_shh_connect.nim(383) test_shh_connect\nasyncloop.nim(761) waitFor\nasyncloop.nim(238) poll\nasyncmacro2.nim(36) WhisperHandshake_continue\nwhisper_protocol.nim(718) WhisperHandshakeIter\n[[reraised from:\ntest_shh_connect.nim(383) test_shh_connect\nasyncloop.nim(761) waitFor\nasyncloop.nim(238) poll\nasyncmacro2.nim(36) postHelloSteps_continue\nrlpx.nim(1243) postHelloStepsIter\nasyncfutures2.nim(325) read\n]]\n[[reraised from:\ntest_shh_connect.nim(383) test_shh_connect\nasyncloop.nim(761) waitFor\nasyncloop.nim(238) poll\nasyncmacro2.nim(36) rlpxAccept_continue\nrlpx.nim(1413) rlpxAcceptIter\nasyncfutures2.nim(325) read\n]]\n"
DBG 2019-02-06 15:54:11+00:00 Whisper peer thread=15118 peer=Node[127.0.0.1:30307] whisperVersion=6
ERR 2019-02-06 15:54:11+00:00 dispatchMessages failed topics="rlpx" thread=15118 err="Data incomplete!\nAsync traceback:\n test_shh_connect.nim(383) test_shh_connect\n asyncloop.nim(761) waitFor\n asyncloop.nim(238) poll\n asyncmacro2.nim(36) readExactly_continue\n stream.nim(1284) readExactlyIter\n #[\n test_shh_connect.nim(383) test_shh_connect\n asyncloop.nim(761) waitFor\n asyncloop.nim(238) poll\n asyncmacro2.nim(36) recvMsg_continue\n rlpx.nim(405) recvMsgIter\n asyncfutures2.nim(325) read\n ]#\nException message: Data incomplete!\nException type:"
ERR 2019-02-06 15:54:11+00:00 dispatchMessages failed topics="rlpx" thread=15118 err="Data incomplete!\nAsync traceback:\n test_shh_connect.nim(383) test_shh_connect\n asyncloop.nim(761) waitFor\n asyncloop.nim(238) poll\n asyncmacro2.nim(36) readExactly_continue\n stream.nim(1284) readExactlyIter\n #[\n test_shh_connect.nim(383) test_shh_connect\n asyncloop.nim(761) waitFor\n asyncloop.nim(238) poll\n asyncmacro2.nim(36) recvMsg_continue\n rlpx.nim(405) recvMsgIter\n asyncfutures2.nim(325) read\n ]#\nException message: Data incomplete!\nException type:"
[OK] Connect two light nodes
It seems like the reqId procs are wrong. Relevant codegen:
proc getBeaconBlockHeaders_thunk(msgSender: Peer; _1613480: int; data1613482: Rlp) {.
gcsafe, async.} =
var rlp = data1613482
var msg {.noinit.}: getBeaconBlockHeadersObj
let reqId = read(rlp, int)
enterList(rlp)
msg.blockRoot = checkedRlpRead(msgSender, rlp, Eth2Digest)
msg.slot = checkedRlpRead(msgSender, rlp, uint64)
msg.maxHeaders = checkedRlpRead(msgSender, rlp, int)
msg.skipSlots = checkedRlpRead(msgSender, rlp, int)
await(getBeaconBlockHeaders1613476(msgSender, reqId, msg.blockRoot, msg.slot,
msg.maxHeaders, msg.skipSlots))
proc getBeaconBlockHeaders*(msgRecipient: Peer; blockRoot: Eth2Digest; slot: uint64;
maxHeaders: int; skipSlots: int; timeout: int = 10000): Future[
Option[beaconBlockHeadersObj]] {.gcsafe.} =
var writer = initRlpWriter()
const
perProtocolMsgId = 2
let perPeerMsgId = perPeerMsgIdImpl(msgRecipient, BeaconSyncProtocol, 2)
append(writer, perPeerMsgId)
startList(writer, 4)
newFuture result
let reqId = registerRequest(msgRecipient, timeout, result, perPeerMsgId +
1)
append(writer, reqId)
append(writer, blockRoot)
append(writer, slot)
append(writer, maxHeaders)
append(writer, skipSlots)
let msgBytes = finish(writer)
linkSendFailureToReqFuture(sendMsg(msgRecipient, msgBytes), result)
Note startList(writer, 4) in the requestor, which then appends 5 params including reqId.
Note enterList(rlp) in the thunk, which happens after reading reqId and is followed by 4 reads.
Fix on the way.
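To make the mismatch concrete, here is one self-consistent shape the generated pair could take, assuming reqId belongs inside the RLP message list on both sides (this is a hedged sketch of the fragments above, not the actual fix):

```nim
# Requestor side: declare 5 list items, so reqId + the 4 params fit.
startList(writer, 5)
append(writer, reqId)
append(writer, blockRoot)
append(writer, slot)
append(writer, maxHeaders)
append(writer, skipSlots)

# Thunk side: enter the list first, then read reqId and the 4 fields.
enterList(rlp)
let reqId = read(rlp, int)
msg.blockRoot = checkedRlpRead(msgSender, rlp, Eth2Digest)
# ... 3 more checkedRlpRead calls, as in the codegen above
```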
Currently the procs of a p2pProtocol allow parsing RLP that contains more fields than there are parameters. This is good, as it allows the EIP-8 test vectors to pass.
However, the other way around does not work: when our protocol adds parameters to a proc in a newer version, running against an older version will make it fail.
It would be nice if we could provide a fallback to a lower version. It could perhaps be done by marking some parameters of a proc as optional, and then deciding within that proc how to continue (be compatible or not...).
This would be specifically handy in the handshake for #114.
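A rough sketch of the idea at the RLP-reading level, assuming a hypothetical handshake body where a trailing field was added in a newer protocol version and older peers simply omit it (`extra` and its default are made up for illustration; hasData is part of nim-eth's rlp reader):

```nim
# Hedged sketch: tolerating a missing trailing field from an older peer.
import eth/rlp

proc decodeHandshake(payload: seq[byte]): (uint, uint) =
  var rlp = rlpFromBytes(payload)
  rlp.enterList()
  let version = rlp.read(uint)
  # Older peers omit the trailing field; fall back to a default instead
  # of failing the whole handshake.
  let extra = if rlp.hasData: rlp.read(uint) else: 0'u
  (version, extra)
```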