reines / oversim
Open-source overlay and peer-to-peer network simulation framework for the OMNeT++ simulation environment
OverSim - A Flexible Overlay Network Simulation Framework

OverSim is an open-source overlay network simulation framework for the OMNeT++/OMNEST simulation environment. The simulator contains several models for structured (e.g. Chord) and unstructured (e.g. GIA) peer-to-peer protocols. OverSim is developed at the Karlsruhe Institute of Technology (KIT), Institute of Telematics.

Please look into the doc/ subdirectory, where you'll find installation instructions, the license and the API documentation. If you need help, please visit the project website: http://www.oversim.org/

Disclaimer: OverSim is continuously being improved: new parts are added, bugs are corrected, and so on. We cannot assert that any protocol implemented here will work fully according to the specifications. YOU ARE RESPONSIBLE YOURSELF FOR MAKING SURE THAT THE MODELS YOU USE IN YOUR SIMULATIONS WORK CORRECTLY AND THAT YOU ARE GETTING VALID RESULTS.

The OverSim team
Maybe the resulting lookup rate is off?
Our Symmetric DHT implementation needs to be merged. It would be nice to do this in a tidier way, but I'm not sure how feasible that actually is...
When returning the best next hops we are meant to also return our direct successor/predecessor (depending on which side of us the query came from).
Internally, expire times should be stored as absolute times, but for transmission we should calculate TTL = expire - simTime() and send that (if > 0). When receiving values we should calculate expire = simTime() + TTL and store this in the cache.
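A minimal sketch of that conversion, assuming a simTime()-style simulation clock and a simple cache entry type (the names here are illustrative, not OverSim's actual API):

```cpp
#include <string>

// Illustrative stand-in for OMNeT++'s simTime(); in a real model this is
// the simulation clock, not a mutable global.
double currentTime = 0.0;
double simTime() { return currentTime; }

struct CacheEntry {
    std::string value;
    double expire; // stored internally as an absolute time
};

// Sender side: convert the stored absolute expire time into a relative TTL.
// A result <= 0 means the entry has expired and should not be transmitted.
double ttlForTransmission(const CacheEntry& entry) {
    return entry.expire - simTime();
}

// Receiver side: convert the received TTL back into an absolute expire time
// before storing the value in the cache.
CacheEntry entryFromMessage(const std::string& value, double ttl) {
    return CacheEntry{value, simTime() + ttl};
}
```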
The hop counts produced by EpiChord don't match those in the original EpiChord paper, implying part of our implementation is incorrect. This needs to be fixed; otherwise we cannot assume the results we generate are correct.
When plotting network size vs hop count the lines have a different gradient, with the hop count on our implementation deteriorating as the network size increases.
There are a few areas which come to mind:
The original implementation checks for false negative responses during a lookup and alerts nodes that they should re-check their neighbours. As a result, a low periodic neighbour update rate causes a slightly higher hop count but does not affect the success rate.
Probes in the original implementation seem to include:
They also seem to be triggered instantly when an update occurs, pushing that update to neighbours - instead of waiting for neighbours to request a refresh.
Compact probes should be used unless we know of any changes, in which case we send a full probe.
First we should check that message sizes are calculated and recorded correctly. We should then check that the bandwidth used matches that in the paper.
I think this is down to how nodes are sorted.
When merge = true, nodes are sorted by their distance from the destination key. This means the closest node might lie before the key, but the node we actually want to contact first is the closest one after the key, since data is stored at the successor.
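The distinction can be sketched with small integer IDs on a toy ring (illustrative only; OverSim's NodeVector uses its own key type and comparators):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

const uint32_t RING = 256; // toy ring size

// Absolute ring distance: effectively what sorting with merge = true uses.
uint32_t absDistance(uint32_t a, uint32_t b) {
    uint32_t d = (a >= b) ? a - b : b - a;
    return std::min(d, RING - d);
}

// Clockwise distance from key to node: ranks the successor of the key
// first, which is what we want when data is stored at the successor.
uint32_t clockwiseDistance(uint32_t key, uint32_t node) {
    return (node + RING - key) % RING;
}

std::vector<uint32_t> sortForKey(std::vector<uint32_t> nodes, uint32_t key) {
    std::sort(nodes.begin(), nodes.end(), [key](uint32_t a, uint32_t b) {
        return clockwiseDistance(key, a) < clockwiseDistance(key, b);
    });
    return nodes;
}
```

With key 100, node 97 is absolutely closer than node 105, but the clockwise ordering correctly puts 105 (the successor) first.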
When performing a lookup, we seem to include failed lookups when calculating both the accumulated and minimum hop counts. We should probably only calculate the minimum hops over successful lookups.
EpiChord states that findNode should return the immediate successor of the key, then p-1 predecessors. This results in a much worse success rate than returning p-1 successors, but it should be done anyway; otherwise we aren't properly modelling EpiChord...
Example:
2b4c7f7f8fb4e726902528c4e2c8d3b89ee51af6467cef76117c9581c490995623e11fa7f7bd5821e1e7fff904835de7e3db0fff721f97b3a19392251ada4778 / 007fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff = 00000000000000002b4c7f7f8fb4e726bb71a844727dbadf5a56c33ab8faaa556bd358bc7d8b43ab8fb4786475489bcd719c785d79cbf9b55577885cebeb9168
The maintenance in the DHT depends on knowing which node is responsible for a key. This probably needs to be modified to work correctly with Symmetric and RepeatedHashing replication.
When node lifetime estimation is enabled, the estimates, even after being multiplied by the modifier, are generally higher than 60s, which causes a poor success rate.
In Chord nodes check their predecessor every 5s but only stabilize every 20s, and check fingers every 120s. Doing both stabilization and checking cache every 60s as suggested in the EpiChord paper seems rather odd.
When using active propagation the success rate actually drops. This obviously needs to be fixed.
The success rate of Chord and EpiChord is hurt quite badly by churn - worse than I think it should be.
With a 600s mean lifetime, Chord sees a lookup success rate of 40-50%, while EpiChord gets 80-90%. Does this imply the redundant nodes being provided aren't being used correctly, or are there simply not enough of them to handle this level of churn?
Tidy up and merge the blind-search branch into master. It currently has a different commit history, so this will need to be done manually.
The EpiChord implementation is based on Chord, which does not work when merge = true. For parallel paths to be enabled, merge must be true. We need to figure out why EpiChord does not support merge = true, and fix/enable it.
When a slice is found that requires more nodes, according to the paper a request should be sent to the midpoint of the slice.
Since a lookup returns only nodes preceding the requested key, it is impossible for a node in the second half of the slice to be found, and hence we end up with half the slice empty.
I have currently worked around this by looking up the end of the slice rather than the midpoint. However, this isn't a good solution either: assuming the slice contains a large number of nodes, only the last few will be found. Ideally we want nodes spread evenly throughout the slice, but the paper doesn't seem to address this. Perhaps we could use a custom lookup message that would return X nodes centred around the key?
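The two lookup targets can be sketched with toy integer slices (illustrative only; real EpiChord slices are intervals of the key space):

```cpp
#include <cstdint>

// A toy cache slice covering the key range [start, end) on the ring.
struct Slice {
    uint64_t start;
    uint64_t end; // assume start < end for simplicity (no wrap-around)
};

// Target suggested by the paper: the midpoint of the slice. If lookups
// only return nodes preceding the target, nodes in [mid, end) can never
// be discovered this way.
uint64_t midpointTarget(const Slice& s) {
    return s.start + (s.end - s.start) / 2;
}

// Current workaround: target the end of the slice instead, so every node
// in the slice precedes the target - at the cost of only ever finding the
// last few nodes before the end.
uint64_t endTarget(const Slice& s) {
    return s.end;
}
```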
The original EpiChord paper claims their simulations were performed with exponentially distributed node lifetimes; however, the code doesn't look like it does that:
node_lifespan = (long) (((double) Global.action_interval) * Global.base_node_lifespan
        + parent.rng.nextDouble() * ((double) Global.node_lifespan));
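The snippet above adds a uniformly drawn offset (nextDouble() is uniform), which gives a uniform, not exponential, distribution. For comparison, a sketch of exponentially distributed lifetimes (in C++, since the OverSim side is C++; the mean value below is illustrative):

```cpp
#include <random>

// Draw a node lifetime from an exponential distribution with the given
// mean. std::exponential_distribution is parameterised by the rate
// lambda, where lambda = 1 / mean.
double exponentialLifetime(double mean, std::mt19937& rng) {
    std::exponential_distribution<double> dist(1.0 / mean);
    return dist(rng);
}
```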
Responses in blind-search should have the option of direct or source response routing.
When p > 1, a huge number of maintenance messages are generated. For a network of 600 nodes we were generating 72 maintenance messages/second/node; when p = 1 this is 0.3 maintenance messages/second/node.
The number of nodes required in a cache slice should be calculated based on a parameter j and an estimate of the likelihood of existing nodes being dead. Currently we require numSiblings nodes, which is wrong!
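If we follow the EpiChord paper's approach, where each slice must retain at least j live entries and some fraction of cached entries is estimated to be dead, the required slice size comes out to roughly j / (1 - deadFraction). A sketch (parameter names are illustrative, not existing code):

```cpp
#include <cmath>

// Required number of cache entries per slice so that, after discounting
// an estimated fraction `deadFraction` of dead/stale entries, at least
// `j` live nodes remain: ceil(j / (1 - deadFraction)).
int requiredSliceSize(int j, double deadFraction) {
    return static_cast<int>(std::ceil(j / (1.0 - deadFraction)));
}
```

For example, with j = 2 and half the entries estimated dead, we would keep 4 entries per slice rather than a fixed numSiblings.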
I didn't see it in the original paper, but the code seems to dynamically calculate the average node lifetime based on observations from the cache, and use this to set the successor/predecessor stabilization interval.
We should only send a full successor/predecessor list if a change was detected since the last stabilization with the same node.
When sending a lookup response or findNode response we should attach expire times for each node mentioned.
Our code is currently from OverSim-20090908, which is obviously rather out-of-date now. We should update, however merging changes for the blind-search branch is not trivial.
EpiChord specifies that we should return numRedundantNodes preceding nodes, not succeeding ones. This causes issues with sorting the NodeVector, since we always want the node after the ID first, then numRedundantNodes preceding nodes.
I don't think this should have much effect on performance, since EpiChord keeps both a predecessor and a successor list, but we should try to follow the spec if possible...