
Comments (15)

AgustinCB commented on July 17, 2024

Comments on what I saw:

  1. The protocol isn't exactly the same as the one we currently use, but a derivative (https://docs.maidsafe.net/Whitepapers/pdf/PARSEC.pdf). I just skimmed the paper, but I didn't notice an obvious reason to think it's any better, and it's fairly young (i.e. untested). I have some reservations about that, but if we aren't in a serious relationship with Hashgraph, it shouldn't be a blocker. However, it's kind of wasteful to spend two years developing a sound hashgraph library for a project and then use something completely different.

  2. The code is well tested. For me this is a huge point. It's also tested the way I'd expect: a bunch of (real) unit tests, a few integration tests, and performance tests. I ran the benchmark and it seems to work fine. I'm not sure how to compare the results with the Go ones, so I won't comment on that. Overall, it seems as stable a project as a six-month-old project can be. They even do property testing ❤️

  3. They use bincode for serialization. That basically means the clients can only be written in Rust. Not very portable, to be fair. It does perform better than protobuf, though, although it's less flexible in what you can do (https://github.com/erickt/rust-serialization-benchmarks). Not sure if I should count this as a perk or a problem. (See the bincode sketch after this list.)

  4. This is completely unnecessary given existing libraries such as https://github.com/kriomant/mockers, and it adds noise.

  5. I'm not sure I understand why this was necessary instead of a HashMap<bool, ()>.

  6. The code is easy to navigate and is what I would expect from a well-written Rust project. Some technical debt, but nothing abnormal for a project its size. It overuses return statements a bit, in my opinion, but other than that it seems idiomatic. I'd be surprised if the project failed because of code quality. I also think it should be easy to become productive in such a project with just a working knowledge of Rust.

  7. Part of the reason for 6 is that the project doesn't use multithreading or concurrency in any way. That's very low-hanging fruit for improving its performance, and a possible source of problems if it turns out their data structures aren't friendly to that kind of paradigm. I can see how moving the network stuff to some kind of event loop would make a difference.

  8. This is ready to use. We could just create a distributed VM with this as a dependency without contributing much.
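
For reference on point 3, here is roughly what a bincode round trip looks like. A minimal sketch using bincode 1.x with serde; the Event type here is hypothetical, for illustration only, not parsec's actual structs:

use serde::{Deserialize, Serialize};

// Hypothetical event type, for illustration only (not parsec's).
#[derive(Serialize, Deserialize, PartialEq, Debug)]
struct Event {
    creator: u64,
    payload: Vec<u8>,
}

fn main() -> Result<(), bincode::Error> {
    let event = Event { creator: 42, payload: vec![1, 2, 3] };

    // bincode emits a compact, Rust-specific layout with no
    // cross-language schema, hence the portability concern above.
    let bytes = bincode::serialize(&event)?;
    let decoded: Event = bincode::deserialize(&bytes)?;
    assert_eq!(event, decoded);
    Ok(())
}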

As is, I see three options:

  1. We fork, extend it with our needs, and contribute back.
  2. We don't fork, but use it as a dependency and start building the VM on top of it.
  3. We start from scratch.

The project itself has two major drawbacks: lack of parallelism (we want a shit ton of transactions per second, so sooner or later we will need parallelism; plus this is an IO-bound problem, and not using concurrency even in a single thread seems like a waste of resources) and lack of stability: it's a six-month-old project implementing a new consensus protocol that, as far as I know, hasn't been properly tested in the field. Both drawbacks apply to options 1 and 2.

Option 1 seems like the worst. We get neither the speed of option 2 nor the control of option 3, and we gain almost nothing. Plus, I'm not really sure there's much for us to do. If the choice were up to me, I'd discard it first.

There's a strong argument for option 2, as it allows us to avoid reinventing the wheel and jump straight into the VM. If it doesn't work out, we can always remake the library to fit our needs; if it does, we won't have lost any time. Plus we'd get to the real crux of the project instead of spending too much time on something that feels like bikeshedding. It's probably the most agile option.

Option 3 has two arguments in its favor: we'd be able to add parallelism from the beginning, and we could just implement Hashgraph, an algorithm that's already tested (and already used in production applications) and for which we have two years of institutional knowledge. It also has a major drawback: we'd spend another X months reimplementing something yet again. Not the best use of time.

Offtopic: Parsec is a horrible name. For the longest time I thought you were talking about the well-known and long-existing parser combinator library https://hackage.haskell.org/package/parsec.

SamuelMarks commented on July 17, 2024

Thanks, useful analysis.

Should add my 2¢:

  • Dropping hashgraph for something similarly performant is a good idea. Hashgraph has some patents that we'd like to avoid.

I'm more interested in a fourth option:

  • We fork, modularise immensely (into new crate repos), and offer it back*
  • Then we add new modules inside new crates, and are thus able to keep up with their oft-changing repositories

*we can start an issue before we begin, offering this and including a chart of the new crate dependency graph.


What do you think?

PS: 😗, I know, right? Haskell's Parsec is stupidly well known; I think even I've used it one time!

AgustinCB commented on July 17, 2024

Ah, I didn't know about the patent thing. That's a point for parsec, then.

That makes sense. If we do that, I'd start by skimming out what we need, parallelising that (as that's now their main drawback) and measuring against lachesis. I'm not sure the project is big enough to modularise further, though; you'd end up with very small crates with little value by themselves (this is a very-early-stage project), but we can revisit that as we start skimming.

Maxime2 commented on July 17, 2024

First of all, we need to understand the difference between the PARSEC protocol and the OPERA chain we are implementing.

Anyway, forking this project would be the same mistake as forking babble into lachesis: you would be restricted by architectural decisions made by other people, likely solving different problems.

The second big question is whether we need to maintain interoperability between the Rust and Go implementations, i.e. should a Go node be able to work with a Rust node?

SamuelMarks commented on July 17, 2024

Yep, also they have a published roadmap.

So, if we're using theirs as a base, we should consider helping them build some of those features.

As for modularising into small crates, my main reason for exploring this avenue is that the parsec code is changing too rapidly for us to keep up with easily. However, if it were made of many small, versioned crates, then we could upgrade less frequently, e.g. every Monday.

Whereas if it were all in one big repository, the chance of stepping on each other's toes increases exponentially.

AgustinCB commented on July 17, 2024

If interoperability is important, the only way to make parsec work for us is to change their serialization (or to implement bincode in Go, which seems like an awful idea). A sketch below shows how cheap a serde backend swap would be.

Fair enough about small crates. We already have problems stepping on each other's toes within the same org; coordinating with another one could be a nightmare.
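
To illustrate the serialization point: with serde, the wire format is mostly a backend swap, so moving off bincode barely touches the call sites. A hypothetical sketch, using serde_json as a stand-in for any portable format (the SyncRequest type is made up for illustration):

use serde::{Deserialize, Serialize};

// Hypothetical sync message, for illustration only.
#[derive(Serialize, Deserialize, Debug)]
struct SyncRequest {
    from: u64,
    to: u64,
}

fn main() -> serde_json::Result<()> {
    let msg = SyncRequest { from: 1, to: 2 };

    // Same derives, different backend: bincode::serialize(&msg) gives
    // the compact Rust-only layout, while serde_json::to_vec gives
    // bytes any language can parse.
    let portable = serde_json::to_vec(&msg)?;
    let decoded: SyncRequest = serde_json::from_slice(&portable)?;
    println!("{:?}", decoded);
    Ok(())
}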

rishflab commented on July 17, 2024

IMO the problem with the parsec code base is that it has not been designed (architecture, data structures) to exploit multiple cores. There are great, easy-to-use Rust libraries that automatically handle resources, data dependencies and parallel dispatch of tasks (https://github.com/slide-rs/shred, https://github.com/slide-rs/specs) which we could take advantage of if we started from scratch.

The main benefit I am seeing is that it seems well tested, but I have no idea how relevant those tests will remain once we start changing the consensus protocol.

In the long term, sending structs over UDP will probably be best for performance and bandwidth. I actually think we should use something like bincode for the time being and then define our own binary format over UDP to allow Go, Rust, Swift, Scala, etc. interoperability (protobuf3 over UDP is also a good option). I have used bincode over UDP before; it is very easy to get started with and one of the most performant solutions.

SamuelMarks commented on July 17, 2024

Leaning towards Rishab's argument here…


Thank you all for your feedback.

Expect to make a decision here in the next 24 hours.

In preparation, continue this debate, and can you each provide an estimate of how long it would take you, personally and exclusively, to deliver:

  • Parsec fork with our Lachesis ideas added
  • New Rust project, implementing Lachesis

🙏

AgustinCB commented on July 17, 2024

I agree with Rishab on the parallelism thing. It's exactly how I feel.

I'm not sure about the UDP point, though. I agree that it performs better, but only in a controlled environment. As soon as you enter an unpredictable environment in which you have to account for congestion control, lost packets, or sending packets to the other side of the ocean rather than the other side of the room, the performance difference becomes mostly irrelevant, in my experience. Every time we thought UDP was a good idea in a previous project, we ended up implementing TCP on top of UDP anyway to account for those problems. I wouldn't make a decision based on the possibility of having to use UDP, to be fair. Not only because of the points above, but also because TCP has to become our bottleneck first, which seems veeery long down the road.

As for estimations:

  • Reimplementing from scratch would likely take me about three months.
  • Implementing with parsec... 1.5 to 2 months, approx.

rishflab commented on July 17, 2024

4 weeks for either option

rishflab commented on July 17, 2024

> I'm not sure about the UDP point, though. I agree that it performs better, but only in a controlled environment. As soon as you enter an unpredictable environment in which you have to account for congestion control, lost packets, or sending packets to the other side of the ocean rather than the other side of the room, the performance difference becomes mostly irrelevant, in my experience. Every time we thought UDP was a good idea in a previous project, we ended up implementing TCP on top of UDP anyway to account for those problems. I wouldn't make a decision based on the possibility of having to use UDP, to be fair. Not only because of the points above, but also because TCP has to become our bottleneck first, which seems veeery long down the road.

Hmm, fair. I don't have much experience with TCP and UDP, and from some brief research it seems you are right about reimplementing TCP on top of UDP to handle congestion issues.

Although there is potential for congestion issues, UDP is very easy to use and get started with. You don't have to worry about extracting structs/messages from byte streams. With bincode you can deserialise the incoming UDP packets into a message tagged union/sum type. You can also easily parallelise the processing of the messages, either with a par_iter() from rayon (see the sketch after the code below) or by using multiple ports.

You could build something like this fairly quickly. It is simple, easy to understand and debug, and can be parallelised to achieve high CPU utilisation.

use bincode::deserialize;
use serde::{Deserialize, Serialize};
use std::net::UdpSocket;

#[derive(Serialize, Deserialize, PartialEq, Debug)]
pub enum Message {
    ClientEventCreationRequest { /* ... */ },
    ParticipantSyncRequest { /* ... */ },
    SomeOtherMessage { /* ... */ },
}

fn main() -> std::io::Result<()> {
    let socket = UdpSocket::bind("127.0.0.1:34254")?;
    let mut buf = [0; 1024];

    loop {
        let mut messages = Vec::new();

        // Drain a batch of datagrams before processing (recv_from blocks).
        for _ in 0..1024 {
            match socket.recv_from(&mut buf) {
                Ok((amt, _src)) => {
                    // Decode only the `amt` bytes actually received.
                    match deserialize::<Message>(&buf[..amt]) {
                        Ok(decoded) => messages.push(decoded),
                        Err(e) => println!("couldn't decode a datagram: {}", e),
                    }
                }
                Err(e) => {
                    println!("couldn't receive a datagram: {}", e);
                }
            }
        }
        // Process/validate received messages
        // Do lachesis consensus stuff
        // Send Messages to other participants
    }
}
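
And for the parallelisation bit, a minimal sketch of fanning that batch out with rayon's par_iter (reusing the Message enum above; the validate function is hypothetical, standing in for whatever per-message work precedes consensus):

use rayon::prelude::*;

// Hypothetical per-message validation, standing in for signature
// checks or other CPU-bound work done before consensus.
fn validate(_msg: &Message) -> bool {
    true
}

fn process(messages: Vec<Message>) -> Vec<Message> {
    // into_par_iter spreads the batch across rayon's thread pool;
    // each message is checked independently, in parallel.
    messages.into_par_iter().filter(validate).collect()
}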

SamuelMarks commented on July 17, 2024

I'm a big fan of rayon. I've seen many of the author's lectures, too.

On the UDP side, don't limit yourself too much. See also µTP, and this comparison on Wikipedia. The multi-homing features of SCTP seem particularly promising. The feature negotiation specification of DCCP is also rather applicable to our purposes, as we will likely end up with differently versioned nodes.

Thanks for the time estimates. @Maxime2, I just need yours. Then if everyone can continue discussing the strengths and weaknesses of each approach, I'll have the best chance of proceeding correctly come tomorrow afternoon.

Maxime2 commented on July 17, 2024

@SamuelMarks I would say 1-2 months for either option, though it's a very rough estimate.

SamuelMarks commented on July 17, 2024

@andrecronje thinks we'll ditch maidsafe and start from scratch.

Just his 2¢, tagging him in case he has more thoughts to contribute.

SamuelMarks commented on July 17, 2024

Conclusion: we're building from scratch.
