pokt-network / smt Goto Github PK

View Code? Open in Web Editor NEW

11.0 11.0 2.0 606 KB

A Go library that implements a Sparse Merkle Trie for a key-value map.

Home Page: https://pkg.go.dev/github.com/pokt-network/smt

License: Other

Go 97.67% Makefile 2.33%

key-value-store trie sparse-merkle-trie sum-tree sparse-merkle-sum-trie sparse-merkle-tree

smt's Introduction

smt

Overview
Documentation
Tests
Benchmarks
Release Tags
- Tagging a new release
- Push and Release

NOTE: Requires Go 1.20.12+

Overview

This is a Go library that implements a Sparse Merkle Trie for a key-value map. The trie implements the same optimisations specified in the Libra whitepaper, to reduce the number of hash operations required per trie operation to $O(k)$ where $k$ is the number of non-empty elements in the trie. And is implemented in a similar way to the JMT whitepaper, with additional features and proof mechanics.

Documentation

Documentation for the different aspects of this library, the trie, proofs and all its different components can be found in the docs directory.

Tests

To run all tests (excluding benchmarks) run the following command:

make test_all

To test the badger submodule that provides a more fully featured key-value store, run the following command:

make test_badger

Benchmarks

To run the full suite of benchmarks simply run the following command:

make benchmark_all

To view pre-ran results of the entire benchmarking suite see benchmarks

Release Tags

You can tag and publish a new release by following the instructions bellow.

Tagging a new release

For a bug fix:

make tag_bug_fix

For a minor release run:

make tag_minor_release

Push and Release

Then, push the tag to the repository:

git push origin v<release>

Create a release on GitHub with the tag and the release notes here.

smt's People

Contributors

Stargazers

Watchers

Forkers

muxxer zorjak

smt's Issues

[Optimisation] Improve proof compaction, serialisation and portability

Objective

Improve the serializaiblity & portability of the 3 proof types.

Goals

Decrease the size of the compacted proofs
Understand the impact of these changes on proof sizes in regular and compacted forms (i.e. benchmarks)
Create optionality to use compacted or non-compacted proofs for different use cases

Deliverables

Determine whether we need to introduce protobufs (or other encoding schemes) or improve the custom compression.
A benchmark suit of tests on regular and compressed proof sizes
- As objects
- As serialised byte slices
Improved compact forms of the proof types
- Improved serialisation & portability of proofs
Improve documentation on proofs, how they work, and the additions made from this ticket

Non-goals / Non-deliverables

Change how proofs are generated, validated, or used
Make proofs insecure by way of unnecessary changes to their structure

General deliverables

Comments: Add/update TODOs and comments alongside the source code so it is easier to follow.
Testing: Add new tests (unit/fuzz/benchmarks) to the test suite.
Makefile: Add new targets to the Makefile to make the new functionality easier to use.
Documentation: Update architectural or development READMEs; use mermaid diagrams where appropriate.

Creator: @h5law
Co-owner: @Olshansk

feat: Add support for `cosmos/ics23` `NonMembershipProof` types

Objective

Add support for the "range based" non-membership proof types used in the cosmos/ics23 library.

Origin Document

cosmos/ics23 issue #152

Goals

Add a new ProveExclusion function that can traverse the tree and looking for the key we aim to prove, once it has found a leaf it should collect the adjecent leaves and produce a proof for them.
- Reference the JMT's implementation
Add a function to convert SparseMerkleProof objects into the relevent ics23.CommitmentProof types
- See: pocket/ibc/store/proofs_ics23.go

Deliverable

SMT proof type conversion function to ics23 proof type
Prove nonmembership in accordance with the ics23 repo

Non-goals / Non-deliverables

Change existing proof logic

General issue deliverables

Update any relevant README(s)
Add or update any relevant or supporting mermaid diagrams

Testing Methodology

Task specific tests or benchmarks: go test ...
New tests or benchmarks: go test ...
All tests: go test -v

Creator: @h5law
Co-Owners: @h5law

[KVStore] Implement RocksDB as a KVStore submodule

Objective

Implement a submodule in the kvstore directory for a RocksDB

Look into IOTA's rocksdb and see whether this can be used. The repo
Look into Pebble by CockroachDB a Go implementation based on RocksDB they use at CockroachDB
Look into grocksdb which is a wrapper in Go for RocksDB being actively maintained
Look into gorocksdb developed by cosmos as their Go wrapper for RocksDB

Origin Document

RocksDB is known for its speed and effiiency, having it not only as a nodestore backing the SMT would improve performance in prod but also a wrapper around RocksDB that exposes extra methods would be highly sought after.

Goals

Find an appropriate library to base the submodule on
Create a submodule and expose an interface for easy use
Ensure it conforms to MapStore and can be used with the SMT
Add documentation

Deliverables

Create a RocksDB submodule with wrapping functions to provide extra capabilities
Add extensive testing
Add documentation
Add runnable examples
Expose a RocksDBKVStore interface that conforms to the MapStore interface
Add documentation on how it can be used with the SMT
Add runnable examples

Non-goals / Non-deliverables

Change the existing MapStore interface
Change any existing submodules
Rewrite RocksDB - use an existing implementation

General deliverables

Comments: Add/update TODOs and comments alongside the source code so it is easier to follow.
Testing: Add new tests (unit/fuzz/benchmarks) to the test suite.
Makefile: Add new targets to the Makefile to make the new functionality easier to use.
Documentation: Update architectural or development READMEs; use mermaid diagrams where appropriate.

Creator: @h5law

[Enhancement] Research, Benchmark and Implement `k`-ary trie types

Objective

The SMT currently is a binary trie. This issue aims to introduce the ability to customise the number of child nodes.

In doing so we will introduce new constants that are the optimum number of child nodes for different underlying databases, and add the ability to set the number of child nodes during the creation of an SMT.

The logic to determine the correct path bit should also be exposed and altered for supporting k number of children.

Origin Document

github comment

Goals

Research k-ary trees/tries
Add functionality to customise the number of child nodes

Deliverable

Create k-ary trie benchmarking suite
Add logic supporting k children for an inner node
Benchmark different values of k for different databases
Expose constants according to the optimal k values found for different databases

Non-goals / Non-deliverables

Alter any existing logic outside the scope of supporting k number of children

General issue deliverables

Update any relevant README(s)
Add or update any relevant or supporting mermaid diagrams

Testing Methodology

Task specific tests or benchmarks: go test ...
New tests or benchmarks: go test ...
All tests: go test -v

Creator: @h5law
Co-Owners: @Olshansk

[Enhancement] Integrate the DeepSubTrie for the SM(S)T

Objective

Following the merge of PR #6 the DeepSparseMerkleSubTree type was removed. This issue aims to re-implement/add a new DeepSubTrie, now using the lazy loaded SMT as its base, for it to work with both the SMT and SMST.

The functionality should be close to the original deepsubtree implementation, instead now utilising the lazily loaded SMT.

This would enable batch updates to the trie - given leaves with a shared prefix an entire DST could be inserted at once

Origin Document

Original lazy loading PR

Goals

Integrate the previous deep subtree implementation with the new lazy loaded SMT
Ensure the DeepSubTrie works for both the SMT and SMST
Enable batch updates/deletions

Deliverable

Implement the DeepSubTrie functionality
Add unit tests and runnable examples
Add documentation on the DeepSubTrie and how it works
Add the methods required for batch updates and deletions

Non-goals / Non-deliverables

Break the SM(S)T in terms of backwards compatibility
Change the existing insertion/deletion logic (for a single key-value)

General issue deliverables

Update any relevant README(s)
Add or update any relevant or supporting mermaid diagrams

Testing Methodology

Task specific tests or benchmarks: go test ...
New tests or benchmarks: go test ...
All tests: go test -v

Creator: @h5law

[Testing] Enhance overall testing in the repo

Objective

This sub-issue encompasses the changes needed to improve the overall testing in the repo. This should include:

Improved fuzzing
Benchmarking
Improvements to the current unit testing

Origin Document

Issue #7
This comment

Goals

Improve the fuzzing tests
- Integrate with oss-fuzz for continuous fuzzing and integrate CI fuzzing tooling as appropriate
Add benchmarking tests
- Using numerous different inputs, benchmark individual components and grouped features of the SMT
Improve current unit tests
- Where it is appropriate cleanup, and improve the current test suite

Deliverable

Improve current fuzzing tests to fuzz numerous aspects of the SMT and its functionality
Integrate with oss-fuzz for continuous fuzzing
Add a CI workflow to be used on new PRs related to fuzzing
Add benchmarking tests for components and feature flows of the SMT
Improve current test suite

Non-goals / Non-deliverables

Change any underlying functionality of the SMT

General issue deliverables

Update any relevant README(s)
Add or update any relevant or supporting mermaid diagrams

Testing Methodology

Task specific tests or benchmarks: go test ...
New tests or benchmarks: go test ...
All tests: go test -v

Creator: @h5law
Co-Owners: @h5law

[Enhancement] Add a standardised encoding scheme for the SM(S)T and its proofs

Objective

This issue tracks the addition of a standard encoding scheme to serialise the SM(S)T itself and its proofs.

Origin Document

Encoding proofs is tracked in PR #21 however a more standardised approach to encoding not only the proofs but also the tree itself should be undertaken.

Goals

Decide upon an encoding scheme (gob, json, protobuf, borsh)
Implement the relevant helpers, functions and data types required to serialise and deserialise the tree and proofs deterministically

Deliverable

Deterministic serialisation and deserialisation of proofs, and entire trees

Non-goals / Non-deliverables

Change any underlying logic in the SM(S)T and proof mechanics

General issue deliverables

Update any relevant README(s)
Add or update any relevant or supporting mermaid diagrams

Testing Methodology

Task specific tests or benchmarks: go test ...
New tests or benchmarks: go test ...
All tests: go test -v

Creator: @h5law
Co-Owners: @h5law

[Scalability Design Document] Design document and plan to test SMT scalability

Objective

Put together a design document and a plan to determine the scalability, efficiency and required improvements we need to make to the SMT to make v1 as scalable as possible.

Goals

Establish the limits of our current SMT (read, write, storage)
Determine the limits for leaf sizes to accommodate things like websockets
Explore alternative key-value stores to badgerdb and compare insert/retrieval speeds
Investigate the potential benefits of extending the prefix trie from a binary tree to a k-ary tree
Understand the implications of proof size on data availability costs and estimate our costs

Deliverable

NOTE: The deliverables below are just a starting point

A design document / plan that'll address the following in different steps:

Benchmarking data on the current SMT's insert/retrieval speeds and proof sizes. For example:
- Can we insert 0.5/1/10 million leaves in 10/60/200 seconds?
- What are the retrieval speeds depending on the tree size?
A report on the feasible leaf sizes for different protocols.
- If we plan to support larger unhashed values (e.g. for websockets), is there a limit on the leaf size?
- Do we need to split it into multiple smaller leaves as data is streamed?
Comparative analysis of different key-value stores alongside badgerdb, focusing on insert/retrieval speeds.
- Could/should we add adapters for leveldb, gorocksdb, boltdb or others and rerun the benchmarks above?
Research and consideration if we should look into implementing k-ary support instead of just binary support in this tree?
Detailed breakdown of proof sizes and strategies to map this onto data availability costs.
- How big are proofs for a tree of 0.5/1/10 million leaves?
- How long should our sessions be based on expected volume per servicer per session?
- What will the data availability cost on Celestia or other DA layers?
d-js.github.io/mermaid/) diagrams

Creator: @Olshansk
Co-Owners: @h5law

[KVStore] Improve the `badger` submodule

Objective

Improve the badger submodule to make it more attractive to potential users

Goals

Improve the documentation
Improve the BadgerKVStore interface and wrapper in general
Research other KVStore implementations for desired features

Deliverables

Add runnable examples in the codebase
Add more features to the badger submodule's KVStore
Expose more of Badger v4's own features

Non-goals / Non-deliverables

Break compatibility with the MapStore interface
Disable its usage as a nodestore for the SMT

General deliverables

Comments: Add/update TODOs and comments alongside the source code so it is easier to follow.
Testing: Add new tests (unit/fuzz/benchmarks) to the test suite.
Makefile: Add new targets to the Makefile to make the new functionality easier to use.
Documentation: Update architectural or development READMEs; use mermaid diagrams where appropriate.

Creator: @h5law

[Enhancement] Add Merkle Sum Tree functionality

Objective

Merkle Sum Trees in addition to storing the hashes of nodes, also store a sum. This sum is used in conjuction with the node's hash to produce the hash of its parent nodes. A Merkle Sum Tree allows for the computation of the sum of the entire tree.

Adding the functionality of a merkle sum tree to this SMT library will increase the number of areas in which it can be used and the number of purposes for which it can be used.

The implementation of the SMST (Sparse Merkle Sum Tree) should be in accordance with the following documents:

Origin Document

This pokt-network PR

Goals

Implement a Sparse Merkle Sum Tree
Add unit tests, fuzz tests and benchmarks
Add documentation explaining the SMST and how it works

Deliverable

Implement the Sparse Merkle Sum Tree
Enable the computation of the tree's sum
Enable feature parity with the existing SMT
Add unit testing for the SMST
Add fuzz tests for the SMST
Add benchmarks for the SMST
Integrate tests/fuzzes with existing CI workflows

Non-goals / Non-deliverables

Change any existing logic related to the SMT functionality

General issue deliverables

Update any relevant README(s)
Add or update any relevant or supporting mermaid diagrams

Testing Methodology

Task specific tests or benchmarks: go test ...
New tests or benchmarks: go test ...
All tests: go test -v

Creator: @h5law
Co-Owners: @h5law @Olshansk

[Code Health] Improve the documentation, testing and overall health of the repo

Objective

In its current state, this repository can do with some improvements to help in its maintainability and further development.

This means increasing the amount of documentation, and tests (local and CI), as well as fixing the issues related to fuzzing.

Origin Document

Goals

Improve the documentation in the repo
Improve the testing framework
Fix fuzzing
Add benchmarks

Deliverable

Improve the documentation in the repo
- More descriptive README.md file, with a description of the project as well as examples
- Improve Godoc comments in the source code
- Add any further documentation as is necessary
Improve the testing framework
- Ensure maximum code coverage with unit tests
- Integrate the repo with Codecov (as it previously was)
- Setup CI workflows to test the repo
Fix fuzzing
- Look into oss-fuzz and fix its current implementation into the pokt-network/smt repo
- Integrate fuzzing into the local test suite
- Look into merging this PR which could integrate oss-fuzz into our CI workflows
Add benchmarks
- Add some benchmark tests that can be used for future performance enhancements to compare against

Non-goals / Non-deliverables

Alter any logic relating to the SMT and its functionality outside of the realm of code cleanup

General issue deliverables

Update any relevant README(s)
Add or update any relevant or supporting mermaid diagrams

Testing Methodology

Task specific tests or benchmarks: go test ...
New tests or benchmarks: go test ...
All tests: go test -v

Creator: @h5law
Co-Owners: @h5law