
Thousand Validators Program Backend

👋 Introduction

The Thousand Validators Programme is an initiative by Web3 Foundation and Parity Technologies to use the funds held by both organizations to nominate validators in the community.

How it Works

The nominating backend changes its nominations every era. It does this by short-listing candidates by validity and then sorting them by their weighted score in descending order. Validators with a higher weighted score are selected for any open slots. As validators are nominated and actively validate, their weighted scores decrease, allowing other validators to be selected in subsequent rounds of assessment.

If a validator is active during a single nomination period (the time after a new nomination and before the next one) and does not break any of the requirements, its rank increases by 1. Validators with higher rank have performed well within the program for a longer period of time. The backend nominates as many validators as it reasonably can, in such a manner as to give each nominee an opportunity to be elected into the active set.
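
A minimal sketch of that selection pass, assuming hypothetical types and field names (Candidate, valid, and weightedScore are illustrative, not the repo's actual schema):

// Per-era selection: short-list by validity, sort by weighted score, fill slots.
interface Candidate {
  stash: string;
  valid: boolean;        // passed all program requirements
  weightedScore: number; // decreases while a validator is actively nominated
}

function selectNominees(candidates: Candidate[], slots: number): Candidate[] {
  return candidates
    .filter((c) => c.valid)
    .sort((a, b) => b.weightedScore - a.weightedScore)
    .slice(0, slots);
}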

How to Apply

This Repo

A monorepo containing TypeScript microservices for the Thousand Validators Program. Each package is a microservice that can be run independently or together with the other microservices.

The monorepo is managed using Yarn workspaces, and contains the following packages:

  • packages/common: A package containing common code shared across all microservices.
  • packages/core: A package containing the core logic of the Thousand Validators Program.
  • packages/gateway: A package for an API gateway that exposes the backend with a REST API.
  • packages/telemetry: A package for a telemetry client that monitors uptime.
  • packages/worker: A package for job queue workers that perform background tasks.

Installation & Setup

Instances

There are a few ways of running the backend with Docker containers: either in Kubernetes, or with docker-compose.

There is the Current / Monolith way of running instances, and the Microservice way of running instances.

Current / Monolith Architecture: [architecture diagram]

Microservice Architecture: [architecture diagram]

The following are different ways of running in either Current or Microservice architecture with either Kusama or Polkadot, and either Development or Production:

  • Kusama Current
    • Running as a monolith with production values
  • Polkadot Current
    • Running as a monolith with production values
  • Kusama Microservice
    • Running as microservices with production values
  • Polkadot Microservice
    • Running as microservices with production values
  • Polkadot Current Dev
    • Running as a monolith with development values
  • Kusama Current Dev
    • Running as a monolith with development values
  • Kusama Microservice Dev
    • Running as microservices with development values
  • Polkadot Microservice Dev
    • Running as microservices with development values

Each package contains a Dockerfile, used for running in production, and a Dockerfile-dev, used for development. The development images run with nodemon, so each time a file is saved or changed the image is rebuilt and the container restarts. Any changes to the regular Dockerfile require manually rebuilding the Docker image.

The difference between running as Current or Microservice is in which Docker containers get run with docker-compose (the Microservice setup separates services into their own containers, and additionally relies on Redis for message queues). Outside of this, everything else (such as whether it runs as a Kusama or Polkadot instance) is determined by the JSON configuration files that get generated.
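
As a rough illustration, a generated config might look like the following (a hypothetical sketch: global.test appears in the code quoted later in this document, but the other field names and values are assumptions, not the real schema):

{
  "global": {
    "test": false,
    "networkPrefix": 2
  },
  "db": {
    "mongo": { "uri": "mongodb://mongo:27017" }
  },
  "server": { "port": 3300 }
}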

Cloning the Repository

git clone https://github.com/w3f/1k-validators-be.git
cd 1k-validators-be

Installing System Dependencies

Ensure the following are installed on your machine:

Yarn Installation & Docker Scripts (All in One)

The following yarn scripts run all the required installations and config generation, then build and run the Docker containers. If these are not used, you will need to do each step separately as described in the following sections. First run:

yarn install

Kusama Current / Monolith Production:

yarn docker:kusama-current:start

Kusama Current / Monolith Dev:

yarn docker:kusama-current-dev:start

Polkadot Current / Monolith Production:

yarn docker:polkadot-current:start

Polkadot Current / Monolith Dev:

yarn docker:polkadot-current-dev:start

Kusama Microservice Production:

yarn docker:kusama-microservice:start

Kusama Microservice Dev:

yarn docker:kusama-microservice-dev:start

Polkadot Microservice Production:

yarn docker:polkadot-microservice:start

Polkadot Microservice Dev:

yarn docker:polkadot-microservice-dev:start

Install Yarn Dependencies

yarn install

Building Node Packages

yarn build

Creating Configuration Files

Before running the microservices with docker-compose, you must create configuration files for each service to be run.

Kusama Current Config: This will create a configuration file for a Kusama instance that mirrors what is currently deployed. This runs the core service within one container, and includes the gateway and telemetry client, all running in the same Node.js process.

yarn create-config-kusama-current

Polkadot Current Config: This will create a configuration file for a Polkadot instance that mirrors what is currently deployed. This runs the core service within one container, and includes the gateway and telemetry client, all running in the same Node.js process.

yarn create-config-polkadot-current

Kusama Microservice Config: This will create configuration files for a Kusama instance for each microservice that runs with production values. This runs core, gateway, telemetry, and worker as separate processes in their own containers - each one needs its own configuration file.

yarn create-config-kusama-microservice

Polkadot Microservice Config: This will create configuration files for a Polkadot instance for each microservice that runs with production values. This runs core, gateway, telemetry, and worker as separate processes in their own containers - each one needs its own configuration file.

yarn create-config-polkadot-microservice

Running the Microservices

Running Kusama Current or Polkadot Current:

Both run from the same docker-compose.current.yml file, which runs only the core, mongo, and mongo-express containers.

Build and run as detached daemon:

docker compose -f docker-compose.current.yml up -d --build

Running Kusama Microservice or Polkadot Microservice:

Both run from the same docker-compose.microservice.yml file. This runs core, gateway, telemetry, and worker as separate processes in their own containers - each one needs its own configuration file. It additionally runs redis, mongo, and mongo-express containers.

Build and run as detached daemon:

docker compose -f docker-compose.microservice.yml up -d --build

Running Kusama Current Dev, Polkadot Current Dev, Kusama Microservice Dev, or Polkadot Microservice Dev

All four run from the same docker-compose.yml file.

Build and run as detached daemon:

docker compose -f docker-compose.yml up -d --build

Viewing Logs

To view the aggregated logs of all the containers:

yarn docker:logs

or

docker compose logs -f       

To view the logs of an individual service:

Core:

yarn docker:logs:core

or

docker logs 1k-validators-be-1kv-core-1 -f

Gateway:

yarn docker:logs:gateway

or

docker logs 1k-validators-be-1kv-gateway-1 -f   

Telemetry:

yarn docker:logs:telemetry

or

docker logs 1k-validators-be-1kv-telemetry-1 -f  

Worker:

yarn docker:logs:worker

or

docker logs 1k-validators-be-1kv-worker-1 -f  

Stopping Containers

To stop all containers:

yarn docker:stop

or

docker compose down

Express REST API

When running as a monolith, the Express REST API is exposed on port 3300. When running as microservices, the Express REST API is exposed on port 3301.

You can then query an endpoint like /candidates by going to http://localhost:3300/candidates or http://localhost:3301/candidates in your browser.
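
A quick programmatic check of the same endpoint, assuming Node.js 18+ (which has a built-in fetch) and the monolith port:

// Fetch the candidate list from a local monolith instance (port 3300;
// use 3301 when running as microservices).
const res = await fetch("http://localhost:3300/candidates");
const candidates = await res.json();
console.log(`backend is tracking ${candidates.length} candidates`);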

Mongo Express (Database GUI)

To view the Mongo Express GUI to interact with the MongoDB Database, go to http://localhost:8888/ in your browser. Or run yarn open:mongo-express from the root directory.

BullMQ Board (Job Queue GUI)

To view the BullMQ Board GUI to interact with the Job Queue, go to http://localhost:3301/bull in your browser if running as microservices. Or run yarn open:bull from the root directory.

📝 Contribute

💡 Help


1k-validators-be's Issues

Make the nominating states persistent across restarts

The service will be occasionally or routinely restarted in order to add more validators to the configuration. This means that all state that is held in the program will be lost unless it's persisted in the database. Currently, we only persist node data in the database and keep nominator state in memory. We should add additional methods on the database to allow for saving nominator data.
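
A sketch of what such methods might look like, with hypothetical names and a MongoDB-style upsert (the real Db class and schema may differ):

// Persist nominator state so it survives restarts (illustrative only).
interface NominatorState {
  address: string;          // the nominator's account address
  currentTargets: string[]; // stashes nominated in the current round
  lastNomination: number;   // unix-ms timestamp of the last nomination
}

class Db {
  private nominators: any; // e.g. a MongoDB collection

  async setNominatorState(state: NominatorState): Promise<void> {
    // Upsert keyed by address so a restart picks up where we left off.
    await this.nominators.updateOne(
      { address: state.address },
      { $set: state },
      { upsert: true },
    );
  }

  async getNominatorState(address: string): Promise<NominatorState | null> {
    return this.nominators.findOne({ address });
  }
}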

Review

  1. scorekeeper.ts
    nodes = nodes.filter((node: any) => node.offlineAccumulated / WEEK <= 0.02);
    The 0.02 uptime threshold should be replaced with a named constant (e.g. UP_TIME).

  2. index.ts & scorekeeper.ts

      const scorekeeperFrequency = Config.global.test? '0 0-59/3 * * * *' : '0 0 0 * * *';
      scorekeeper.begin(scorekeeperFrequency);

The cron job will re-run every day, and the nomination transaction logic in nominator.ts could fail due to the RPC node getting stuck, the transaction failing, or something like that.
Consider a situation where we have 5 validators we would like to nominate (A, B, C, D, E):
A - Success
B - Fail
C - Success
How could we handle validator B's nomination? (Based on the current behaviour, the validator might need to wait 1 day.) Suggest adding logic to handle that.
Also

      await nominator.nominate(toNominate);
      this.db.newTargets(nominator.address, toNominate); 
  this.db.newTargets(nominator.address, toNominate);   <--- this should only be updated when the nomination call succeeds.
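
A sketch of the suggested fix, assuming nominate can be made to return whether the transaction succeeded (it may not do so today), and a hypothetical logger:

// Only record new targets once the nomination call actually succeeded.
const succeeded = await nominator.nominate(toNominate);
if (succeeded) {
  this.db.newTargets(nominator.address, toNominate);
} else {
  // Retry sooner than the next daily cron run, or flag for manual review.
  logger.warn(`nomination failed for ${nominator.address}`);
}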

To make the points more useful, it would be great to design the game like:

  1. A basic nomination amount (say 3,000 KSM) at the beginning.
  2. As a validator's uptime stays consistently stable (say > 99%), increase the nomination by 5%.
  3. If the validator is not stable, reduce the nomination amount by a certain percentage.

A better design would also consider era points, which would ensure the validator has done some actual work.
The above design would require multiple accounts holding different amounts, since we cannot change the amount immediately as we want. So it would be like:
Basic nomination amount: 20 addresses containing 3,000 KSM each
Medium nomination amount: 20 addresses containing 6,000 KSM each
and so on.

Add round records and the round server endpoint

Right now all the data is exposed via the /nodes and /nominators endpoints. It would be helpful to also expose a /rounds endpoint with historical data on nominators, their targets, and whether those targets ended up performing well or poorly. It should expose this for all prior rounds.
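
A record for one round might be shaped like this (a hypothetical sketch of the proposal, not an existing schema):

// One document per completed round (illustrative only).
interface RoundRecord {
  round: number;       // monotonically increasing round index
  startEra: number;
  endEra: number;
  nominators: {
    address: string;
    targets: string[]; // stashes nominated this round
  }[];
  outcomes: {
    target: string;
    active: boolean;   // was it elected into the active set
    faults: number;    // faults accumulated during the round
  }[];
}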

Add maximum accounts per identity

In order to limit a single identity to running only a specified number of validator candidates in the program, we need a new parameter and a check ensuring that at most a maximum number of candidates with the same identity (including sub-identities) are registered in the program.
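
A sketch of such a check, with hypothetical names (identity here means the parent identity, so sub-identities count toward the same bucket):

// Return identities that exceed the allowed number of candidates.
function identitiesOverLimit(
  candidates: { stash: string; identity: string }[],
  maxPerIdentity: number,
): string[] {
  const counts = new Map<string, number>();
  for (const c of candidates) {
    counts.set(c.identity, (counts.get(c.identity) ?? 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, count]) => count > maxPerIdentity)
    .map(([identity]) => identity);
}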

clients aren't being tracked properly when updated

When a client updates, it isn't registered as running the latest version, and the backend requires a restart. The node registration logic is probably missing the field that updates the client version.

Create FaultEvents

Right now validators will accumulate faults; however, there aren't many clear indicators as to what those faults were for.

It would be nice to have an endpoint to query with the validator, time, and fault reason.

This can be listed under an individual candidate endpoint ideally.

related: #460

Monitor offline reports

Alert the Riot room when a validator is reported offline. Decide on the tolerance of offline reports before docking the validator's rank.

Candidate Endpoint Improvements

  • Remove SentryId, SentryOnlineSince, SentryOfflineSince
  • Include an array of InvalidityReasons

For Polkadot:

  • Include Kusama 1kv Address

Docs

Add some documentation to explain how everything works.

Fix ranks inconsistently updating

Ranks update inconsistently on their own. As a stopgap, retroactive ranks have been introduced. Retroactive ranks should be removed and regular rank increases should be fixed.

Telemetry connection never reconnects and leads to offline accumulated miscalculation

Transcript from Riot:

@will My validator has been up continuously since 9th May 2am BST but nominations have been inconsistent as of late i.e. on 1 day, then off 1 day, then on 1 day, then off 4 days, then on 3 days, then off since a day ago.
So I've dug through https://github.com/w3f/1k-validators-be, queried the backend URL mentioned above, and can see a seemingly erroneous "offlineAccumulated" value of 76104897 (ms), and then the text in "/invalid" that my node has been offline 1268 minutes this week.
As checkSingleCandidate() imposes a 98% weekly uptime requirement, this would explain why nominations were apparently pulled yesterday, and perhaps some of the other occasions as well.
As there happen to be 18 other nodes who also appear in the "/invalid" list, all with 1268 minutes of offline time this week, this would appear to be a problem on the 1K backend side.
i.e.

$ curl -s 'https://otv-backend.w3f.community/invalid'|grep 'has been offline 1268\.' -c
19

I did notice this in the validator logs:
2020-05-14 12:15:40 ⚠️ Disconnected from /dns4/telemetry-backend.w3f.community/tcp/443/x-parity-wss/%2Fsubmit: Sink(Custom { kind: Other, error: B(Custom { kind: Other, error: Io(Os { code: 104, kind: ConnectionReset, message: "Connection reset by peer" }) }) })
2020-05-14 12:22:30 🔖 Pre-sealed block for proposal at 2304220. Hash now 0x37bf872e064f3a1523dce3390a50c4a93256697106215d3c860a896ffc436b95, previously 0x4b645e02cc97fab6db108603d2c7bcff2a802405fe939e1
Also running lsof on the validator process I don't see any connections to the w3f telemetry server, so it looks like once the telemetry connection goes down, polkadot never tries to bring it up again.
shadewolf
@will Obviously it is up to W3F and Parity as to how they nominate their stake but in this instance I would suggest they consider resetting the weekly offline accumulated time of affected validators.
sebytza05
shadewolf: yup, I found too in the validator logs 2020-05-14 13:15:40 ⚠️ Disconnected from /dns4/telemetry-backend.w3f.community/tcp/443/x-parity-wss/%2Fsubmit%2F: Sink(Custom { kind: Other, error: B(Custom { kind: Other, error: Io(Os { code: 104, kind: ConnectionReset, message: "Connection reset by peer" }) }) })
offlineSince and "has been offline n minutes this week" are so useless right now

Summary

It looks like the problem is as mentioned above: telemetry is kicking validators off, and polkadot does not reconnect.

Track if rewards are getting paid out

We need a tool to track if rewards are being paid out by all validators which are nominated as part of this programme.

We should probably expect validators to handle calling their own payouts so if we detect that a validator does not do this we can give a warning.

TypeError: is not a function when running yarn docker

Not long after running yarn docker, I get the following error:

1kv_1         | (node:28) UnhandledPromiseRejectionWarning: TypeError: this.api.query.staking.activeEra is not a function
1kv_1         |     at ChainData.<anonymous> (/code/src/chaindata.ts:13:52)
1kv_1         |     at Generator.next (<anonymous>)
1kv_1         |     at /code/src/chaindata.ts:8:71
1kv_1         |     at new Promise (<anonymous>)
1kv_1         |     at __awaiter (/code/src/chaindata.ts:4:12)
1kv_1         |     at ChainData.getActiveEraIndex (/code/src/chaindata.ts:12:51)
1kv_1         |     at ScoreKeeper.<anonymous> (/code/src/scorekeeper.ts:198:27)
1kv_1         |     at Generator.next (<anonymous>)
1kv_1         |     at /code/src/scorekeeper.ts:8:71
1kv_1         |     at new Promise (<anonymous>)
1kv_1         | (node:28) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 2)
1kv_1         | (node:28) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

This is with:
node v13.10.1 (npm v6.14.2),
yarn 1.22.0

This is on a machine running ubuntu 19.10 with a fresh clone of this repo, and ranks don't end up increasing.

Strangely I don't get the error while on another machine (also ubuntu 19.10), with similar versions of node and yarn. Not really sure what to make of it.

The ensureUpgrades procedure might be broken

Hi, I am running a validator based on the latest client code, which is 0.8.24-5cbc418a-x86_64-linux-gnu right now. However, I keep getting the "xxx is not running the latest client code" message.
I think the reason is that the networkId (i.e., Sentry Node Network ID) of these validators is null, so the ensureUpgrades process bypasses them:

const nodes = await this.db.allNodes();

allCandidates and allNodes should be the same lists now, since the sentry node is no longer required:

async allNodes(): Promise<any[]> {

False nominations?

Hi,

In my understanding, only valid candidates should be nominated by the system. However, some invalid candidates have also been nominated. The following data was retrieved from the /nominators endpoint in era 1616 and might show false nominations. Here is an API to check for false nominations: https://onekv.herokuapp.com/falseNominations

[
  {
    "stash": "HhcrzHdB5iBx823XNfBUukjj4TUGzS9oXS8brwLm4ovMuVp",
    "name": "KIRA Staking",
    "elected": false,
    "nominatorAddress": "5C8ZU7zugMubgENdcyiZouHcVYSoeWbF8TpXSdWjStzYbFZW",
    "reason": "KIRA Staking has an identity but is not verified by registrar."
  },
  {
    "stash": "EtJ4HxHYEDvYWRJAdmV4hYpTbGMJCmEgnLC8zAf6u5ZyT7C",
    "name": "WolfEdge-Capital",
    "elected": false,
    "nominatorAddress": "5C8ZU7zugMubgENdcyiZouHcVYSoeWbF8TpXSdWjStzYbFZW",
    "reason": "WolfEdge-Capital does not have an identity set."
  },
  {
    "stash": "Dcw5vVBmon1PCERJXkYLvvMVmAE8xdqytUwNQLE8p1Hm33J",
    "name": "robonomics_team-01",
    "elected": false,
    "nominatorAddress": "5DZN69GLFZbm7cF65QBSHC7Ndeqwgjsq7XptnvYbSHHxe7aa",
    "reason": "robonomics_team-01 has an identity but is not verified by registrar."
  },
  {
    "stash": "J7Z1bxUB7qhxjqT5js6yAkCZoU1VYNxPvTdg9mtyNNbU845",
    "name": "Cube3-KSM-Val1-ValidatorA",
    "elected": false,
    "nominatorAddress": "5DZN69GLFZbm7cF65QBSHC7Ndeqwgjsq7XptnvYbSHHxe7aa",
    "reason": "Cube3-KSM-Val1-ValidatorA does not have an identity set."
  },
  {
    "stash": "CgpV58FSvuzGmfZXfiAQfkdDMVcFtpMq91ahk2zNYZdjdR9",
    "name": "LunaNova-KSM-Val1-ValidatorA",
    "elected": false,
    "nominatorAddress": "5GgyyiDPHNKSoE2sWCn5dMuJmAgoXnM4dzrmPmBucakiqPYh",
    "reason": "LunaNova-KSM-Val1-ValidatorA does not have an identity set."
  },
  {
    "stash": "FrQ4W8Bo6wgXzkaGHLzVFSsfbWWHvqGGNP1YkRmTPSkN17J",
    "name": "otter-sv-validator-1",
    "elected": false,
    "nominatorAddress": "5GgyyiDPHNKSoE2sWCn5dMuJmAgoXnM4dzrmPmBucakiqPYh",
    "reason": "otter-sv-validator-1 offline. Offline since 0."
  },
  {
    "stash": "HRYTEruAjwDD46kkgaTYpGHQC6uea3AkeLJg4iterSmmjo2",
    "name": "Tornado-V1",
    "elected": false,
    "nominatorAddress": "5GgyyiDPHNKSoE2sWCn5dMuJmAgoXnM4dzrmPmBucakiqPYh",
    "reason": "Tornado-V1 offline. Offline since 0."
  },
  {
    "stash": "DAexrmQxJ8TKiqpcU2QSn2QiGppGCpWZkJ9p7Nyhm7DW6nB",
    "name": "liberty-sv-validator-0",
    "elected": false,
    "nominatorAddress": "5GgyyiDPHNKSoE2sWCn5dMuJmAgoXnM4dzrmPmBucakiqPYh",
    "reason": "liberty-sv-validator-0 does not have an identity set."
  }
]

Revise Nominations to Efficiently Distribute Stake

At the moment a lot of the nominator accounts distribute stake unevenly.

_doNominations should be revised so that each nominator account will nominate (account_balance / lowest_staked_validator * 1.05) candidates.
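
A worked sketch, reading the formula as balance divided by 1.05x the lowest-staked validator, so each target ends up backed by roughly 5% more than the current minimum (the precedence is an assumption; the issue leaves it ambiguous):

// How many candidates one nominator account should back (illustrative).
function targetCount(accountBalance: number, lowestStakedValidator: number): number {
  return Math.floor(accountBalance / (lowestStakedValidator * 1.05));
}

// e.g. a nominator holding 10,000 KSM, with the lowest-staked validator at
// 400 KSM, would nominate floor(10000 / 420) = 23 candidates.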

Abstract the constraints to be more modular

Right now the backend is pretty specific to the 1k-v use case; however, if we abstracted the validator requirements into their own constraints.js file and allowed this to be passed in as an option, the backend could be used by other nominator services.
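
A sketch of what that abstraction could look like (a hypothetical interface; the repo's actual constraints differ):

// Pluggable validity rules, passed into the backend as an option.
const WEEK = 7 * 24 * 60 * 60 * 1000; // ms per week, as in the uptime check quoted earlier

interface Constraints {
  // Return a human-readable reason if invalid, otherwise null.
  checkCandidate(candidate: { offlineAccumulated: number }): Promise<string | null>;
}

const oneKvConstraints: Constraints = {
  async checkCandidate(candidate) {
    if (candidate.offlineAccumulated / WEEK > 0.02) {
      return "more than 2% accumulated downtime this week";
    }
    return null;
  },
};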

Rank Reform

Every 4 eras on Kusama and every era on Polkadot (i.e., an eraPeriod, or floor(currentEra / 4)), we should record historical Rank events indicating that an address has gone up a rank for that period of time.

It may look something like the following:

(say the current era is 1000)

{
  "address": "<address>",
  "eraPeriod": 250,
  "erasActive": [996, 997, 999],
  "newRank": 27
}

This would make it easier to keep track of previous events, and also to compensate for times when the backend misses incrementing a rank: it can backfill previously missed ranks by looking at the last rank event.
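
A sketch of the period bookkeeping and backfill idea (hypothetical names):

// era 1000 with 4 eras per period -> period 250, matching the example above.
function eraPeriod(currentEra: number, erasPerPeriod = 4): number {
  return Math.floor(currentEra / erasPerPeriod);
}

// Backfill: if the last stored rank event is for period 248 and we are now in
// period 250, emit events for periods 249 and 250 from the recorded era activity.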

Fix Docker-Compose setup

Right now the docker-compose setup for testing things locally doesn't quite work. The docker images are a bit out of date.

These should be updated, and also the telemetry frontend should be added as well for double checking things. Perhaps it might also be helpful to include another node or two.

https://github.com/wpank/polkadot-local-network/tree/master/scripts/testing

One thing I've also added with a similar approach in the above repo is a set of scripts that operate on the docker containers. Having some of these might be a good way to test things out as well.

Add endpoint to fetch individual candidates

In creating a details page, it would be nice to have an endpoint, queryable by address, that returns only the individual candidate's data.

So something like /candidates/<validator_address>.

Update README

The README should be updated with any new information and the differences between the Kusama program and the Polkadot program.

Dockerize the "fast substrate" executable

Currently we use a custom-built "fast substrate" to do the testing. Ideally we could use a mocked substrate, but that's probably a whole project in itself.

The least we can do is dockerize the fast substrate so that tests can run reliably on different architectures and in CI.

batch API calls

All API calls should be batched in order to reduce the number of "over the air" calls we make, as well as to reduce the room for async failures and bugs.
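
With polkadot.js, several storage reads can share one round-trip via api.queryMulti (a sketch; the endpoint and the choice of queries are illustrative, not the backend's actual calls):

import { ApiPromise, WsProvider } from "@polkadot/api";

// Batch two storage reads into a single queryMulti subscription.
const api = await ApiPromise.create({
  provider: new WsProvider("wss://kusama-rpc.polkadot.io"),
});
const unsub = await api.queryMulti(
  [api.query.staking.activeEra, api.query.session.validators],
  ([activeEra, validators]) => {
    console.log(`activeEra: ${activeEra.toString()}`);
    unsub(); // treat as one-shot: drop the subscription after the first result
  },
);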

Enable better fault detection

Right now faults are not always given for behaviour that should induce a fault. These fault events should be more strictly enforced.

Recover from inconsistent API connection

If the API connection is inconsistent when the cron job goes to endRound or startRound, the transactions will not be made. The script should have a way to recover from an inconsistent API connection.

  • It should detect if the connection is inconsistent.
  • If it's inconsistent, it should wait until the API connection is good before trying to send transactions (see the sketch below).
  • It should have reliable monitoring of transactions.
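
A sketch of the waiting step, assuming a polkadot.js ApiPromise (its isConnected flag is real; the helper itself is hypothetical):

import { ApiPromise } from "@polkadot/api";

// Poll until the API reports a healthy connection, or give up after a timeout.
async function waitForConnection(api: ApiPromise, timeoutMs = 60_000): Promise<void> {
  const start = Date.now();
  while (!api.isConnected) {
    if (Date.now() - start > timeoutMs) {
      throw new Error("API did not reconnect in time; skipping this round");
    }
    await new Promise((resolve) => setTimeout(resolve, 1_000)); // check once a second
  }
}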
