
skale-admin's Introduction

SKALE Admin


This repo contains source code for 3 core SKALE Node containers:

  • skale_admin - worker that manages sChain creation and node rotation
  • skale_api - webserver that provides node API
  • celery - distributed task queue

API reference

The SKALE API reference can be found in the docs repo: SKALE Node API.

Development

Run tests locally

  1. Run a local ganache instance, then download and deploy the SKALE Manager contracts to it:
ETH_PRIVATE_KEY=[..] MANAGER_BRANCH=[..] bash ./scripts/deploy_manager.sh
  • ETH_PRIVATE_KEY - any valid Ethereum private key (without the 0x prefix!)
  • MANAGER_BRANCH - tag of the SKALE Manager image to use ($MANAGER_BRANCH-latest will be used)
  • SGX_WALLET_TAG - tag of the SGX simulator to use (optional, latest is used by default)

List of the available SM tags: https://hub.docker.com/r/skalenetwork/skale-manager/tags
List of the available SGX tags: https://hub.docker.com/r/skalenetwork/sgxwalletsim/tags

  2. Run the SGX wallet simulator, then run all tests:
ETH_PRIVATE_KEY=[...] SCHAIN_TYPE=[...] bash ./scripts/run_tests.sh
  • ETH_PRIVATE_KEY - any valid Ethereum private key (without the 0x prefix!)
  • SCHAIN_TYPE - type of the chain for the DKG test (test2 - 2 nodes, test4 - 4 nodes, tiny - 16 nodes)

Test build:

export BRANCH=$(git branch | grep -oP "^\*\s+\K\S+$")
export VERSION=$(bash scripts/calculate_version.sh)
bash scripts/build.sh

License


All contributions to SKALE Admin are made under the GNU Affero General Public License v3. See LICENSE.

Copyright (C) 2019-Present SKALE Labs.


skale-admin's Issues

Dry run failed during insufficient funds on schain wallet

Preconditions
Skale manager: 1.8.0-beta.0
Skale admin: 1.1.0-beta.33
Transaction-manager: 1.1.0-beta.7
Steps to reproduce
Create an sChain without enough funds on the sChain wallet

Actual result
If the sChain wallet has insufficient funds, nodes still send the transaction and it is reverted. The gas cost of the reverted transaction is deducted from the node wallet.
Tx example:
https://rinkeby.etherscan.io/tx/0xc44d83474ad8612493d34345aa82cf6675f543efed35140cb6acac2dfe2937a4

┆Issue is synchronized with this Jira Bug

Misleading wallet info

logger.info('Trying to notify not enough balance...')

This message does not say that the wallet is short of ETH. It should be framed as: checking that the node wallet has enough SKL tokens or ETH.

Secondly, this is the node wallet. Why should there be at least 0.1 SKALE tokens in the node wallet?

Admin: Increase reconnect time for SGX

Preconditions
Versions

Steps to reproduce
Turn off SGX for 10-20 min before sChain creation (not less than 10 min)
Start creating the sChain

Actual result
SKALE Admin has only ~600 seconds to reconnect to SGX.
As a result, for DKG to succeed we need to restart SKALE Admin manually within 120 min.
Otherwise DKG will fail.

NOTE: This case relates mostly to SGX problems during the DKG process.
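
A minimal sketch of a reconnect loop with a configurable timeout, which is one way the ~600 second window could be made adjustable. The function name, its parameters, and the is_alive callback are hypothetical and are not taken from skale-admin; the default of 600 seconds only mirrors the figure quoted above.

```python
import time


def wait_for_service(is_alive, timeout_s=600.0, interval_s=10.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll is_alive() until it returns True or timeout_s elapses.

    Returns True if the service came back in time, False otherwise.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_alive():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

With a loop like this, the reconnect window becomes a single parameter that could be read from the node configuration instead of being fixed in code.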

┆Issue is synchronized with this Jira Task

Concurrent writes during block syncing check

The block syncing check contains two steps:

  • Check that the last block timestamp is not older than 30 seconds.
  • Check that the last used block is less than the current one.

The second step contains the problem: there is a place in skale-admin where the last block is written to the file concurrently.
We can either disable the second step completely or fix it.
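
One possible direction for a fix, sketched under assumptions: write the last block to a temporary file and swap it into place with os.replace, so a reader never sees a half-written file. The file layout (a JSON object with a last_block key) and both function names are hypothetical. This does not order concurrent writers, it only guarantees that the file always holds one complete value (the last writer wins).

```python
import json
import os
import tempfile


def save_last_block(path, block_number):
    """Atomically replace the file at path with the latest block number."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as tmp_file:
            json.dump({'last_block': block_number}, tmp_file)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)  # do not leave the temp file behind on failure
        raise


def load_last_block(path):
    """Return the stored block number, or None if nothing was saved yet."""
    try:
        with open(path) as block_file:
            return json.load(block_file)['last_block']
    except FileNotFoundError:
        return None
```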

┆Issue is synchronized with this Jira Bug

sChains don't work with a new sgxwallet restored from backup on a new URL

  1. Create 17+ nodes
  2. Create medium sChains
  3. Create a backup of the first sgxwallet and do not run the container
  4. Copy the backup to the second SGX server and run a new sgxwallet from the backup
  5. On the node with sChains, change the sgxwallet URL in .env from the first to the second
  6. Update the node

Expected: the node and sChains work fine with the new sgxwallet URL
Actual: sChains still look at the first sgxwallet URL and get stuck

┆Issue is synchronized with this Jira Bug

skale-api returns incorrect DKG status

Watchdog returns false for DKG when DKG is completed after the n-th rotation, while the DKG check on skale-admin is true.
Watchdog version: 1.1.2-beta.0

[~accountid:5b2c7d78927da916aaaae26b] add admin version please

Watchdog returns:

Node ID   sChain Name              Data directory   DKG     Config file   Volume   Container   IMA     Firewall   RPC    Blocks
-------------------------------------------------------------------------------------------------------------------------------
13        rhythmic-pherkad-minor   True             False   True          True     True        False   True       True   True

┆Issue is synchronized with this Jira Bug

Admin: skaled container didn't restart after SIGABRT

Preconditions
Skale-admin:1.1.0-beta.25
Skaled: 3.4.9-develop.0

Steps to reproduce
Spin up an sChain

Actual result

Admin didn't restart container after SIGABRT
Node: 44.241.162.179
Schain name: loud-gienah-cygni

Skaled log

Failed sChain checks
sChain name: loud-gienah-cygni


# Failed checks: rpc, blocks
[2021-02-10 11:42:05,498 INFO] tools.notifications.messages:119 - ThreadPoolExecutor-0_0 - Saving new checks state 399 [('blocks', False), ('config', True), ('container', True), ('data_dir', True), ('dkg', True), ('exit_code_ok', True), ('firewall_rules', True), ('ima_container', True), ('rpc', None), ('volume', True)]
[2021-02-10 11:42:05,520 INFO] web.models.schain:106 - ThreadPoolExecutor-0_0 - Changing first_run for loud-gienah-cygni to False
[2021-02-10 11:42:05,561 INFO] web.models.schain:116 - ThreadPoolExecutor-0_0 - Changing new_schain for loud-gienah-cygni to False
[2021-02-10 11:42:06,036 INFO] core.schains.creator:197 - ThreadPoolExecutor-0_0 - Running monitor for sChain loud-gienah-cygni in REGULAR mode
[2021-02-10 11:42:51,773 INFO] core.schains.creator:131 - MainThread - Creator procedure finished
[2021-02-10 11:42:51,890 INFO] core.schains.creator:87 - MainThread - Creator process is joined.



┆Issue is synchronized with this [Jira Bug](https://skalelabs.atlassian.net/browse/SKALE-3865)

Remove debug APIs for skaled (Mainnet)

The option --enable-debug-behavior-apis is now always present in the skaled CMD; it should be removed for the Mainnet build.

NOTE: We should probably add a flag to skale-node that indicates a testnet/mainnet/other setup.
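
A minimal sketch of how such a flag could gate the option. Only --enable-debug-behavior-apis comes from the issue; the ENV_TYPE variable, its values, the function name, and the placeholder base option are assumptions made for illustration.

```python
import os

DEBUG_APIS_OPTION = '--enable-debug-behavior-apis'  # option named in the issue


def build_skaled_options(base_options, env_type):
    """Return skaled options, adding the debug option only off mainnet."""
    options = list(base_options)
    if env_type != 'mainnet':
        options.append(DEBUG_APIS_OPTION)
    return options


if __name__ == '__main__':
    # ENV_TYPE is a hypothetical variable; defaulting to 'mainnet' keeps the
    # debug APIs off unless a test setup is requested explicitly.
    env_type = os.environ.get('ENV_TYPE', 'mainnet')
    print(build_skaled_options(['--placeholder-option'], env_type))
```

Defaulting to the mainnet behaviour means a missing or misspelled flag cannot enable the debug APIs by accident.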

┆Issue is synchronized with this Jira Task

Monitor doesn't recreate data dir for schain in RESTART mode

STR:

  1. Create 17+ nodes
  2. Create an sChain
  3. Run node exit on a node that contains 4 sChains.
  4. The first rotation DKG should fail for any of the sChains.

Expected: the sChain should eventually be rotated and keep working.
Actual: the node that had the sChain before the second rotation cannot save secret_key because of FileNotFoundError: [Errno 2] No such file or directory: '/skale_node_data/schains/squeaking-shaula/secret_key_2.json'

The problem is that the data dir for the sChain is removed after the failed DKG, but during the second rotation the monitor in RESTART mode does not recreate it, so the DKG procedure eventually fails.
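
The missing step can be sketched as creating the directory before the key file is written. The function name and arguments below are hypothetical; only the secret_key_<rotation>.json naming follows the path in the error message above.

```python
import json
import os


def save_secret_key(schain_dir, rotation_id, secret_key_info):
    """Write secret_key_<rotation_id>.json, creating schain_dir if missing."""
    # exist_ok=True makes this safe whether or not the directory survived
    # an earlier failed DKG.
    os.makedirs(schain_dir, exist_ok=True)
    key_path = os.path.join(schain_dir, f'secret_key_{rotation_id}.json')
    with open(key_path, 'w') as key_file:
        json.dump(secret_key_info, key_file)
    return key_path
```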

┆Issue is synchronized with this Jira Bug
┆Attachments: logs.txt

Restructure sChain storage limits

Currently, internal sChain volume limits are calculated in 2 different ways for the different pieces. Consensus, LevelDB, and filestorage limits are calculated dynamically, but the storage limit is pre-set.

Everything should be generated in advance using a static file in skale-node (configs.yml) and a Python script in helper-scripts. The script should generate a schain_allocation.yml file with all params (in bytes).
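
A rough sketch of the calculation such a script might perform, assuming the static file supplies a total volume size and a percentage share per component. The component names and shares below are placeholders, not the real contents of configs.yml, and writing the result out as YAML is left aside.

```python
def allocate_volume(total_bytes, percent_shares):
    """Split total_bytes between components by integer percentage shares."""
    if sum(percent_shares.values()) != 100:
        raise ValueError('percentage shares must sum to 100')
    # Integer arithmetic keeps every limit an exact number of bytes.
    return {
        name: total_bytes * percent // 100
        for name, percent in percent_shares.items()
    }


# Hypothetical shares; real values would come from the static config file.
EXAMPLE_SHARES = {
    'consensus': 30,
    'leveldb': 30,
    'filestorage': 30,
    'storage': 10,
}
```

Computing every limit from one table in advance keeps all four values consistent, instead of mixing dynamically calculated limits with a pre-set one.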

┆Issue is synchronized with this Jira Task

Login command returns a strange error if the specified user does not exist

python main.py user login

Enter username: test
Enter password:
Authorization failed: {"errors": [{"msg": "<Model: User> instance matching query does not exist:\nSQL: SELECT "t1"."id", "t1"."username", "t1"."password", "t1"."token", "t1"."join_date" FROM "user" AS "t1" WHERE (("t1"."username" = ?) AND ("t1"."password" = ?)) LIMIT ? OFFSET ?\nParams: ['test', '098f6bcd4621d373cade4e832627b4f6', 1, 0]"}]}
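
A sketch of the behaviour one might expect instead: a failed lookup is turned into a generic message, so the SQL query and its parameters never reach the client. The example uses the standard sqlite3 module rather than the ORM visible in the error above, and the function names and response shape are assumptions.

```python
import sqlite3


def login(connection, username, password_hash):
    """Return the user's token, or None if the credentials do not match."""
    row = connection.execute(
        'SELECT token FROM user WHERE username = ? AND password = ?',
        (username, password_hash),
    ).fetchone()
    return row[0] if row is not None else None


def login_response(connection, username, password_hash):
    """Build an API-style response without exposing SQL details."""
    token = login(connection, username, password_hash)
    if token is None:
        # Same message for an unknown user and a wrong password.
        return {'errors': [{'msg': 'Incorrect username or password'}]}
    return {'token': token}
```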

skaled container is stuck when it has 'created' status

An infrequent error occurs during sChain creation: sometimes the skaled container gets stuck in the 'created' status when running schain create.

versions:
admin:1.1.0-beta.39
schain:3.5.12-develop.0
node_cli:1.1.0-beta.22

str:
create schain on 16 VMs

┆Issue is synchronized with this Jira Bug

Test admin-skaled interaction using skaled-emulator

  1. Create an sChain using skalenetwork/skaled-emulator instead of skaled
  2. Check skaled statuses: which started successfully and which did not
  3. If any of the unstarted skaleds print "FAILURE" in the logs, report it
  4. Document the exit reason from the skaled logs and run repair on the unstarted skaleds
  5. After all 16 skaleds have started, run node update on one skaled (or several at once if that is easier) 10-20 times and check that everything is successful

┆Issue is synchronized with this Jira Task

Add hardware and geth checks to skale-api

We need to verify that the node machine's hardware meets the requirements before sChain creation. It is also better to ensure that the ETH client is running correctly. We need to add the corresponding health checks to skale-api.

┆Issue is synchronized with this Jira Task

Add additional field in schain config

The sChain config needs to be updated so that real ETH on the sChain can be seen in MetaMask.
Config example:
"skaleConfig": {
  "nodeInfo": {
    "nodeName": "Node1",
    "nodeID": 1112,
    "bindIP": "127.0.0.1",
    "basePort": 1231,
    "bindIP6": "::1",
    "basePort6": 1231,
    "logLevel": "trace",
    "logLevelProposal": "trace",
    "adminOrigins": [
      "*"
    ],
    "ipc": false,
    "ipcpath": "./ipcx",
    "db-path": "./node",
    "httpRpcPort": 15000,
    "httpsRpcPort": 15010,
    "wsRpcPort": 15020,
    "wssRpcPort": 15030,
    "httpRpcPort6": 15000,
    "httpsRpcPort6": 15010,
    "wsRpcPort6": 15040,
    "wssRpcPort6": 15050,
    "acceptors": 1,
    "infoHttpRpcPort": 16000,
    "infoHttpsRpcPort": 16010,
    "infoWsRpcPort": 16020,
    "infoWssRpcPort": 16030,
    "infoHttpRpcPort6": 16000,
    "infoHttpsRpcPort6": 16010,
    "infoWsRpcPort6": 16040,
    "infoWssRpcPort6": 16050,
    "info-acceptors": 1,

For more info ask [~accountid:5beaf49dc1d1402b40229cd2]

┆Issue is synchronized with this Jira Task

Improve snapshot sending/receiving procedure

The following approach is suggested:

  1. For now, leave the Large sChain type completely out of scope, because there are other issues that prevent us from creating it.
  2. Release the first mainnet version without the related feature.
  3. For the second mainnet update, implement one of the following solutions (to be decided later):

a. Save data to a temporary space (reserved space inside the attached storage) without limiting the number of sChains that download snapshots at the same time (requires modifications in both SKALE node components and skaled).

b. Send and receive snapshots using streams, without saving snapshots to any non-btrfs file/directory (requires only skaled changes).

┆Issue is synchronized with this Jiraserver Task
┆Attachments: 1_ok.txt | 2_ok.txt | 3_ok.txt | 4_fail.txt | snapshot_occeupied_melodic-yildun.log | snapshot_occupied_tinkling-zibal.log

Cleaner didn't handle rotated sChain

The sChain was rotated off the node due to a failed DKG, but the cleaner didn't remove it because of this condition check:

if not skale.schains_internal.is_schain_exist(schain_name) or \
        is_exited(schain_name, dutils=dutils):
    logger.info(arguments_list_string(
        {'sChain name': schain_name}, 'Removed sChain found')
    )

┆Issue is synchronized with this Jira Bug

Rotation: node sends a complaint to another node without sending a broadcast

  1. Create 17+ nodes
  2. Create an sChain
  3. Run node exit on a node that contains 4 sChains.
  4. The first rotation DKG should fail for the sChain.

Expected: the sChain should eventually be rotated and keep working.
Actual_1: node G sends a complaint to node B before the broadcast transaction is sent on node G

┆Issue is synchronized with this Jira Bug
┆Attachments: node-B-skale-logs-dump-2021-04-15-14_33_20.tar.gz | node-G-skale-logs-dump-2021-04-15-14_38_06.tar.gz

Rotation: node_id is -1 because the node did not find itself in the contract's list of nodes for the sChain

  1. Create 17+ nodes
  2. Create 4 sChains
  3. Run node exit on a node that contains 4 sChains.
  4. The first rotation DKG should fail for the sChain.

Expected: the sChain should eventually be rotated and keep working.
Actual_1: node A sends a complaint to node B after node A was removed from the squeaking-shaula sChain group on the contract
Actual_1_1: the complaint was sent because skale_admin on node A does not see the broadcast that was sent from node B
Actual_2: node A tries to send a broadcast for sChain 'squeaking-shaula' even though node A is not in the group for this sChain on the contract
Actual_3: in the skale_admin logs, the monitor shows sChain squeaking-shaula on the node but not on the contract

┆Issue is synchronized with this Jira Bug
┆Attachments: node-A-skale-logs-dump-2021-04-15-14_32_18.tar.gz | node-B-skale-logs-dump-2021-04-15-14_33_20.tar.gz | rotated-node-skale-logs-dump-2021-04-15-14_30_05.tar.gz

Add API to watchdog and skale-api to return only block health check for sChain

Currently watchdog (and skale-api) provides an ability to get only all sChain checks (config, DKG, RPC, blocks, volume, firewall, etc) which consumes lots of resources and time.
Proposal from validator: add an API to retrieve only the latest health check - blocks check, assuming that if blocks are mining and local RPC is available, then sChain is operating normally.
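
The proposal can be sketched as running only a named subset of checks, so the expensive ones are never invoked. The registry shape (a dict of check name to zero-argument callable) and the function name are assumptions; the real checks in skale-admin may be organized differently.

```python
def run_checks(checks, names=None):
    """Run the named checks (all of them if names is None).

    checks maps a check name to a zero-argument callable returning a bool.
    Checks that are not selected are never called.
    """
    if names is None:
        selected = checks
    else:
        selected = {name: checks[name] for name in names}
    return {name: check() for name, check in selected.items()}
```

An endpoint for validators could then call run_checks(checks, names=['blocks']) and skip the remaining checks entirely.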

┆Issue is synchronized with this Jira Task
