
deployment-terraform's Introduction

Deployment Automation

This repository contains Ansible and Terraform scripts that automate the deployment of new networks resembling "POA Network".

Namely, the following operations are performed:

  • Random account is generated for Master of Ceremony (MoC)

  • Bytecode of the Network Consensus Contract is prepared

  • Based on these data, genesis json file is prepared

  • Netstat node is started

  • Several (configurable number) bootnodes are started, bootnodes.txt is exchanged between them.

  • Additionally, some more bootnodes can be started behind a Gateway, forming a publicly accessible RPC endpoint for the network. This endpoint is available over http, but the user may later assign it a DNS name, generate valid ssl certificates and upload them to the Gateway config, turning this endpoint to https.

  • Explorer node is started.

  • MoC's node is started.

  • Ceremony is performed on the MoC's node, i.e. other consensus contracts are deployed to the network.

  • Several (configurable number) initial keys are generated.

  • Subset (or all) of initial keys are converted into (mining + voting + payout) keys.

  • For a subset (or for all) of converted keys, validator nodes are started.

  • Simple tests can be run against the network: (1) check that txs are getting mined (2) check that all validators mine blocks (only makes sense if validator nodes were started for all mining keys).

  • Artifacts (spec.json, bootnodes.txt, contracts.json, ...) are stored on the MoC's node.

  • hosts file is generated on the user's machine containing ip addresses of all nodes and their respective roles.

Most of the work is done by Ansible, but to bring up the infrastructure, Ansible calls Terraform.

Usage

Currently, only deployment to Azure is supported: Azure deployment README

deployment-terraform's People

Contributors

arseniipetrovich, igorbarinov, jcdenny, lexsys27, natlg, phahulin, vladimirnovgorodov, ykisialiou


deployment-terraform's Issues

Two copies of deployment playbooks

Currently, deployment automation uses a separate copy of deployment-playbooks because the original version is focused on the AWS cloud. There is also a requirement that the user can run the playbooks without Terraform.

Maintaining two copies will require manual sync between them and is not convenient.

@igorbarinov @phahulin what are your thoughts about this issue?

Result: Terraform configures instances using code from the terraform-integration branch of the poanetwork/deployment-playbooks repository.

Plan

  • create branch terraform-integration from master in poanetwork/deployment-playbooks
  • add git sub-module tracking this branch
  • rebase changes from the playbooks directory to the terraform-integration branch
  • switch to the poanetwork/deployment-playbooks repo
  • delete playbooks directory

Combine two variables

We currently have two variables, create_resource_group and resource_group_name, which complicates the configuration process. It might be a good idea to remove the create_resource_group variable and refactor the playbooks so that they check whether resource_group_name is empty instead of checking the boolean create_resource_group. If resource_group_name is empty, a name should be generated.
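
The proposed rule can be sketched in a few lines of Python. This is an illustration only: the playbooks would express the same check in Ansible, and the function name, the `poa-rg` prefix, and the random suffix are assumptions that do not come from the repository.

```python
import secrets
from typing import Tuple


def resolve_resource_group(resource_group_name: str = "",
                           prefix: str = "poa-rg") -> Tuple[str, bool]:
    """Return (name, generated).

    A non-empty name is used as given. An empty name triggers generation
    of a new one, which replaces the separate boolean flag.
    The prefix and suffix format are hypothetical.
    """
    name = (resource_group_name or "").strip()
    if name:
        return name, False
    return f"{prefix}-{secrets.token_hex(4)}", True
```

With this shape, a single variable carries both pieces of information: whether a name was supplied, and which name to use.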

Specify the defaults when calling secret-generating script in moc-preconf role

Because the script is called incorrectly, it does not generate secrets properly. So, this =>

    MOC_SECRET: "{{ moc_secret }}"
    CERT_SECRET: "{{ cert_secret }}"
    NETSTAT_SECRET: "{{ netstat_secret }}"
    MASTER_OF_CEREMONY: "{{ MOC_ADDRESS }}"

should be changed to

    MOC_SECRET: "{{ moc_secret | default('') }}"
    CERT_SECRET: "{{ cert_secret | default('') }}"
    NETSTAT_SECRET: "{{ netstat_secret | default('') }}"
    MASTER_OF_CEREMONY: "{{ MOC_ADDRESS | default('') }}"

Check Ansible 2.6 compatibility

Ansible 2.6 is pretty new, but all development was done under 2.4 and 2.5. 2.6 should be tested to avoid bugs (it looks like no refactoring is needed, just regression testing).

Incorrect admin username

Due to incorrect inheritance, the Terraform module does not receive admin_username as a parameter and always uses the default value.

Automate redeploying consensus contract

Is there a way to efficiently redeploy the consensus contracts? I've been looking through the roles for a subset that might restart parity and re-clone the consensus contracts repository before deployment. I'm interested in this because I'm making changes to the consensus contracts, and currently the iteration process requires a complete reboot of the Terraform instances, which takes about an hour.

Wrong conditionals

There is a bug: the user must explicitly specify when they want to use the remote backend. The cause is the wrong conditionals used for choosing backends.
Current:
when backend == "true"
Correct:
when backend|bool == true
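
The difference matters because a YAML boolean and the string "true" are different values. The Python sketch below approximates how Ansible's bool filter normalizes both. It is a simplified re-implementation written for illustration, not Ansible's actual source, and the variable names are hypothetical.

```python
def to_bool(value):
    """Approximate the behaviour of Ansible's `bool` filter (simplified)."""
    if value is None or isinstance(value, bool):
        return value
    if isinstance(value, str):
        value = value.lower()
    return value in ("yes", "on", "1", "true", 1)


# A YAML `backend: true` arrives as a boolean, so a string comparison fails.
backend = True
string_comparison = backend == "true"             # False: bool vs. str
normalized_comparison = to_bool(backend) is True  # True: normalized first
```

Normalizing first means the condition behaves the same whether the value was written as a boolean in a vars file or passed as a string on the command line.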

Update docker image

Related to #50. From CI output:

TASK [storage-account : Create an account] *************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Do you have msrestazure installed? Try `pip install msrestazure`- No module named msrest.serialization"}

We will need an additional Python module installed to run the new version of the deployment-terraform scripts.

Use dynamic inventory

The current version uses a static hosts file generated from the Terraform template. A better approach is to use a dynamic inventory script that returns instance data from the Azure API. The script is documented on the Ansible and Microsoft sites.

Each instance has tags like

- env: sokol
  role: bootnode

Azure inventory allows one to reference groups of instances based on these tags in the Ansible playbooks.
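
To illustrate the idea of tag-based groups, here is a minimal Python sketch that turns a list of tagged instances into the JSON structure an Ansible dynamic inventory script returns (groups with `hosts`, plus `_meta.hostvars`). It does not call the Azure API: the input records, the `key_value` group naming, and the function name are assumptions made for the example, and the real Azure inventory script may name groups differently.

```python
import json
from collections import defaultdict


def build_inventory(instances):
    """Group instances by their tags into an Ansible-style inventory dict.

    `instances` is a list of dicts with an "ip" and a "tags" mapping
    (hypothetical sample shape, not the Azure API response format).
    """
    groups = defaultdict(lambda: {"hosts": []})
    hostvars = {}
    for inst in instances:
        ip = inst["ip"]
        tags = dict(inst.get("tags", {}))
        hostvars[ip] = {"tags": tags}
        for key, value in sorted(tags.items()):
            # One group per tag pair, e.g. "role_bootnode".
            groups[f"{key}_{value}"]["hosts"].append(ip)
    inventory = dict(groups)
    inventory["_meta"] = {"hostvars": hostvars}
    return inventory


if __name__ == "__main__":
    sample = [
        {"ip": "10.0.0.4", "tags": {"env": "sokol", "role": "bootnode"}},
        {"ip": "10.0.0.5", "tags": {"env": "sokol", "role": "validator"}},
    ]
    print(json.dumps(build_inventory(sample), indent=2))
```

A playbook could then target a group such as `role_bootnode` instead of relying on a generated static hosts file.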

Wait until SSH is ready

The current version of the playbooks assumes that the host is up and the client can SSH into it. This is not always the case in the cloud.

Terraform starts the virtual machine. It takes some time (20-60s) until the running VM exposes the SSH port. Ansible should wait until the SSH port is available.

To achieve this, I am going to add a first task to each major role that uses wait_for to wait until port 22 is open.
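
The waiting logic itself is a simple polling loop. The Python sketch below shows the same idea outside Ansible, as an illustration of what such a task does. The function name, default timeout, and polling interval are assumptions, and it only checks that the TCP port accepts connections, not that SSH authentication works.

```python
import socket
import time


def wait_for_port(host, port=22, timeout=300.0, interval=2.0):
    """Poll until a TCP connection to host:port succeeds.

    Returns True once the port accepts a connection, or False if the
    deadline passes first. Defaults are illustrative, not project values.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A freshly started VM would typically refuse or time out for the first attempts and succeed once the SSH daemon is listening.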

Add tests for a new network

After the network is created, basic tests to check the network status should be run (https://github.com/poanetwork/deployment-terraform/tree/master/helper-scripts/network-tests)

  1. add a new option run_tests: true/false in terraform's group_vars/all.yml that determines if tests should be run or not after network is created
  2. CI should set run_tests: true. CI should fail if tests fail
  3. tests should be run from the local machine (it can be assumed that nodejs is installed locally - add this to documentation and in docker container)
  4. tests require creating a local copy of contracts/contracts.json file from moc node, however another property should be added to that json file: "POA_ADDRESS" == address of safeContract from spec.json
  5. in config.toml use moc's json keystore file and password, set url to http://<balancer_ip>
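
Step 4 above can be sketched in Python as follows. Because the exact location of safeContract inside spec.json is not stated here, the sketch searches for the key recursively instead of assuming a path. The function names are hypothetical, and reading and writing the actual files is left out.

```python
def find_key(node, key):
    """Return the first value stored under `key` anywhere in nested JSON data."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def add_poa_address(contracts, spec):
    """Return a copy of the contracts mapping with POA_ADDRESS added.

    POA_ADDRESS is set to the safeContract address found in the spec.
    """
    address = find_key(spec, "safeContract")
    if address is None:
        raise KeyError("safeContract not found in spec")
    result = dict(contracts)
    result["POA_ADDRESS"] = address
    return result
```

The inputs would come from `json.load` on the local copies of contracts.json and spec.json, and the result would be written back with `json.dump` before the tests run.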

Split Terraform files into modules

The current organization of the provisioning code does not allow clear separation of node types. Terraform has a feature called modules that simplifies and structures the project.

Nothing is deployed by default. User adds and configures modules she needs.

common module provisions shared infrastructure like network. User may use it or prefer to do it manually from console or use already existing infrastructure.

User adds modules based on which kind of instances she wants to deploy.

Each module may be configured separately so it is trivial to set different sizes on different types of nodes. Multiple modules may share some configuration like the type of the network they are connecting to.

Adding new node type will be just adding a new module.

High-level overview of the ceremony

I want this issue to be a kind of todo-list for the things we need to cover to automate the ceremony and open questions we might have.
Full instruction is available at https://github.com/poanetwork/wiki/wiki/Master-of-Ceremony-Setup

Ceremony in-brief

  • generate Master of Ceremony (MoC) key
  • generate bytecode for the consensus contract - clone https://github.com/poanetwork/poa-network-consensus-contracts, run poa-bytecode.js script, using MoC key as parameter
  • prepare spec.json, things to update:
    • network name
    • network id
    • MoC address (the one with a large amount of tokens)
    • contract bytecode
    • other... (need to have some kind of template)
  • launch netstats node (requires netstats secret)
  • launch explorer node
  • create a new empty bootnodes.txt
  • launch bootnodes
  • put bootnodes' enodes into bootnodes.txt, update local copies of bootnodes.txt on the nodes
  • launch MoC node
  • on MoC node, deploy secondary contracts of the consensus to the network. This step requires temporarily unlocking the MoC node by editing the node.toml file
  • create a server to host dapps (maybe not needed for now)
  • on MoC node generate validators initial keys. These keys need to be stored somewhere.
  • convert initial keys into production keys (mining key, payout key, voting key). For a real network this is done via dapp, however it can be automated via script for now
  • use production keys to launch validator nodes
  • combine several bootnodes into load-balancer (how to deal with ssl certificates here?)
  • do basic tests to check that network is functioning
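
As an illustration of the bootnodes.txt step in the list above, the following Python sketch collects enode URLs into the file content. It assumes one enode URL per line, which is not confirmed here, and the function name is hypothetical.

```python
def render_bootnodes(enodes):
    """Build bootnodes.txt content from enode URLs, one per line.

    Duplicates are dropped while preserving order. Anything that does
    not look like an enode URL is rejected.
    """
    seen = set()
    lines = []
    for enode in enodes:
        enode = enode.strip()
        if not enode.startswith("enode://"):
            raise ValueError(f"not an enode URL: {enode!r}")
        if enode not in seen:
            seen.add(enode)
            lines.append(enode)
    return "\n".join(lines) + ("\n" if lines else "")
```

The same rendered content could then be copied to every node, so that all local copies of bootnodes.txt stay identical.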

Input parameters (provided to the terraform script by the user)

  • network name
  • network id
  • template of spec.json (should have a default)
  • required number of bootnodes (defaults to 1)
  • required number of bootnodes to put in load-balancer (defaults to 1)
  • required number of initial keys to generate (defaults to 1)
  • required number of initial keys to convert to production keys (defaults to 1)
  • required number of validator nodes to launch (defaults to 1)
  • netstats secret (optional)

Internal parameters (terraform needs to generate)

  • MoC key
  • netstats secret

Parameters that should be stored in a database

  • MoC key
  • netstats secret (if not provided as input parameter)
  • spec.json
  • bootnodes.txt
  • generated initial keys
  • generated production keys

Useful tools

Manipulation of eth keys (creating, converting keystore files <-> private keys, etc.) can be done using keythereum

Setting node parameters

Please make vm_size and resource group name configurable variables.

Is it possible to create and manage multiple deployments under the same azure account? After I've created the first deployment terraform plan returns No changes. Infrastructure is up-to-date. If yes, which parameter is used to separate them, could it also be made configurable?

More detailed README

Update README to describe in more detail:

  1. what are the azure authentication options, when to choose which
  2. what are the possible state storage options, when to choose which
  3. mention that user may want to create azure resource group, before running the deployment, and where to set its name
  4. how to use custom spec.json and custom playbook options (e.g. how to use custom parity binary via PARITY_BIN_LOC, PARITY_BIN_SHA256)
  5. how to specify which ssh key to use

Jumpbox with prerequisites

We have a lot of dependencies to install locally before the deployment playbooks are ready to run.
It looks like it's better to create a jumpbox host with Terraform and install everything there (pip and npm themselves, all their packages, etc.) to avoid dependency hell and the need to support whatever OS the user has locally.

Another option I can see here is to create and provide a container with all dependencies ready.

Skip `update system` task in the `preconf` role

The Ubuntu images used in the Terraform scripts are pinned to a specific major version (Ubuntu 16.04) and use the latest version of that image. That means all security updates applied to the image by its maintainer are already included at the time of POA deployment. Ansible does not need to run the upgrade itself and can rely on the newer version of the image.

Currently it takes up to 8 minutes to run the update system task.

Simplify deployment configuration through module refactoring

Refactor provisioning scripts to support two deployment scenarios for the end user.

Scenario 01: minimal configuration

# main.tf:
module "poanetwork" {
  source = "./poanetwork"

  # configuration
  platform = "centos"
  network = "sokol"
  ...
}

# Download module
$ terraform init

# Login to Azure
$ az login

# Deploy
$ terraform apply

This scenario deploys a typical configuration (for example, a bootnode) and prints its IP address to the user.

Scenario 02: custom deployment

Each of the submodules may be deployed separately but from a single Terraform file.

# main.tf:
module "explorer" {
  source = "./poanetwork/modules/explorer"

  # configuration
  platform = "centos"
  network = "sokol"
  ...
}

This scenario makes possible configurations such as:

  • two explorer nodes, one for sokol and one for core network
  • validator node grabs IP address of the netstat node that is provisioned in the same file
  • two validator nodes, one running on centos and one on ubuntu
  • full clone of the POA Network

Roughly speaking it allows construction of the cloud infrastructure from reusable bricks.

Inaccurate check for Python interpreter existence

The playbooks install Python even if it is already installed when ansible_python_interpreter is not the default.
To fix this, we should replace

raw: test -e /usr/bin/python || (sudo apt -y update && sudo apt install -y python-minimal)
with

raw: "test -e {{ ansible_python_interpreter | default ('/usr/bin/python') }} || (sudo apt -y update && sudo apt install -y python-minimal)"

Separate validator keys creation from MoC role

Sometimes the user may only want to create additional keys for validators. For now, the scripts launch the lengthy MoC deployment procedure. The aim of this issue is to refactor the MoC role and separate out the creation of validator keys.

CI does not delete all resources on "Delete" job

This bug is connected with a syntax error in the backend-selecting condition. This:

  - backend|bool == "true" and plan_result.stat.exists != "true"

Should be changed to

 - backend | bool == true

Due to the comparison error, the local backend is used for the main infrastructure instead of the remote one.
This will be fixed in an upcoming PR.

MoC-preconf role refactor

Regarding #26: the MoC-preconf role logic is built from a number of shell tasks. Ansible is not designed to be a language for complex logic with long sequences of checks. I suppose it would be a better idea to create a single script (bash, node.js, etc.) that creates all the necessary variables for the network.

Fix bug with incorrect solc version

Due to recent updates, the playbooks should explicitly install solc (npm install [email protected]) before calling the bytecode-generating script.
Otherwise, the script will throw an error.

(node:8515) UnhandledPromiseRejectionWarning: AssertionError [ERR_ASSERTION]: Invalid callback specified.
    at wrapCallback (/home/poausr/poa-network-consensus-contracts/scripts/node_modules/solc/wrapper.js:16:5)
    at runWithReadCallback (/home/poausr/poa-network-consensus-contracts/scripts/node_modules/solc/wrapper.js:37:42)
    at compileStandard (/home/poausr/poa-network-consensus-contracts/scripts/node_modules/solc/wrapper.js:78:14)
    at Object.compileStandardWrapper (/home/poausr/poa-network-consensus-contracts/scripts/node_modules/solc/wrapper.js:85:14)
    at main (/home/poausr/poa-network-consensus-contracts/scripts/poa-bytecode.js:10:24)
    at Object.<anonymous> (/home/poausr/poa-network-consensus-contracts/scripts/poa-bytecode.js:6:1)
    at Module._compile (module.js:653:30)
    at Object.Module._extensions..js (module.js:664:10)
    at Module.load (module.js:566:32)
    at tryModuleLoad (module.js:506:12)
    at Function.Module._load (module.js:498:3)
    at Function.Module.runMain (module.js:694:10)
    at startup (bootstrap_node.js:204:16)
    at bootstrap_node.js:625:3

Workaround: install solc on moc manually and rerun playbooks.
