poanetwork / deployment-terraform
Ansible and Terraform deployment automation of POA clones
License: GNU General Public License v3.0
The current version of the playbooks assumes that the host is up and that the client can SSH into it. This is not always the case in the cloud.
Terraform starts the virtual machine. It takes some time (20-60s) until the running VM exposes the SSH port. Ansible should wait until the SSH port is available.
To achieve this I am going to add a first task to each major role that uses wait_for to wait until port 22 is open.
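The waiting behaviour described above can be sketched in Python, outside of Ansible, to show the logic a wait_for-style task performs: poll the port until it accepts a TCP connection or a deadline passes. The function name wait_for_port, its parameters, and the injectable connect/sleep/clock arguments are assumptions made for illustration; they are not part of the playbooks.

```python
import socket
import time


def wait_for_port(host, port=22, timeout=120.0, interval=2.0,
                  connect=socket.create_connection,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll host:port until a TCP connection succeeds or timeout expires.

    Returns True once the port accepts a connection, False on timeout.
    connect, sleep and clock are injectable so the loop can be exercised
    without a real network (an assumption of this sketch).
    """
    deadline = clock() + timeout
    while True:
        try:
            conn = connect((host, port), timeout=interval)
        except OSError:
            # Port not reachable yet (refused, unreachable or timed out).
            if clock() >= deadline:
                return False
            sleep(interval)
        else:
            conn.close()
            return True
```

With the defaults, a call such as wait_for_port("10.0.0.4") (a made-up address) would keep retrying for up to two minutes, which comfortably covers the 20-60s window mentioned above.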
Playbooks will install Python even if it is already installed when ansible_python_interpreter is not the default.
To fix this we should replace line 43 of deployment-terraform/azure/site.yml (at 427ac00):
raw: "test -e {{ ansible_python_interpreter | default ('/usr/bin/python') }} || (sudo apt -y update && sudo apt install -y python-minimal)"
Refactor provisioning scripts to support two deployment scenarios for the end user.
# main.tf:
module "poanetwork" {
  source = "./poanetwork"
  # configuration
  platform = "centos"
  network  = "sokol"
  ...
}
# Download module
$ terraform init
# Login to Azure
$ az login
# Deploy
$ terraform apply
This scenario deploys a typical configuration (for example, bootnode) and prints its IP address to the user.
Each of the submodules may be deployed separately but from a single Terraform file.
# main.tf:
module "explorer" {
  source = "./poanetwork/modules/explorer"
  # configuration
  platform = "centos"
  network  = "sokol"
  ...
}
This scenario makes possible such configurations as:
- explorer nodes, one for the sokol network and one for the core network
- a validator node that grabs the IP address of the netstat node provisioned in the same file
- validator nodes, one running on centos and one on ubuntu
Roughly speaking it allows construction of the cloud infrastructure from reusable bricks.
Before upgrading to the latest versions of parity (poanetwork/deployment-playbooks#206), it is necessary to configure the balancer to accept the 405 status code as OK when doing health checks.
Is there a way to efficiently redeploy the consensus contracts? I've been looking through the roles for a subset that might restart parity and re-clone the consensus contracts repository before deployment. I'm interested in this because I'm making changes to the consensus contracts, and currently the iteration process requires a complete reboot of the terraform instances, which takes about an hour.
We have a lot of dependencies to install locally before the deployment playbooks are ready to run.
It looks like it would be better to create a jumpbox host with terraform and install everything there (pip and npm themselves, all their packages, etc.) to avoid dependency hell and the need to support whatever OS the user has locally.
Another option I can see here is to create and provide a container with all dependencies ready.
Rename the resources with the default name; it is misleading.
To avoid interfering with the ceremony, please make SPEC_ADDRESS (https://github.com/poanetwork/deployment-terraform/blob/master/azure/group_vars/all.yml.example#L10) reference a jinja2 template (like https://github.com/poanetwork/deployment-terraform/blob/master/azure/roles/moc-preconf/templates/spec.json.j2) instead of JSON.
Related to #50. From CI output:
TASK [storage-account : Create an account] *************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Do you have msrestazure installed? Try `pip install msrestazure`- No module named msrest.serialization"}
We will need an additional Python module to be installed to run the new version of the deployment-terraform scripts.
Due to recent updates, the playbooks should explicitly install solc (npm install [email protected]) before calling the bytecode-generating script.
Otherwise, the script will throw an error:
(node:8515) UnhandledPromiseRejectionWarning: AssertionError [ERR_ASSERTION]: Invalid callback specified.
at wrapCallback (/home/poausr/poa-network-consensus-contracts/scripts/node_modules/solc/wrapper.js:16:5)
at runWithReadCallback (/home/poausr/poa-network-consensus-contracts/scripts/node_modules/solc/wrapper.js:37:42)
at compileStandard (/home/poausr/poa-network-consensus-contracts/scripts/node_modules/solc/wrapper.js:78:14)
at Object.compileStandardWrapper (/home/poausr/poa-network-consensus-contracts/scripts/node_modules/solc/wrapper.js:85:14)
at main (/home/poausr/poa-network-consensus-contracts/scripts/poa-bytecode.js:10:24)
at Object.<anonymous> (/home/poausr/poa-network-consensus-contracts/scripts/poa-bytecode.js:6:1)
at Module._compile (module.js:653:30)
at Object.Module._extensions..js (module.js:664:10)
at Module.load (module.js:566:32)
at tryModuleLoad (module.js:506:12)
at Function.Module._load (module.js:498:3)
at Function.Module.runMain (module.js:694:10)
at startup (bootstrap_node.js:204:16)
at bootstrap_node.js:625:3
Workaround: install solc on moc manually and rerun playbooks.
Sometimes a user may only want to create additional keys for validators. As of now, the scripts launch the lengthy procedure of a full MoC deployment. The aim of this issue is to refactor the MoC role and separate out the creation of validator keys.
Ansible 2.6 is pretty new, but all development was done under 2.4 and 2.5. 2.6 should be tested to avoid bugs (it looks like no refactoring is needed, just regression testing).
I want this issue to be a kind of todo-list for the things we need to cover to automate the ceremony and open questions we might have.
Full instructions are available at https://github.com/poanetwork/wiki/wiki/Master-of-Ceremony-Setup
- poa-bytecode.js script, using MoC key as parameter
- spec.json, things to update:
- netstats secret
- bootnodes.txt
- bootnodes.txt, update local copies of bootnodes.txt on the nodes
Manipulation of eth keys (creating, converting keystore files <-> private keys, etc.) can be done using keythereum.
The current version uses a static hosts file generated from the Terraform template. A better approach is to use a dynamic inventory script that returns instance data from the Azure API. The script is documented on the Ansible and Microsoft sites.
Each instance has tags like
- env: sokol
  role: bootnode
Azure inventory allows one to reference groups of instances based on these tags in the Ansible playbooks.
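To make the tag-based grouping concrete, here is a minimal Python sketch of what a dynamic inventory does with such tags. It is not the Azure inventory script itself: the instance records, the name/ip field names, and the <key>_<value> group naming are assumptions for illustration. The output shape (groups with hosts, plus _meta.hostvars) follows the JSON that Ansible expects from an inventory script.

```python
import json


def build_inventory(instances):
    """Group instances by their tags into Ansible dynamic-inventory JSON.

    instances: list of dicts with hypothetical keys 'name', 'ip', 'tags'.
    Each tag key/value pair becomes a group named '<key>_<value>'.
    """
    inventory = {"_meta": {"hostvars": {}}}
    for inst in instances:
        name = inst["name"]
        inventory["_meta"]["hostvars"][name] = {"ansible_host": inst["ip"]}
        for key, value in sorted(inst.get("tags", {}).items()):
            group = "{}_{}".format(key, value)
            inventory.setdefault(group, {"hosts": []})["hosts"].append(name)
    return inventory


if __name__ == "__main__":
    # Made-up sample data mirroring the tags shown above.
    sample = [
        {"name": "bootnode-1", "ip": "10.0.0.4",
         "tags": {"env": "sokol", "role": "bootnode"}},
        {"name": "validator-1", "ip": "10.0.0.5",
         "tags": {"env": "sokol", "role": "validator"}},
    ]
    print(json.dumps(build_inventory(sample), indent=2, sort_keys=True))
```

Under this naming assumption a playbook could target the group role_bootnode or env_sokol instead of listing hosts in a static file.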
or should the script create the resource group and then all the resources inside of it?
Currently the first scenario is implemented but I would like to hear your thoughts which way is the best.
The Ubuntu images used in the Terraform scripts specify a major version (ubuntu 16.04) and the latest version of the image. That means all security updates applied to the image by its maintainer are already present when POA is deployed. Ansible does not need to run upgrade itself; it can just rely on the newer version of the image.
Currently it takes up to 8 minutes to run the update system task.
Terraform can use local storage for state, create a container in Azure, and then migrate this state to the container. But the structure of the playbooks and modules should be considered.
Update README to describe in more detail:
- PARITY_BIN_LOC, PARITY_BIN_SHA256
Due to incorrect inheritance, the Terraform module does not receive admin_username as a parameter and always uses the default value.
After the network is created, basic tests to check network status should be run (https://github.com/poanetwork/deployment-terraform/tree/master/helper-scripts/network-tests):
- run_tests: true/false in terraform's group_vars/all.yml that determines whether tests should be run after the network is created
- run_tests: true. CI should fail if tests fail
- contracts/contracts.json file from the moc node; however, another property should be added to that JSON file: "POA_ADDRESS" == address of safeContract from spec.json
- config.toml: use moc's JSON keystore file and password, set url to http://<balancer_ip>
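The contracts.json step above can be sketched in Python. Because the exact layout of spec.json is not shown here, the sketch assumes only that a key named safeContract appears somewhere in it and searches for it recursively; the function names and file handling are illustrative, not taken from the repository.

```python
import json


def find_key(node, key):
    """Return the first value stored under `key` anywhere in nested JSON."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def add_poa_address(contracts, spec):
    """Return a copy of contracts with POA_ADDRESS set to spec's safeContract."""
    address = find_key(spec, "safeContract")
    if address is None:
        raise KeyError("safeContract not found in spec")
    updated = dict(contracts)
    updated["POA_ADDRESS"] = address
    return updated


def update_contracts_file(contracts_path, spec_path):
    """Read both files, add POA_ADDRESS, and write contracts.json back."""
    with open(contracts_path) as fh:
        contracts = json.load(fh)
    with open(spec_path) as fh:
        spec = json.load(fh)
    with open(contracts_path, "w") as fh:
        json.dump(add_poa_address(contracts, spec), fh, indent=2)
```

Failing loudly when safeContract is missing is a deliberate choice in this sketch, so that the network tests never start with an incomplete contracts.json.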
This bug is caused by a syntax error in the backend-selecting function. This:
- backend|bool == "true" and plan_result.stat.exists != "true"
Should be changed to
- backend | bool == true
Because of the comparison error, the local backend is used for the main infra instead of the remote one.
Will be fixed in an upcoming PR.
Error output
Backend reinitialization required. Please run "terraform init".
Reason: Initial configuration of the requested backend "azurerm"
The resource group role should have the following in main.yml:
prepare_resource_group: true
or
prepare_resource_group: false
Because the script is called improperly, it does not generate secrets properly. So this:
MOC_SECRET: "{{ moc_secret }}"
CERT_SECRET: "{{ cert_secret }}"
NETSTAT_SECRET: "{{ netstat_secret }}"
MASTER_OF_CEREMONY: "{{ MOC_ADDRESS }}"
should be changed to
MOC_SECRET: "{{ moc_secret | default ('') }}"
CERT_SECRET: "{{ cert_secret | default ('') }}"
NETSTAT_SECRET: "{{ netstat_secret | default ('') }}"
MASTER_OF_CEREMONY: "{{ MOC_ADDRESS | default ('') }}"
For now we have two variables, create_resource_group and resource_group_name. This complicates the configuration process. It might be a good idea to remove the create_resource_group variable and refactor the playbooks so that they check not the boolean create_resource_group variable but whether resource_group_name is empty. If resource_group_name is empty, it should be generated.
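The proposed check can be sketched in Python: treat an empty resource_group_name as the signal to generate one, so no separate boolean is needed. The prefix-plus-random-suffix naming scheme, the network_name parameter, and the function name are assumptions for illustration; the issue does not specify how the name should be generated.

```python
import uuid


def resolve_resource_group(resource_group_name, network_name="poa",
                           suffix=None):
    """Return (name, generated) for the resource group to use.

    A non-empty resource_group_name is used as-is; an empty or missing one
    triggers generation, replacing the create_resource_group boolean.
    """
    name = (resource_group_name or "").strip()
    if name:
        return name, False
    # Hypothetical naming scheme: '<network>-rg-<8 hex chars>'.
    if suffix is None:
        suffix = uuid.uuid4().hex[:8]
    return "{}-rg-{}".format(network_name, suffix), True
```

Returning the generated flag alongside the name lets the caller decide whether the group still has to be created or is expected to exist already.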
Please make vm_size and the resource group name configurable variables.
Is it possible to create and manage multiple deployments under the same Azure account? After I've created the first deployment, terraform plan returns "No changes. Infrastructure is up-to-date."
If yes, which parameter is used to separate them, and could it also be made configurable?
I was testing out the deployment-terraform with a simple 2 bootnode 3 validator setup. The preconf and build stages succeed in the CI, but the test stage fails.
Here is the log: https://circleci.com/gh/C4Coin/deployment-terraform/177
I'm not sure how to go about debugging this yet and any guidance would be appreciated.
The current organization of the provisioning code does not allow clear separation of node types. Terraform has a feature called modules that simplifies and structures the project.
Nothing is deployed by default. User adds and configures modules she needs.
The common module provisions shared infrastructure like the network. The user may use it, prefer to do this manually from the console, or use already existing infrastructure.
User adds modules based on which kind of instances she wants to deploy.
Each module may be configured separately so it is trivial to set different sizes on different types of nodes. Multiple modules may share some configuration like the type of the network they are connecting to.
Adding new node type will be just adding a new module.
Currently deployment automation uses a separate copy of deployment-playbooks, as the original version is focused on the AWS cloud. There is also the requirement that the user can run the playbooks without Terraform.
Maintaining two copies will require manual sync between them and is not convenient.
@igorbarinov @phahulin what are your thoughts about this issue?
Result: Terraform configures instances using code from the terraform-integration branch of the poanetwork/deployment-playbooks repository.
- terraform-integration from master in poanetwork/deployment-playbooks
- playbooks directory to the terraform-integration branch
- poanetwork/deployment-playbooks repo
- playbooks directory
As for #26, the MoC-preconf role logic is based on a number of shell tasks. Ansible is not designed to be a language with strong logic that allows a sequence of checks. I suppose it would be a better idea to create a single script (bash, node.js, etc.) that creates all the necessary variables for the network.
Please clarify SSH key usage for the case of a non-standard keypair name (not id_rsa). Is setting ansible_ssh_private_key_file, ssh_public_key and ssh_public_key_ansible in terraform.tfvars enough?
Also, there are some defaults that look like tests, e.g.
https://github.com/poanetwork/deployment-terraform/blob/master/terraform/azure/variables.tf#L8
After ceremony is complete, deploy DApps:
At the present moment dapps are not very automation-friendly, so this should be on hold for some time.
There is a bug where the user must explicitly specify when they would like to use the remote backend. The cause is the wrong conditionals used for choosing backends.
Current:
when backend == "true"
Correct:
when backend|bool == true
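The difference between the two conditions is easier to see outside of a playbook. The Python sketch below approximates the behaviour of Ansible's bool filter (to_bool is a simplified stand-in written for this example, not Ansible's own code) and compares both forms for a value that arrives as a real boolean and as a string.

```python
def to_bool(value):
    """Simplified approximation of Ansible's `bool` filter."""
    if value is None or isinstance(value, bool):
        return value
    if isinstance(value, str):
        value = value.lower()
    return value in ("yes", "on", "1", "true", 1)


def current_condition(backend):
    # Mirrors `when backend == "true"`: only the exact string matches.
    return backend == "true"


def corrected_condition(backend):
    # Mirrors `when backend|bool == true`.
    return to_bool(backend) == True  # noqa: E712


for backend in (True, "true", "True", False, "false"):
    print(repr(backend), current_condition(backend),
          corrected_condition(backend))
```

When backend is set to the boolean True or the string "True", the current condition evaluates to false while the corrected one evaluates to true, which matches the symptom of the remote backend not being selected.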