
agnosticd's People

Contributors

agonzalezrh, ahsen-shah, amayagil, bbethell-1, danieloh30, fridim, hguerrero, honghuac, jbride, jkupferer, juliaaano, makentenza, miteshget, mvazquezc, newgoliath, nonyno3lle, prakhar1985, rhjcd, rickgcv, ritesh97-rh, ritzshah, rut31337, sborenst, stencell, thoraxe, tonykay, treddy08, vincepower, wilson-walrus, wkulhanek


agnosticd's Issues

stages should have forced playbook names for include

@sborenst and I discussed this earlier. Instead of searching the folder for task files to include (which can't have hosts/groups specified), we should use a specific post_infra, pre_software, post_deploy set of files where the end user can then define all of the plays they need.
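A minimal sketch of what that could look like in main.yml (the stage file names come from the discussion above; everything else is illustrative, not the current layout):

```yaml
# Illustrative only: fixed stage playbooks instead of directory scanning.
# Each file can contain full plays with their own hosts/groups.
- include: "{{ playbook_dir }}/configs/{{ env_type }}/post_infra.yml"
- include: "{{ playbook_dir }}/configs/{{ env_type }}/pre_software.yml"
- include: "{{ playbook_dir }}/configs/{{ env_type }}/post_deploy.yml"
```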

fix generating node names in hosts templates

When building the scaleup playbook (and then the scaleup template for the hosts inventory file), we discovered that the current way of generating node names can lead to a mismatch:

 ## These are regular nodes
 {% for host in groups[('tag_' + env_type + '-' + guid + '_node') | replace('-', '_') ] %}
node{{loop.index}}.{{chomped_zone_internal_dns}} openshift_hostname=node{{loop.index}}.{{chomped_zone_internal_dns}}  ansible_ssh_user={{remote_user}} ansible_ssh_private_key_file=~/.ssh/{{key_name}}.pem openshift_node_labels="{'logging':'true','cluster': '{{guid}}', 'env':'app', 'zone': '{{hostvars[host]['ec2_placement']}}'}"
 {% endfor %}

Because the way Ansible retrieves the hosts in groups[] is not deterministic, node{{loop.index}}.{{chomped_zone_internal_dns}} can differ from the node's actual internaldns.

Instead of crafting the name from the loop index, we should use hostvars[host].ec2_tag_internaldns directly:

{% for host in groups[('tag_' + project_tag + '_node') | replace('-', '_') ] %}
{{ hostvars[host].ec2_tag_internaldns }} openshift_hostname={{ hostvars[host].ec2_tag_internaldns }} ansible_ssh_user={{remote_user}} ansible_ssh_private_key_file=~/.ssh/{{key_name}}.pem openshift_node_labels="{'logging':'true','cluster': '{{guid}}', 'env':'users', 'zone': '{{hostvars[host]['ec2_placement']}}'}"
{% endfor %}

collision errors during simultaneous deployments

Affects Ravello Deployments.

Control Path Collisions, as discussed here:

fatal: [node.rhpds.opentlc.com]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: OpenSSH_6.6.1, OpenSSL 1.0.1e-fips [date_redacted]
debug1: Reading configuration data /root/ansible_agnostic_deployer/ansible/workdir/ocp-demo-lab_david-an28_ssh_conf
debug1: auto-mux: Trying existing master
debug1: Control socket \"/tmp/node.rhpds.opentlc.com-cloud-user\" does not exist
debug1: Executing proxy command: exec ssh -i /root/ansible_agnostic_deployer/ansible/workdir/david-an28key -W node.rhpds.opentlc.com:22 -q cloud-user@bastionhost-ocpdemolab[id_redacted].srv.ravcloud.com
debug3: timeout: 60000 ms remain after connect
debug1: permanently_set_uid: 0/0 
debug3: Incorrect RSA1 identifier
debug3: Could not load \"/root/ansible_agnostic_deployer/ansible/workdir/david-an28key\" as a RSA1 public key
debug1: identity file /root/ansible_agnostic_deployer/ansible/workdir/david-an28key type 1
debug1: identity file /root/ansible_agnostic_deployer/ansible/workdir/david-an28key-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.6.1
debug1: permanently_drop_suid: 0
Connection timed out during banner exchange
",
    "unreachable": true
}

Bad URL Requests

An exception occurred during task execution. The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_OuAa2l/ansible_module_ravello_app.py", line 684, in <module>
    main()
  File "/tmp/ansible_OuAa2l/ansible_module_ravello_app.py", line 310, in main
    create_app_and_publish(client, module)
  File "/tmp/ansible_OuAa2l/ansible_module_ravello_app.py", line 652, in create_app_and_publish
    app = client.create_application(app)
  File "/usr/lib/python2.7/site-packages/ravello_sdk.py", line 539, in create_application
    return self.request('POST', '/applications', app)
  File "/usr/lib/python2.7/site-packages/ravello_sdk.py", line 359, in request
    response = self._request(method, path, body, headers)
  File "/usr/lib/python2.7/site-packages/ravello_sdk.py", line 430, in _request
    response.raise_for_status()
  File "/usr/lib/python2.7/site-packages/requests/models.py", line 893, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://cloud.ravellosystems.com/api/v1/applications

fatal: [localhost -> localhost]: FAILED! => {
    "changed": false, 
    "failed": true, 
    "invocation": {
        "module_name": "ravello_app"
    }, 
    "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_OuAa2l/ansible_module_ravello_app.py\", line 684, in <module>\n    main()\n  File \"/tmp/ansible_OuAa2l/ansible_module_ravello_app.py\", line 310, in main\n    create_app_and_publish(client, module)\n  File \"/tmp/ansible_OuAa2l/ansible_module_ravello_app.py\", line 652, in create_app_and_publish\n    app = client.create_application(app)\n  File \"/usr/lib/python2.7/site-packages/ravello_sdk.py\", line 539, in create_application\n    return self.request('POST', '/applications', app)\n  File \"/usr/lib/python2.7/site-packages/ravello_sdk.py\", line 359, in request\n    response = self._request(method, path, body, headers)\n  File \"/usr/lib/python2.7/site-packages/ravello_sdk.py\", line 430, in _request\n    response.raise_for_status()\n  File \"/usr/lib/python2.7/site-packages/requests/models.py\", line 893, in raise_for_status\n    raise HTTPError(http_error_msg, response=self)\nrequests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://cloud.ravellosystems.com/api/v1/applications\n", 
    "module_stdout": "", 
    "msg": "MODULE FAILURE"
}

use ec2.ini to filter and reduce dynamic inventory based on Stack tag

In some configs, we could update the inventory/ec2.ini file:

regions= ${REGION}
instance_filters=tag:Stack=project ${project_tag}

This acts as a fail-safe and also improves the overall performance of the playbooks:

#before
[user@work ansible]$ ./inventory/ec2.py --list|wc -l
17106

#after
[user@work ansible]$ ./inventory/ec2.py --list|wc -l
508

potential future problem -- keep wait_for_ssh?

I'm not sure that this works currently as we think it does.

I'd have to double check with -vvv to see whether the SSH connection is being forced through the proxy. If it's making a direct connection to the inventory hosts, that would work for me, but it would potentially fail in other environments where only the bastion host has SSH open...
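One way to make the check go through the bastion explicitly would be to run wait_for from the bastion itself. A hedged sketch (the group and variable names here are assumptions, not the repo's actual inventory layout):

```yaml
# Sketch: run the reachability check from the bastion so it still works
# when only the bastion has SSH open. Group/var names are hypothetical.
- hosts: bastions
  tasks:
    - name: Wait for sshd on every node, checked from the bastion
      wait_for:
        host: "{{ hostvars[item].private_ip_address }}"
        port: 22
        timeout: 300
      loop: "{{ groups['nodes'] }}"
```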

Better / easier approach to develop and test a lab incrementally

Perhaps we just didn't have enough time to do it correctly, but it proved quite painful, slow, and error prone to further develop a lab without access to the control host setting up the lab.
It became a bit easier to test incremental changes to my branch after I created a local inventory. There are surely better ways to do this; my current setup is uploaded to https://github.com/ericzolf/prepare-dev-env-aad-cicd, and we should at least discuss and decide how best to do this in the future.

common role should be split to subs, repos, and then common setup

Subscription manager under Ansible is really finicky, so it would be nice to run those steps serially. However, if the current common role is run serially, it takes an extremely long time, because both package updates and installs are included in it.

I think it would be cleaner if we had a subs role, a repos role, and a common role:

subs - use subscription manager to register and attach systems (ignored in some cases)
repos - use either yum repo config or subscription manager to configure repos (depending on config)
common - install common packages and potentially apply updates

This would allow the "subs" role to be serialized to improve performance, and then the other two roles could remain in parallel.

Thoughts?
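The proposed split could look roughly like this in the top-level playbook (the role names are the ones proposed above; everything else is an illustrative sketch):

```yaml
# Illustrative: only the subscription step is serialized.
- hosts: all
  serial: 1            # subscription-manager is finicky under parallel runs
  roles:
    - subs             # register and attach systems

- hosts: all           # back to default parallelism
  roles:
    - repos            # yum repo config or subscription-manager repos
    - common           # common packages and optional updates
```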

ocp-workload-bxms-pam too slow to provision

jbride [9:21 AM]
Hi
Appears we have a problem with the following ansible: ocp-workload-bxms-pam . In particular, the pods are taking forever to start up. they are very slow. (edited)
Looks like we are utilizing a stock PAM 7 template and placing the DCs in an initial paused state ..... nice.
Also, there are memory limits defined .... nice.
however, the template does not specify a cpu request for any DC. Subsequently, the cpu request default provided by a LimitRange applied to the OCP project is being used. This LimitRange is utilized by all OCP projects. And, it defines a very low cpu request: 50 millis (ie: 1/20th of a CPU).

So it seems we should specify a cpu request in each of our DCs that is more like 500 millicores (or maybe even a full core)
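For reference, a DC fragment with an explicit CPU request might look like this (the container name and values are illustrative, not taken from the stock PAM 7 template):

```yaml
# Illustrative DeploymentConfig fragment: an explicit cpu request
# overrides the project LimitRange default of 50m.
spec:
  template:
    spec:
      containers:
        - name: kieserver        # hypothetical container name
          resources:
            requests:
              cpu: 500m
            limits:
              memory: 2Gi
```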

Remove ocp-workshop project

The ocp-workshop project with Nexus should not be part of the default playbooks; it should instead be an extra role that one chooses to run.

deploy bu_workshop fails at step "Add all hosts to workdir ssh config file"

TASK [Add all hosts to workdir ssh config file] *****************************************************************
fatal: [localhost]: FAILED! => {"failed": true, "msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'private_ip_address'\n\nThe error appears to have been in '/home/bent/ansible_agnostic_deployer/ansible/cloud_providers/common_ssh_config_setup.yml': line 63, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n - name: Add all hosts to workdir ssh config file\n ^ here\n"}

post playbook should be specified by env var

If you need post processing, simply providing a single env var to load a single playbook would fix all of our problems.

As the last line in main.yml:

- include: "{{playbook_dir}}/configs/{{env_type}}/{{ post_play }}"

The other option is to simply require a single post deployment play:

- include: "{{playbook_dir}}/configs/{{env_type}}/post_software.yml"

This way the user is free to go hog wild with whatever they want. It also gets us around the difficulty of not being able to dynamically include plays with specific hosts and not being able to specify hosts in task file includes.
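The two options could even be combined by giving the variable a default (a sketch, assuming post_play is passed as an extra var):

```yaml
# Sketch: optional post play with a fallback to post_software.yml.
- include: "{{ playbook_dir }}/configs/{{ env_type }}/{{ post_play | default('post_software.yml') }}"
```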

AdmissionConfig ImagePolicy is not configured

This needs to be configured in the cluster (ocp-workshop) for proper imagestream resolution on native Kubernetes objects (https://docs.okd.io/latest/admin_guide/image_policy.html#image-policy-configuring-the-image-policy-admission-plug-in).

The current version (3.10.14) gets its configuration from an environment variable. (https://github.com/sborenst/ansible_agnostic_deployer/blob/master/ansible/configs/ocp-workshop/files/hosts_template.3.10.14.j2#L125-L131)

The default value for the admissionPlugin in that environment variable is not properly set (https://github.com/sborenst/ansible_agnostic_deployer/blob/master/ansible/configs/ocp-workshop/env_vars.yml#L243-L253)

In previous releases (3.9.40) this configuration is commented out (https://github.com/sborenst/ansible_agnostic_deployer/blob/master/ansible/configs/ocp-workshop/files/hosts_template.3.9.40.j2#L120-L123), but applied in other previous releases like 3.9.31 (https://github.com/sborenst/ansible_agnostic_deployer/blob/master/ansible/configs/ocp-workshop/files/hosts_template.3.9.31.j2#L108)

Can we have this configuration consistently applied in every release? This is required.

nfs tasks role(s) need to be idempotent

I think right now, if you run things twice, the NFS tasks fail because some of the commands are not idempotent. We probably need to add checks so that the non-idempotent NFS tasks are skipped on re-runs.
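A typical pattern for guarding a non-idempotent command is `creates:` (or a `stat` pre-check). A hedged sketch; the path here is hypothetical, not taken from the actual NFS tasks:

```yaml
# Illustrative: skip the command when its effect is already present,
# making the task safe to run twice.
- name: Create the NFS export directory only once
  command: mkdir -p /srv/nfs/user-vols   # hypothetical path
  args:
    creates: /srv/nfs/user-vols
```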

3scale link in main readme is broken

Broken link in:

* link:./ansible/roles/ocp-workload-3scale-multitenant/readme.adoc[OpenShift 3Scale
 Deployment] - Want to deploy a workload onto your existing OpenShift Cluster? 
  or local instance running on your laptop?  3Scale is an example of one of
   around *30* OpenShift workloads ready to go.

Get rid of result|succeed filter warning

Describe the bug
Ansible 2.9 will remove the result|succeeded filter syntax in favor of result is succeeded / result is not succeeded, so there are a lot of warnings while deploying any config.

To Reproduce

  1. Deploy any config that uses that filter.

Expected behavior
Clean Ansible output.

Screenshots / logs
[DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of using result|succeeded use result is succeeded. This feature will be removed in version 2.9. Deprecation warnings can be
disabled by setting deprecation_warnings=False in ansible.cfg.
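The fix itself is mechanical, for example:

```yaml
# Before (deprecated filter-as-test syntax, removed in Ansible 2.9):
- debug:
    msg: "deployment finished"
  when: result|succeeded

# After:
- debug:
    msg: "deployment finished"
  when: result is succeeded
```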

Versions (please complete the following information):

  • OS: OSX

  • AAD version (git log --pretty=oneline -2)
    d90e918 (HEAD -> fix_RHS_url, origin/fix_RHS_url) Remove https:// at satellite.example.com entries
    9976f07 (upstream/development, origin/development, origin/HEAD, development) OCP4: bump version to 0.12 and fix aws permission issue for IAM user

  • Ansible (ansible --version)
    ansible 2.7.8

  • cloud provider CLI (aws-cli, azure, ..)
    AWS

add ldap provider variables

Currently hardcoded in the hosts template:

{% if install_idm == "ldap" %}
openshift_master_identity_providers=[{'name': 'ldap', 'challenge': 'true', 'login': 'true', 'kind': 'LDAPPasswordIdentityProvider','attributes': {'id': ['dn'], 'email': ['mail'], 'name': ['cn'], 'preferredUsername': ['uid']}, 'bindDN': 'uid=ose-mwl-auth,cn=users,cn=accounts,dc=opentlc,dc=com', 'bindPassword': '{{bindPassword}}', 'ca': 'ipa-ca.crt','insecure': 'false', 'url': 'ldaps://ipa1.opentlc.com:636/cn=users,cn=accounts,dc=opentlc,dc=com?uid'}]
{{openshift_master_ldap_ca_file}}
{% endif %}

We should provide variables for these values instead.
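As a starting point, the hardcoded values could become variables along these lines (the variable names are suggestions, not existing vars; the defaults are lifted from the template above):

```yaml
# Suggested variables; defaults taken from the hardcoded template.
ldap_url: ldaps://ipa1.opentlc.com:636/cn=users,cn=accounts,dc=opentlc,dc=com?uid
ldap_bind_dn: uid=ose-mwl-auth,cn=users,cn=accounts,dc=opentlc,dc=com
ldap_bind_password: "{{ bindPassword }}"
ldap_ca_file: ipa-ca.crt
```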

Default ocp-workload-terminal fails late because of undefined variable

Describe the bug
When deploying ocp4-workshop, the playbook fails at the end if the variable admin_password is not set.

TASK [ocp-workload-terminal : Create OpenShift objects for Terminal workload] ***
Friday 12 April 2019  12:43:51 +0000 (0:00:00.061)       0:40:38.484 ********** 
changed: [clientvm.e435.internal] => (item=./templates/project.j2)
changed: [clientvm.e435.internal] => (item=./templates/service.j2)
changed: [clientvm.e435.internal] => (item=./templates/service_account.j2)
changed: [clientvm.e435.internal] => (item=./templates/role_binding.j2)
fatal: [clientvm.e435.internal]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'admin_password' is undefined\n\nThe error appears to have been in '/tmp/ocp4-workshop-e435/ansible_agnostic_deployer/ansible/roles/ocp-workload-terminal/tasks/workload.yml': line 8, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Create OpenShift objects for Terminal workload\n  ^ here\n"}

Expected behavior

  • The password should be generated and printed using user.info
  • Or the playbook should fail early with an explicit error message to let the user know that the variable is missing.
  • Or the workload should not be a default workload.
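The fail-early option could be a simple assert at the top of the workload tasks. A sketch; only the admin_password variable name comes from the report:

```yaml
# Sketch: fail early with an explicit message instead of 40 minutes in.
- name: Verify required variables for the terminal workload
  assert:
    that:
      - admin_password is defined
    msg: "admin_password must be set for ocp-workload-terminal"
```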

[RFE] mixed-mode repos/subscriptions/etc

Right now it's kind of all-or-nothing. You either use your own repo files, or you use RHN, but you can't apparently use a mix of both.

For example, I'd like to be able to use pre-release which requires a specific repo file, but I want to get the "rest" of RHEL from Amazon's RHUI. Right now we go in and nuke all existing repos and then only lay down our own repo file.

I'd have to basically extract the RHUI repo configs to put into the open.repo file. It's totally doable. It's just not preferred.

Development services in RHPDS based on AAD should have a parameter for a branch

Currently, you can't really test a change before you've pushed it to the development branch, meaning that everybody and all services in development are potentially impacted until the issue is fixed.

If development services would have a parameter in which you could enter the name of your development branch, you could order your service so that it is provisioned based on this branch and nobody but you would be impacted if something is wrong.

Download of AWS CLI unsuccessful

Received the following error message while attempting to download AWS CLI:

[root@doubleh tmp]# curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-bundle.zip"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:28 --:--:--     0
curl: (52) Empty reply from server

switch to file-based SSH config implementation

We ran into an issue with SSH quoting and related problems when testing the bu-workshop playbooks in the OpenTLC environment. I believe that switching "back" to the file-based SSH config implementation may resolve those SSH issues.

I was doing this here, but it somehow did not make it into the current master:

https://github.com/thoraxe/ansible_agnostic_deployer/blob/cns/ansible/main.yml#L43-L76

Then, the ssh_vars file simply has:

ansible_ssh_extra_args: "-F /tmp/{{ env_type }}_{{ guid }}_ssh_conf"

Although I believe you wanted the file in "workdir".

The one caveat is that the cleanup should be amended to delete this ssh config as well (make sure the file is absent).
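That cleanup amendment could be as small as the following task (the path pattern comes from the ssh_vars line above):

```yaml
# Sketch: remove the generated per-environment ssh config during cleanup.
- name: Remove per-environment ssh config file
  file:
    path: "/tmp/{{ env_type }}_{{ guid }}_ssh_conf"
    state: absent
```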

satellite var misdocumented in ocp-workshop env_vars.yml

Where it reads:

#If using repo_method: satellite, you must set these values as well.

satellite_url: https://satellite.example.com

It should read:
#If using repo_method: satellite, you must set these values as well.

satellite_url: satellite.example.com

or the repo setup task fails with:

fatal: [bastion.guid.internal]: FAILED! => changed=false msg: 'invalid literal for int() with base 10: '

Installing ocp-ha-labs into own AWS account working, but painful.

Describe the bug
There are issues installing OCP 3.11 into own AWS account.

To get OCP 3.11 to install into my own AWS account without any issues, these are the settings/changes I needed to make:

Here is the script I used:

# Generic Vars
ENVTYPE="ocp-ha-lab"
GUID=test1 

# Cloud Provider Settings
KEYNAME=my-key 
#REGION=ap-southeast-1
REGION=us-east-2
CLOUDPROVIDER=ec2
HOSTZONEID='XXXXYYYZZZZ'
BASESUFFIX='.paas.example.com'

# OCP Vars
NODE_COUNT=1
REPO_VERSION=3.11
OSRELEASE=3.11.51

ansible-playbook ansible/main.yml \
  -e "guid=${GUID}" \
  -e "env_type=${ENVTYPE}" \
  -e "osrelease=${OSRELEASE}" \
  -e "repo_version=${REPO_VERSION}" \
  -e "cloud_provider=${CLOUDPROVIDER}" \
  -e "aws_region=${REGION}" \
  -e "HostedZoneId=${HOSTZONEID}" \
  -e "key_name=${KEYNAME}" \
  -e "subdomain_base_suffix=${BASESUFFIX}" \
  -e "node_instance_count=${NODE_COUNT}" \
  -e "software_to_deploy=openshift" \
  -e "[email protected]" -e"output_dir=../workdir" -e"output_dir=../workdir" \
  -e@../secret.yml \
  -e "install_idm=htpasswd" \
  -e "repo_method=rhn" \
  -e "use_subscription_manager=true" \
  -e "master_instance_count=3" \
  -e "infranode_instance_count=1" \
  -e "opentlc_integration=false" \
   -e "use_own_repos=false" \

I made the following changes to the ansible hosts template:

  • Added "package_version" to the checks due to error about higher rpm versions
  • Deactivated the LDAP settings
  • Activated the htpasswd settings
  • Deactivated the ipa-ca.crt path

Here is a diff of the changes I made to the ansible hosts template,
"ansible/configs/ocp-ha-lab/files/labs_hosts_template.3.11.51.j2":

$ diff hosts_template.3.11.51.j2 hosts_template.3.11.51.j2.orig
20c20
< openshift_disable_check="disk_availability,memory_availability,docker_image_availability,package_version"
---
> openshift_disable_check="disk_availability,memory_availability,docker_image_availability"
156c156
< # SJB openshift_master_identity_providers=....
---
> openshift_master_identity_providers=[{'name': 'ldap', ....
159,160c159
< openshift_master_identity_providers=[{'name': 'htpasswd_auth', ...
< # SJB - uncommented 
---
> # openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
165c164
< # SJB openshift_master_ldap_ca_file=/root/ipa-ca.crt
---
> openshift_master_ldap_ca_file=/root/ipa-ca.crt

Intermittent Bad Requests for URL from Ravello

The following error occurs occasionally:

An exception occurred during task execution. The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_OuAa2l/ansible_module_ravello_app.py", line 684, in <module>
    main()
  File "/tmp/ansible_OuAa2l/ansible_module_ravello_app.py", line 310, in main
    create_app_and_publish(client, module)
  File "/tmp/ansible_OuAa2l/ansible_module_ravello_app.py", line 652, in create_app_and_publish
    app = client.create_application(app)
  File "/usr/lib/python2.7/site-packages/ravello_sdk.py", line 539, in create_application
    return self.request('POST', '/applications', app)
  File "/usr/lib/python2.7/site-packages/ravello_sdk.py", line 359, in request
    response = self._request(method, path, body, headers)
  File "/usr/lib/python2.7/site-packages/ravello_sdk.py", line 430, in _request
    response.raise_for_status()
  File "/usr/lib/python2.7/site-packages/requests/models.py", line 893, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://cloud.ravellosystems.com/api/v1/applications

fatal: [localhost -> localhost]: FAILED! => {
    "changed": false, 
    "failed": true, 
    "invocation": {
        "module_name": "ravello_app"
    }, 
    "module_stderr": "Traceback (most recent call last):\n  File \"/tmp/ansible_OuAa2l/ansible_module_ravello_app.py\", line 684, in <module>\n    main()\n  File \"/tmp/ansible_OuAa2l/ansible_module_ravello_app.py\", line 310, in main\n    create_app_and_publish(client, module)\n  File \"/tmp/ansible_OuAa2l/ansible_module_ravello_app.py\", line 652, in create_app_and_publish\n    app = client.create_application(app)\n  File \"/usr/lib/python2.7/site-packages/ravello_sdk.py\", line 539, in create_application\n    return self.request('POST', '/applications', app)\n  File \"/usr/lib/python2.7/site-packages/ravello_sdk.py\", line 359, in request\n    response = self._request(method, path, body, headers)\n  File \"/usr/lib/python2.7/site-packages/ravello_sdk.py\", line 430, in _request\n    response.raise_for_status()\n  File \"/usr/lib/python2.7/site-packages/requests/models.py\", line 893, in raise_for_status\n    raise HTTPError(http_error_msg, response=self)\nrequests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://cloud.ravellosystems.com/api/v1/applications\n", 
    "module_stdout": "", 
    "msg": "MODULE FAILURE"
}

Usually, running the deployment again will overcome it.

TODO:

  • Verify requests are being sent correctly during these cases
  • See if implementing play retries fixes the issue

Hardcoded Satellite Tools Repo Version

@fridim The Satellite Tools version is hardcoded in a command in two files:

https://github.com/sborenst/ansible_agnostic_deployer/blob/ocp-workshop-prod-1.18/ansible/roles/satellite-repositories/tasks/main.yml

https://github.com/sborenst/ansible_agnostic_deployer/blob/ocp-workshop-prod-1.18/ansible/roles/set-repositories/tasks/satellite-repos.yml

Should we make it part of rhel_repos and, when repo_method == satellite, pick the version up from somewhere else? It's not frequently updated, but it's not OK that it's hardcoded.
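One possible shape for that (the variable name and version value are illustrative, not existing vars):

```yaml
# Illustrative: lift the hardcoded version into a variable so a
# repo_method == satellite config picks it up in one place.
satellite_tools_version: "6.3"   # hypothetical value
rhel_repos:
  - "rhel-7-server-satellite-tools-{{ satellite_tools_version }}-rpms"
```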

scripts folder unneeded?

The inventory script exists in inventory (where it is actively used) and in scripts.

I think we can delete the scripts folder?

Jenkins fails due to old image version 3.7

Describe the bug
Jenkins fails in the multi-product demos due to the old Jenkins image version (3.7).

To Reproduce
Steps to reproduce the behavior, ex:
Deploy the MSAC

Expected behavior
Jenkins is not provisioned in the CI/CD project.

get rid of warnings

[DEPRECATION WARNING]: 'include' for playbook includes. You should use 'import_playbook' instead. This feature will be removed in version 2.8. Deprecation warnings can be disabled by setting
 deprecation_warnings=False in ansible.cfg.                                                                                                                                                   
[DEPRECATION WARNING]: ec2_remote_facts is kept for backwards compatibility but usage is discouraged. The module documentation details page may explain more about this rationale.. This      
feature will be removed in a future release. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.                                                       
[DEPRECATION WARNING]: The use of 'include' for tasks has been deprecated. Use 'import_tasks' for static inclusions or 'include_tasks' for dynamic inclusions. This feature will be removed in
 a future release. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.                                                                                 
[DEPRECATION WARNING]: include is kept for backwards compatibility but usage is discouraged. The module documentation details page may explain more about this rationale.. This feature will  
be removed in a future release. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.                                                                    
 [WARNING]: Found variable using reserved name: remote_user
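Most of these are mechanical renames, e.g. for the playbook-include warning:

```yaml
# Before (deprecated, removed in Ansible 2.8):
- include: "configs/{{ env_type }}/pre_infra.yml"

# After:
- import_playbook: "configs/{{ env_type }}/pre_infra.yml"
```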

Intermittent Failure due to Control Path Error

Ansible occasionally fails when attempting to connect to hosts on Ravello due to a control socket error. It seems to affect hosts individually: some hosts are affected and not others, and not always the same ones, so the failure is not isolated to a particular play or playbook.

The error only occurs sometimes, and the same environments will usually run correctly without any changes to vars or playbooks.

fatal: [node.rhpds.opentlc.com]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: OpenSSH_6.6.1, OpenSSL 1.0.1e-fips [date_redacted]
debug1: Reading configuration data /root/ansible_agnostic_deployer/ansible/workdir/ocp-demo-lab_david-an28_ssh_conf
debug1: auto-mux: Trying existing master
debug1: Control socket \"/tmp/node.rhpds.opentlc.com-cloud-user\" does not exist
debug1: Executing proxy command: exec ssh -i /root/ansible_agnostic_deployer/ansible/workdir/david-an28key -W node.rhpds.opentlc.com:22 -q cloud-user@bastionhost-ocpdemolab[id_redacted].srv.ravcloud.com
debug3: timeout: 60000 ms remain after connect
debug1: permanently_set_uid: 0/0 
debug3: Incorrect RSA1 identifier
debug3: Could not load \"/root/ansible_agnostic_deployer/ansible/workdir/david-an28key\" as a RSA1 public key
debug1: identity file /root/ansible_agnostic_deployer/ansible/workdir/david-an28key type 1
debug1: identity file /root/ansible_agnostic_deployer/ansible/workdir/david-an28key-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.6.1
debug1: permanently_drop_suid: 0
Connection timed out during banner exchange
",
    "unreachable": true
}

The keys that it cannot load are correctly formatted and located in the same place as in successful deployments.
