acozine / sufia-centos

ansible playbook for sufia install on centos 7

Shell 72.55% Ruby 27.45%

sufia-centos's People

mark-dce dunn jcoyne ilessing notch8

sufia-centos's Issues

Add default value for marmotta_256

In marmotta/defaults/main.yml add a value for the sha256 for the marmotta download

add timestamps to template-created files

Add a line to all templates that will add a timestamp when they are created or modified using Ansible.

Build Error [services | download fits zip]: SSL certificate validation error for fits.googlecode.com download

As DCE and WashU both use CentOS 7 environments, I'm posting a record of this error, a workaround, and a description of what might be necessary for a longer-term solution.

failed: [hydra-head] => {"failed": true}
msg: Failed to validate the SSL certificate for fits.googlecode.com:443. Make sure your managed systems have a valid CA certificate installed. If the website serving the url uses SNI you need python >= 2.7.9 on your managed machine. You can use validate_certs=False if you do not need to confirm the server's identity but this is unsafe and not recommended Paths checked for this platform: /etc/ssl/certs, /etc/pki/ca-trust/extracted/pem, /etc/pki/tls/certs, /usr/share/ca-certificates/cacert.org, /etc/ansible

FATAL: all hosts have already failed -- aborting

Here's a work-around:

[ansible]/roles/services/tasks/fits.yml:8

Original line:
get_url: url=https://fits.googlecode.com/files/fits-0.6.2.zip owner={{ install_user }} group={{ install_group }} dest={{ install_path }}/fits-0.6.2.zip

With the workaround (validate_certs=no appended):
get_url: url=https://fits.googlecode.com/files/fits-0.6.2.zip owner={{ install_user }} group={{ install_group }} dest={{ install_path }}/fits-0.6.2.zip validate_certs=no

A more appropriate solution would be to upgrade CentOS 7's Python from 2.7.5 to >=2.7.9; however, 2.7.5 is the latest version available through Yum. Installing Python >=2.7.9 locally (/usr/local/bin) and changing symlinks or alternatives breaks the Yum installer (and other things, from what I've read). Adding an alias for "python" in the .bashrc of the "install" user, so that "python" points to the /usr/local/bin installation, also doesn't work. The Ansible task uses "become: yes", which I believe means it runs in root's environment and so bypasses the "install" user's alias for the local Python 2.7.9 install.
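
Before falling back to validate_certs=no, it can help to confirm that the managed host's Python really is below the 2.7.9 threshold named in the error message. The following is a minimal sketch, assuming GNU sort (for the -V version sort) and a python binary on the PATH; version_ge is a hypothetical helper name, not something in the playbook.

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A is >= dotted version B.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Compare the host's Python against the threshold quoted in the Ansible error.
# `python -V` prints to stderr on Python 2, hence the 2>&1.
if command -v python >/dev/null 2>&1; then
    current=$(python -V 2>&1 | awk '{print $2}')
    if version_ge "$current" 2.7.9; then
        echo "python $current meets the 2.7.9 threshold"
    else
        echo "python $current is older than 2.7.9; expect the SSL validation error"
    fi
fi
```

Run on the managed machine (not the control machine), since the error message refers to the Python that executes the get_url module.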

format volume for /opt housekeeping failing again.

We built a brand new box using the centos 7 branch of Mark's mark-dce/vagrant-centos.git, and we are again getting this error:

TASK: [vagrant | temporarily move Guest Additions] **************************** 
changed: [default]

TASK: [vagrant | format volume for /opt] ************************************** 
failed: [default] => {"err": "mkfs.xfs: cannot open /dev/sdc: Permission denied\n", "failed": true, "rc": 1}
msg: Creating filesystem xfs on device '/dev/sdc' failed

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
          to retry, use: --limit @/Users/nlaviola/vagrant.retry

default                    : ok=2    changed=1    unreachable=0    failed=1   

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

Let's talk and see if we can figure out how this has come back.

move alexandria-v2 role to ADRL repository

Since it's all ADRL-specific stuff anyway, we'd like to migrate the alexandria-v2 role to ansible/ in the ADRL repository. Since our playbooks are already in there anyway, the current workflow is to symlink the roles from this repository into that directory. This would let that remain the same, while letting sufia-centos retain its generality; in the future we would be able to add more roles to that directory if we needed to.

cc @ilessing @johnajao

use ansible multiline format instead of \ for line continuation

I find that this format:

module:
  first_parameter: string
  second_parameter: some_long_string_with_{{ a_variable }} 
  third_parameter: some_other_string

makes the scripts easier to read than this format:

module: first_parameter=string second_parameter=some_long_string_with_{{ a_variable }} \
third_parameter=some_other_string

Add a name line to every task

Name lines can be repetitive, but they help make the roles more readable. A few tasks throughout the repo lack one, e.g. the first task in roles/alexandria-v2/tasks/adrl-dm.yml. Find them all and add names.

Add notes about using ansible.start_at_task

I've gotten timeouts a few times when building a server - it would be good to show people how to restart at the failed step:

ansible.start_at_task = "ffmpeg | download libmp3lame source"

when you get something like

TASK: [ffmpeg | download libmp3lame source] *********************************** 
failed: [default] => {"dest": "/opt/install/ffmpeg_sources/lame-3.99.5.tar.gz", "failed": true, "response": "Connection failure: timed out", "state": "absent", "status_code": -1, "url": "http://downloads.sourceforge.net/project/lame/lame/3.99/lame-3.99.5.tar.gz"}
msg: Request failed

FATAL: all hosts have already failed -- aborting

Vagrant branch needs sudo for /opt format and mounting tasks

I'm trying to run the vagrant branch to set up a machine and it fails with

TASK: [vagrant | format volume for /opt] ************************************** 
failed: [default] => {"err": "mkfs.xfs: cannot open /dev/sdc: Permission denied\n", "failed": true, "rc": 1}
msg: Creating filesystem xfs on device '/dev/sdc' failed

FATAL: all hosts have already failed -- aborting

As far as I can tell, the "format volume for /opt" and "mount volume at /opt" tasks need to be run with elevated privileges, but I don't see that happening anywhere in vagrant_housekeeping.yml, vagrant/main.yml, or the vagrant role in vagrant.yml.

Should I not be running the vagrant branch?

add config for uploads and derivatives

Override the curation concerns defaults for the derivatives and uploads directories - this is already done in alexandria, but it is not project-specific.
See curationexperts/alexandria-legacy#635

set redis and resque-pool to restart on reboot

Test all chkconfig settings with a build-then-reboot when this is done.

install redis from source

Current centos package redis is 2.8.19, which has known bugs. Upgrade to at least 3.1.

Fedora sha check fails

The hydra-stack role now fails with:

msg: The SHA-256 checksum for /opt/install/fcrepo-webapp-4.5.0.war did not match b9781a38227b71bb2f61d09938f15d5621d57edcfb94db5e9524abcac6984b14; it was f9c778c4285dcbb7ab228f43e5e30ea7511a5553ca52a34e274e1e390cb5187e.

When I look at the downloads page https://github.com/fcrepo4/fcrepo4/releases I only see SHA1 and MD5 hashes.
@dunn - where did you get the SHA-256 value?
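
To see which side of the mismatch is wrong, the downloaded file can be hashed by hand on the target host and compared against both values in the message. This is a minimal sketch, assuming sha256sum from coreutils is installed; sha256_matches is a hypothetical helper, not part of the repo.

```shell
#!/bin/sh
# sha256_matches FILE EXPECTED_HEX: succeed when FILE's SHA-256 equals EXPECTED_HEX.
sha256_matches() {
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$2" ]
}

# Example, using the path and expected value from the error message above:
#   sha256_matches /opt/install/fcrepo-webapp-4.5.0.war \
#       b9781a38227b71bb2f61d09938f15d5621d57edcfb94db5e9524abcac6984b14 \
#       && echo match || echo mismatch
```

If a fresh download hashes to the same "it was" value every time, the expected value in the role is the likelier culprit; if it varies, the download itself is suspect.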

build ffmpeg from a stable version

What's the reason for building the HEAD of the ffmpeg repository instead of the latest stable version?

configure ffmpeg failed, ran fine on second try

running vagrant provision on single_user branch

Vagrantfile (excerpt)

  # Enable provisioning with Ansible
   config.vm.provision "ansible" do |ansible|
     #ansible.verbose = 'vvv'
     ansible.groups = {
       "vagrant" => ["default"],
       "all_groups:children" => ["group1"],
     }
     ansible.extra_vars = {
       deploy_user: "vagrant",
       server_name: "localhost",
       rails_env: "development",
       bundle_path: "~/.bundle"
     }
     ansible.playbook = "provisioning/vagrant-develop.yml"
   end

Got this error

TASK: [ffmpeg | configure ffmpeg] ********************************************* 
failed: [default] => {"changed": true, "cmd": "cd /opt/install/ffmpeg_sources/ffmpeg && ./configure --extra-libs=\"-ldl\" --enable-gpl --enable-nonfree --enable-libfdk_aac --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264", "delta": "0:00:04.612339", "end": "2015-11-13 17:56:09.910513", "rc": 1, "start": "2015-11-13 17:56:05.298174", "warnings": []}
stdout: ERROR: opus not found using pkg-config

If you think configure made a mistake, make sure you are using the latest
version from Git.  If the latest version fails, report the problem to the
ffmpeg-user@ffmpeg.org mailing list or IRC #ffmpeg on irc.freenode.net.
Include the log file "config.log" produced by configure as this will help
solve the problem.

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
           to retry, use: --limit @/Users/mark/vagrant-develop.retry

default                    : ok=129  changed=117  unreachable=0    failed=1   

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

added ansible.start_at_task = "ffmpeg | configure ffmpeg" and re-ran vagrant provision (no other changes)

MARKs-MB:wustl-dev mark$ vagrant provision
==> default: Running provisioner: ansible...

PLAY [provision basic vagrant box] ******************************************** 

GATHERING FACTS *************************************************************** 
ok: [default]

TASK: [ffmpeg | configure ffmpeg] ********************************************* 
changed: [default]

TASK: [ffmpeg | make ffmpeg] **************************************************

etc.
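
Since the same task succeeded on an unchanged second run, a small retry wrapper around the provisioning command is one way to work around the failure while the root cause is investigated. The sketch below is an assumption about how one might do that from the host shell; retry is a hypothetical helper, and the attempt count and delay are arbitrary.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or ATTEMPTS runs have failed.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example: give provisioning a second chance after a 5 second pause.
#   retry 2 5 vagrant provision
```

Each attempt replays the playbook from wherever the Vagrantfile tells it to start, so this relies on the tasks being safe to re-run.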

Add a placeholder folder for the jdk file

If you want to have folks put the jdk in ./roles/hydra-stack/files it might be nice to put a dummy directory there in the repo. I tried copying the file there from my downloads before realizing that I had to create the directory first.

refine jar removal for tomcat marmotta

The current implementation (Ansible waits for the file to appear, then removes it, then uses touch to create an empty file with the same name so future runs don't fail on the wait task) feels unwieldy. Explore other options - a straightforward wait delay, or . . . ?

Update README to describe deploy task

Document three user options:

do not use Capistrano
use Capistrano without self-deploy
use Capistrano with self-deploy

Should some roles be moved to Ansible Galaxy?

Related to #59 and #77, and the discussion we had once with Scott about levels of generality, should some of the very generic roles be moved into separate repositories and included here using ansible-galaxy? Roles that this might make sense for would be, IMO:

ffmpeg
imagemagick
ruby
passenger
automount

And of course we might find existing roles ~~in the galaxy~~ that do a better job, in which case we can depend on those instead.

error at task: download ruby

I pulled the latest code from the repo's vagrant branch, attempted to vagrant up, and got a new error:

TASK: [vagrant | format volume for /opt] ************************************** 
changed: [default]

TASK: [vagrant | mount volume at /opt] **************************************** 
changed: [default]

TASK: [vagrant | restore Guest Additions] ************************************* 
changed: [default]

TASK: [ruby | remove package ruby] ******************************************** 
ok: [default]

TASK: [ruby | download ruby] ************************************************** 
failed: [default] => {"failed": true}
msg: Destination /opt/install not writable

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
           to retry, use: --limit @/Users/ilessing/vagrant.retry

default                    : ok=17   changed=13   unreachable=0    failed=1

When I look at /opt in the VM, I see it is owned by root and there is no /opt/install directory:

$ ls -ld /opt
drwxr-xr-x 3 root root 37 Jul 28 18:57 /opt

$ ls -la /opt
total 4
drwxr-xr-x   3 root root   37 Jul 28 18:57 .
dr-xr-xr-x. 19 root root 4096 Jul 28 18:55 ..
drwxr-xr-x   9 root root  137 Jul 28 18:28 VBoxGuestAdditions-5.0.0

Need info in readme or wiki about capistrano keys

The current scripts point to keys in Alicia's personal GitHub account - see
https://github.com/acozine/sufia-centos/blob/master/roles/housekeeping/tasks/users_groups_dirs.yml#L28-L31

- name: add keys for capistrano user
  authorized_key: user={{ capistrano_user }} key="{{ item }}" 
  with_items:
      - https://github.com/acozine.keys

Postgres Install Error

ANSIBLE OUTPUT

NOTIFIED: [services | remove postgres log dir if exists] ********************** 
ok: [default]

NOTIFIED: [services | initialize postgres db] ********************************* 
failed: [default] => {"changed": true, "cmd": ["postgresql-setup", "initdb"], "delta": "0:00:00.059702", "end": "2015-07-16 23:32:37.619059", "rc": 1, "start": "2015-07-16 23:32:37.559357", "warnings": []}
stdout: Initializing database ... failed, see /var/lib/pgsql/initdb.log

FATAL: all hosts have already failed -- aborting

/var/lib/pgsql/initdb.log

[vagrant@hydra-devbox ~]$ sudo cat /var/lib/pgsql/initdb.log
/usr/bin/postgresql-setup: line 133: runuser: command not found

/usr/bin/postgresql-setup

... lines 102-107
# For SELinux we need to use 'runuser' not 'su'
if [ -x /sbin/runuser ]; then
    SU=runuser
else
    SU=su
fi
... lines 129-144
    # Initialize the database
    initdbcmd="$PGENGINE/initdb --pgdata='$PGDATA' --auth='ident'"
    initdbcmd+=" $PGSETUP_INITDB_OPTIONS"

    $SU -l postgres -c "$initdbcmd" >> "$PGLOG" 2>&1 < /dev/null

    # Create directory for postmaster log files
    mkdir "$PGDATA/pg_log"
    chown postgres:postgres "$PGDATA/pg_log"
    chmod go-rwx "$PGDATA/pg_log"
    [ -x /sbin/restorecon ] && /sbin/restorecon "$PGDATA/pg_log"

    if [ -f "$PGDATA/PG_VERSION" ]; then
        return 0
    fi
    return 1

Note that lines 136-138 create a log directory under $PGDATA (/var/lib/pgsql/data), which is what caused the "directory already exists" error when we tried to re-run the provisioning script, even after uninstalling postgresql*.
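
On the runuser failure itself: the excerpt tests for /sbin/runuser but then invokes bare runuser, so one plausible reading (an inference, not confirmed in this issue) is that /sbin was not on the PATH of the process that ran postgresql-setup. The sketch below shows the same selection logic resolving to an absolute path instead; pick_su is a hypothetical helper used only for illustration.

```shell
#!/bin/sh
# pick_su CANDIDATE: print CANDIDATE (an absolute path) when it is executable,
# otherwise fall back to plain `su`. Keeping the absolute path means the later
# call does not depend on /sbin being on PATH.
pick_su() {
    if [ -x "$1" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' su
    fi
}

SU=$(pick_su /sbin/runuser)
echo "would run: $SU -l postgres -c <initdb command>"
```

A quick way to test the PATH theory on the box is to run `command -v runuser` in the same environment the Ansible handler uses.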

TASK: [ffmpeg | configure ffmpeg] ********************************************* 
failed: [hydra-head] => {"changed": true, "cmd": "cd /opt/install/ffmpeg_sources/ffmpeg && ./configure --extra-libs=\"-ldl\" --enable-gpl --enable-nonfree --enable-libfdk_aac --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264", "delta": "0:00:04.605410", "end": "2015-09-09 03:00:32.271602", "rc": 1, "start": "2015-09-09 03:00:27.666192", "warnings": []}
stdout: ERROR: opus not found using pkg-config

If you think configure made a mistake, make sure you are using the latest
version from Git.  If the latest version fails, report the problem to the
ffmpeg-user@ffmpeg.org mailing list or IRC #ffmpeg on irc.freenode.net.
Include the log file "config.log" produced by configure as this will help
solve the problem.

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
           to retry, use: --limit @/Users/mark/vanilla.retry

hydra-head                 : ok=144  changed=108  unreachable=0    failed=1

BUT THEN, this works fine

MARKs-MB:sufia-centos mark$ ansible-playbook vanilla.yml --start-at-task="configure ffmpeg"

PLAY [provision vanilla centos 7 host via ssh] ******************************** 

GATHERING FACTS *************************************************************** 
ok: [hydra-head]

TASK: [ffmpeg | configure ffmpeg] ********************************************* 
changed: [hydra-head]

TASK: [ffmpeg | make ffmpeg] ************************************************** 
... ETC.

get rid of ezid.yml

User and password are now in secrets.yml. Put host and port in application.yml. See curationexperts/alexandria-legacy#576 for more info.

device /dev/sdc not found

when I tried to vagrant up I got:

...
TASK: [vagrant | temporarily move Guest Additions] **************************** 
changed: [default]

TASK: [vagrant | format volume for /opt] ************************************** 
failed: [default] => {"failed": true}
msg: Device /dev/sdc not found.

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
           to retry, use: --limit @/Users/ilessing/vagrant.retry

default                    : ok=13   changed=11   unreachable=0    failed=1   

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

failed at: app-config add alex2 shared templates

see attached log for details but here are the last 4 tasks....

TASK: [app-config | add log rotation for catalina.out (tomcat/fedora/solr logs)] *** 
changed: [default]

TASK: [app-config | create apache vhosts file] ******************************** 
changed: [default]

TASK: [app-config | add alex2 shared config files] **************************** 
changed: [default] => (item=ezid.yml)
changed: [default] => (item=smtp.yml)
changed: [default] => (item=fedora.yml)

TASK: [app-config | add alex2 shared templates] ******************************* 
fatal: [default] => One or more undefined variables: 'item' is undefined

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
           to retry, use: --limit @/Users/ilessing/vagrant.retry

default                    : ok=87   changed=53   unreachable=1    failed=0

distinguish marmotta_host from server_name

I think I made a comment on a PR somewhere about this, but I can't find it at the moment.

Separate marmotta_host from server_name to allow flexibility in architecture - the marmotta host may not always run on the same box or vm as the rails server.