
sufia-centos's People

Contributors

dunn, jcoyne, mark-dce


sufia-centos's Issues

Build Error [services | download fits zip]: SSL certificate validation error for fits.googlecode.com download.

As DCE and WashU both use CentOS 7 environments, I'm posting a record of this error, a workaround, and a description of what a longer-term solution might require.


failed: [hydra-head] => {"failed": true}
msg: Failed to validate the SSL certificate for fits.googlecode.com:443. Make sure your managed systems have a valid CA certificate installed. If the website serving the url uses SNI you need python >= 2.7.9 on your managed machine. You can use validate_certs=False if you do not need to confirm the server's identity but this is unsafe and not recommended Paths checked for this platform: /etc/ssl/certs, /etc/pki/ca-trust/extracted/pem, /etc/pki/tls/certs, /usr/share/ca-certificates/cacert.org, /etc/ansible

FATAL: all hosts have already failed -- aborting


Here's a work-around:

[ansible]/roles/services/tasks/fits.yml:8
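The error message itself names validate_certs=False as the escape hatch, so a task along these lines is one way to get past the failure. This is only a sketch: the variable names fits_url and fits_download_dir are hypothetical and not taken from fits.yml, and skipping certificate validation is unsafe, as the message says.

```yaml
# Sketch only: fits_url and fits_download_dir are hypothetical variable names.
- name: download fits zip
  get_url:
    url: "{{ fits_url }}"
    dest: "{{ fits_download_dir }}"
    # Skips certificate verification; unsafe, per the error message above.
    validate_certs: no
```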

A more appropriate solution would be to upgrade CentOS 7's Python from 2.7.5 to >= 2.7.9, but 2.7.5 is the latest version available through Yum. Installing Python >= 2.7.9 locally (under /usr/local/bin) and changing symlinks or alternatives breaks Yum (and other things, from what I've read). Adding an alias for "python" to the .bashrc of the "install" user, so that "python" points to the /usr/local/bin installation, doesn't work either. The Ansible task uses "become: yes", which I believe means it runs in a root environment and so bypasses the "install" user's alias to the local Python 2.7.9 install.

format volume for /opt housekeeping failing again.

We built a brand new box using Mark's mark-dce/vagrant-centos.git centos 7 branch and we are again getting this error:

TASK: [vagrant | temporarily move Guest Additions] **************************** 
changed: [default]

TASK: [vagrant | format volume for /opt] ************************************** 
failed: [default] => {"err": "mkfs.xfs: cannot open /dev/sdc: Permission denied\n", "failed": true, "rc": 1}
msg: Creating filesystem xfs on device '/dev/sdc' failed

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
          to retry, use: --limit @/Users/nlaviola/vagrant.retry

default                    : ok=2    changed=1    unreachable=0    failed=1   

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

Let's talk and see if we can figure out how this has come back.

move alexandria-v2 role to ADRL repository

Since it's all ADRL-specific stuff anyway, we'd like to migrate the alexandria-v2 role to ansible/ in the ADRL repository. Our playbooks already live there, and the current workflow is to symlink the roles from this repository into that directory. Moving the role would keep that workflow the same while letting sufia-centos retain its generality; in the future we could add more roles to that directory if needed.

cc @ilessing @johnajao

use ansible multiline format instead of \ for line continuation

I find that this format:

module:
  first_parameter: string
  second_parameter: some_long_string_with_{{ a_variable }} 
  third_parameter: some_other_string

makes the scripts easier to read than this format:

module: first_parameter=string second_parameter=some_long_string_with_{{ a_variable }} \
third_parameter=some_other_string

Add a name line to every task

Name lines can be repetitive, but they help make the roles more readable. A few tasks throughout the repo lack one, e.g. the first task in roles/alexandria-v2/tasks/adrl-dm.yml. Find them all and add names.
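For illustration, a named task in the multi-line style might look like the sketch below. The task and path are hypothetical and not copied from adrl-dm.yml.

```yaml
# Hypothetical task, shown only to illustrate the name line.
- name: create the application log directory
  file:
    path: /var/log/example-app
    state: directory
```

The name is what shows up in the "TASK: [role | name]" lines of the run output, which also makes it usable with --start-at-task.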

Add notes about using ansible.start_at_task

I've gotten timeouts a few times when building a server - it would be good to show people how to restart at the failed step:

ansible.start_at_task = "ffmpeg | download libmp3lame source"

when you get something like

TASK: [ffmpeg | download libmp3lame source] *********************************** 
failed: [default] => {"dest": "/opt/install/ffmpeg_sources/lame-3.99.5.tar.gz", "failed": true, "response": "Connection failure: timed out", "state": "absent", "status_code": -1, "url": "http://downloads.sourceforge.net/project/lame/lame/3.99/lame-3.99.5.tar.gz"}
msg: Request failed

FATAL: all hosts have already failed -- aborting

Vagrant branch needs sudo for /opt format and mounting tasks

I'm trying to run the vagrant branch to set up a machine and it fails with

TASK: [vagrant | format volume for /opt] ************************************** 
failed: [default] => {"err": "mkfs.xfs: cannot open /dev/sdc: Permission denied\n", "failed": true, "rc": 1}
msg: Creating filesystem xfs on device '/dev/sdc' failed

FATAL: all hosts have already failed -- aborting

As far as I can tell, the "format volume for /opt" and "mount volume at /opt" tasks need to be run with elevated privileges, but I don't see that happening anywhere in vagrant_housekeeping.yml, in vagrant/main.yml, or in the vagrant role in vagrant.yml.
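A rough sketch of what elevated versions of the two tasks could look like, assuming the filesystem and mount modules are what the role uses (the "Creating filesystem xfs on device" message suggests the former) and an Ansible version that supports become. The device and mount point come from the log above; everything else is an assumption.

```yaml
# Sketch only: not the role's actual task definitions.
- name: format volume for /opt
  become: yes
  filesystem:
    fstype: xfs
    dev: /dev/sdc

- name: mount volume at /opt
  become: yes
  mount:
    path: /opt        # older Ansible releases call this parameter "name"
    src: /dev/sdc
    fstype: xfs
    state: mounted
```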

Should I not be running the vagrant branch?

Fedora sha check fails

The hydra-stack role now fails with:

msg: The SHA-256 checksum for /opt/install/fcrepo-webapp-4.5.0.war did not match b9781a38227b71bb2f61d09938f15d5621d57edcfb94db5e9524abcac6984b14; it was f9c778c4285dcbb7ab228f43e5e30ea7511a5553ca52a34e274e1e390cb5187e.

When I look at the downloads page https://github.com/fcrepo4/fcrepo4/releases I only see SHA1 and MD5 hashes.
@dunn - where did you get the SHA-256 value?

configure ffmpeg failed, ran fine on second try

running vagrant provision on single_user branch

Vagrantfile (excerpt)

  # Enable provisioning with Ansible
  config.vm.provision "ansible" do |ansible|
    # ansible.verbose = 'vvv'
    ansible.groups = {
      "vagrant" => ["default"],
      "all_groups:children" => ["group1"],
    }
    ansible.extra_vars = {
      deploy_user: "vagrant",
      server_name: "localhost",
      rails_env: "development",
      bundle_path: "~/.bundle"
    }
    ansible.playbook = "provisioning/vagrant-develop.yml"
  end

Got this error

TASK: [ffmpeg | configure ffmpeg] ********************************************* 
failed: [default] => {"changed": true, "cmd": "cd /opt/install/ffmpeg_sources/ffmpeg && ./configure --extra-libs=\"-ldl\" --enable-gpl --enable-nonfree --enable-libfdk_aac --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264", "delta": "0:00:04.612339", "end": "2015-11-13 17:56:09.910513", "rc": 1, "start": "2015-11-13 17:56:05.298174", "warnings": []}
stdout: ERROR: opus not found using pkg-config

If you think configure made a mistake, make sure you are using the latest
version from Git.  If the latest version fails, report the problem to the
ffmpeg-user@ffmpeg.org mailing list or IRC #ffmpeg on irc.freenode.net.
Include the log file "config.log" produced by configure as this will help
solve the problem.

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
           to retry, use: --limit @/Users/mark/vagrant-develop.retry

default                    : ok=129  changed=117  unreachable=0    failed=1   

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

added ansible.start_at_task = "ffmpeg | configure ffmpeg" and re-ran vagrant provision (no other changes)

MARKs-MB:wustl-dev mark$ vagrant provision
==> default: Running provisioner: ansible...

PLAY [provision basic vagrant box] ******************************************** 

GATHERING FACTS *************************************************************** 
ok: [default]

TASK: [ffmpeg | configure ffmpeg] ********************************************* 
changed: [default]

TASK: [ffmpeg | make ffmpeg] ************************************************** 

etc.

Add a placeholder folder for the jdk file

If you want to have folks put the jdk in ./roles/hydra-stack/files it might be nice to put a dummy directory there in the repo. I tried copying the file there from my downloads before realizing that I had to create the directory first.

refine jar removal for tomcat marmotta

The current implementation (Ansible waits for the file to appear, then removes it, then uses touch to create an empty file with the same name so future runs don't fail on the wait task) feels unwieldy. Explore other options - a straightforward wait delay, or . . . ?

Should some roles be moved to Ansible Galaxy?

Related to #59 and #77, and the discussion we had once with Scott about levels of generality, should some of the very generic roles be moved into separate repositories and included here using ansible-galaxy? Roles that this might make sense for would be, IMO:

  • ffmpeg
  • imagemagick
  • ruby
  • passenger
  • automount

And of course we might find existing roles on Galaxy that do a better job, in which case we could depend on those instead.
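If the roles were split out, pulling them back in could be done with a requirements file, roughly as sketched here. The repository URLs are placeholders, not real role locations.

```yaml
# requirements.yml (sketch; the URLs below are placeholders)
- src: https://github.com/example-org/ansible-role-ffmpeg
  scm: git
  name: ffmpeg

- src: https://github.com/example-org/ansible-role-imagemagick
  scm: git
  name: imagemagick
```

The roles would then be installed with ansible-galaxy install -r requirements.yml before running the playbooks.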

error at task: download ruby

I pulled the latest code from the repo's vagrant branch, attempted to vagrant up, and got a new error:

TASK: [vagrant | format volume for /opt] ************************************** 
changed: [default]

TASK: [vagrant | mount volume at /opt] **************************************** 
changed: [default]

TASK: [vagrant | restore Guest Additions] ************************************* 
changed: [default]

TASK: [ruby | remove package ruby] ******************************************** 
ok: [default]

TASK: [ruby | download ruby] ************************************************** 
failed: [default] => {"failed": true}
msg: Destination /opt/install not writable

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
           to retry, use: --limit @/Users/ilessing/vagrant.retry

default                    : ok=17   changed=13   unreachable=0    failed=1   

When I look at /opt in the VM, I see it is owned by root and there is no /opt/install directory:

$ ls -ld /opt
drwxr-xr-x 3 root root 37 Jul 28 18:57 /opt

$ ls -la /opt
total 4
drwxr-xr-x   3 root root   37 Jul 28 18:57 .
dr-xr-xr-x. 19 root root 4096 Jul 28 18:55 ..
drwxr-xr-x   9 root root  137 Jul 28 18:28 VBoxGuestAdditions-5.0.0
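
One possible fix is a task that creates /opt/install before anything downloads into it. This is a sketch, assuming the role should own the directory as the deploy user; deploy_user is borrowed from the extra_vars used elsewhere in this project and may not be what the ruby role expects.

```yaml
# Sketch only: owner and mode are assumptions.
- name: create /opt/install for downloaded sources
  become: yes
  file:
    path: /opt/install
    state: directory
    owner: "{{ deploy_user }}"
    mode: "0755"
```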

Postgres Install Error

ANSIBLE OUTPUT

NOTIFIED: [services | remove postgres log dir if exists] ********************** 
ok: [default]

NOTIFIED: [services | initialize postgres db] ********************************* 
failed: [default] => {"changed": true, "cmd": ["postgresql-setup", "initdb"], "delta": "0:00:00.059702", "end": "2015-07-16 23:32:37.619059", "rc": 1, "start": "2015-07-16 23:32:37.559357", "warnings": []}
stdout: Initializing database ... failed, see /var/lib/pgsql/initdb.log

FATAL: all hosts have already failed -- aborting

/var/lib/pgsql/initdb.log

[vagrant@hydra-devbox ~]$ sudo cat /var/lib/pgsql/initdb.log
/usr/bin/postgresql-setup: line 133: runuser: command not found

/usr/bin/postgresql-setup

... lines 102-107
# For SELinux we need to use 'runuser' not 'su'
if [ -x /sbin/runuser ]; then
    SU=runuser
else
    SU=su
fi
... lines 129-144
    # Initialize the database
    initdbcmd="$PGENGINE/initdb --pgdata='$PGDATA' --auth='ident'"
    initdbcmd+=" $PGSETUP_INITDB_OPTIONS"

    $SU -l postgres -c "$initdbcmd" >> "$PGLOG" 2>&1 < /dev/null

    # Create directory for postmaster log files
    mkdir "$PGDATA/pg_log"
    chown postgres:postgres "$PGDATA/pg_log"
    chmod go-rwx "$PGDATA/pg_log"
    [ -x /sbin/restorecon ] && /sbin/restorecon "$PGDATA/pg_log"

    if [ -f "$PGDATA/PG_VERSION" ]; then
        return 0
    fi
    return 1

Note that lines 136-138 create a log directory under $PGDATA ('/var/lib/pgsql/data'), which is what caused the "directory already exists" error when we tried to re-run the provisioning script, even after uninstalling postgresql*.

iptables resurgence on EC2

A recent restart of wustl.curationexperts.com caused iptables rules to pop back up. Add a role for setting iptables rules.

ffmpeg configure task fails on first run

On a vanilla CentOS server, I'm getting a failure on the "configure ffmpeg" task, but if you run the script again starting at that same step, it runs without error and the playbook finishes successfully.

TASK: [ffmpeg | configure ffmpeg] ********************************************* 
failed: [hydra-head] => {"changed": true, "cmd": "cd /opt/install/ffmpeg_sources/ffmpeg && ./configure --extra-libs=\"-ldl\" --enable-gpl --enable-nonfree --enable-libfdk_aac --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264", "delta": "0:00:04.605410", "end": "2015-09-09 03:00:32.271602", "rc": 1, "start": "2015-09-09 03:00:27.666192", "warnings": []}
stdout: ERROR: opus not found using pkg-config

If you think configure made a mistake, make sure you are using the latest
version from Git.  If the latest version fails, report the problem to the
ffmpeg-user@ffmpeg.org mailing list or IRC #ffmpeg on irc.freenode.net.
Include the log file "config.log" produced by configure as this will help
solve the problem.

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
           to retry, use: --limit @/Users/mark/vanilla.retry

hydra-head                 : ok=144  changed=108  unreachable=0    failed=1   

BUT THEN, this works fine

MARKs-MB:sufia-centos mark$ ansible-playbook vanilla.yml --start-at-task="configure ffmpeg"

PLAY [provision vanilla centos 7 host via ssh] ******************************** 

GATHERING FACTS *************************************************************** 
ok: [hydra-head]

TASK: [ffmpeg | configure ffmpeg] ********************************************* 
changed: [hydra-head]

TASK: [ffmpeg | make ffmpeg] ************************************************** 
... ETC.

device /dev/sdc not found

when I tried to vagrant up I got:

...
TASK: [vagrant | temporarily move Guest Additions] **************************** 
changed: [default]

TASK: [vagrant | format volume for /opt] ************************************** 
failed: [default] => {"failed": true}
msg: Device /dev/sdc not found.

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
           to retry, use: --limit @/Users/ilessing/vagrant.retry

default                    : ok=13   changed=11   unreachable=0    failed=1   

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

failed at: app-config add alex2 shared templates

see attached log for details but here are the last 4 tasks....

TASK: [app-config | add log rotation for catalina.out (tomcat/fedora/solr logs)] *** 
changed: [default]

TASK: [app-config | create apache vhosts file] ******************************** 
changed: [default]

TASK: [app-config | add alex2 shared config files] **************************** 
changed: [default] => (item=ezid.yml)
changed: [default] => (item=smtp.yml)
changed: [default] => (item=fedora.yml)

TASK: [app-config | add alex2 shared templates] ******************************* 
fatal: [default] => One or more undefined variables: 'item' is undefined

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
           to retry, use: --limit @/Users/ilessing/vagrant.retry

default                    : ok=87   changed=53   unreachable=1    failed=0   
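
An "'item' is undefined" error usually means a task references {{ item }} without a loop attached to it, for example when with_items is missing or indented under the module arguments. A sketch of the expected shape, with hypothetical file names and paths (the real task in app-config is not shown here):

```yaml
# Sketch only: template names and destination path are hypothetical.
- name: add alex2 shared templates
  template:
    src: "{{ item }}.j2"
    dest: "/opt/example/shared/config/{{ item }}"
  # with_items sits at the task level, alongside name and template.
  with_items:
    - example.yml
```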

distinguish marmotta_host from server_name

I think I made a comment on a PR somewhere about this, but I can't find it at the moment.

Separate marmotta_host from server_name to allow flexibility in architecture - the marmotta host may not always run on the same box or vm as the rails server.
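
One way to do this without breaking single-box setups would be a default that falls back to server_name, as in this sketch. Where the default lives is an assumption.

```yaml
# Sketch of a defaults file; the file location is an assumption.
# Fall back to the Rails host unless a separate Marmotta host is given.
marmotta_host: "{{ server_name }}"
```

A deployment with Marmotta on its own machine would then override marmotta_host in its inventory or extra vars.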
