ansible-archivematica-src's People

Contributors

aelkiss, amayita, hakamine, helenst, hwesta, jhsimpson, jraddaoui, jrwdunham, mamedin, misilot, mistydemeo, r-khera, rayzilt, replaceafill, scollazo, sevein

ansible-archivematica-src's Issues

"Destination directory /etc/archivematica does not exist" error

TASK [archivematica-src : Template gunicorn configuration file] ****************
fatal: [am-local]: FAILED! => {"changed": true, "failed": true, "msg": "Destination directory /etc/archivematica does not exist"}

This was introduced in #73. /etc/archivematica needs to be created when deploying the Storage Service (until now it was created only when deploying the pipeline).
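
A minimal sketch of one possible fix, assuming the directory only has to exist before the gunicorn template task runs. The owner, group, and mode are assumptions, not values taken from the role.

```yaml
# Hypothetical task: create /etc/archivematica before the
# "Template gunicorn configuration file" task of the Storage Service.
- name: "Ensure /etc/archivematica exists"
  file:
    path: "/etc/archivematica"
    state: "directory"
    owner: "root"   # assumption
    group: "root"   # assumption
    mode: "0755"    # assumption
```

Placing it in the tasks shared by the pipeline and the Storage Service would cover both kinds of deployment.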

In dev, when released use gunicorn's --reload + inotify (v19.7.0)

Note that gunicorn v19.7.0 has not been released yet but it will soon.

gunicorn master now supports code reloading using inotify instead of filesystem polling.

This is going to help us save so much energy on our laptops! Currently, we're deploying a couple of gunicorn apps with four workers each (so many workers are probably unneeded in development). Each worker polls the filesystem every second, generating a stat() syscall for each file in the source code. After a quick look with strace I see about 600 system calls for each poll.

More details here: https://github.com/benoitc/gunicorn/blob/master/docs/source/settings.rst#L255-L277. We'll only need to add the inotify package to our pip dependencies when we're in dev mode. That's a change we can probably make in the sources of the AM and SS repos, but I thought this was the best place to keep track of it in case it involves configuration changes down the road.
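
If the change ends up in this role instead, a sketch could look like the following. The variable `archivematica_src_dev_mode` is hypothetical, and the virtualenv path is the Storage Service one used elsewhere in this tracker.

```yaml
# Hypothetical task: install the inotify package only in development,
# so that gunicorn's --reload can use inotify instead of polling
# (gunicorn 19.7+ also has a --reload-engine setting to force the engine).
- name: "Install inotify for gunicorn --reload (development only)"
  pip:
    name: "inotify"
    virtualenv: "/usr/share/python/archivematica-storage-service"
  when: "archivematica_src_dev_mode | default(false) | bool"   # hypothetical variable
```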

Use packages.archivematica.org for dependencies

Currently this role adds ppa:archivematica/externals-dev as a source; packages.archivematica.org should be added as well. There should probably also be an option to use ppa:archivematica/externals instead of externals-dev when doing a production install.

The externals-dev ppa shows a warning on launchpad, and it appears that it is not working properly. I noticed that in one test, I got siegfried 1.1.0 installed, when I should have been able to get 1.4.5 from externals-dev. packages.archivematica.org/1.5.x has siegfried 1.5.0, which is the version that should be installed with Archivematica 1.5.0 and greater.
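
A rough sketch of the PPA switch, assuming a boolean role variable (here called `archivematica_src_production_install`, a hypothetical name) selects between the two PPAs named above. It does not cover the packages.archivematica.org source.

```yaml
# Hypothetical task: pick the externals PPA based on the kind of install.
- name: "Add Archivematica externals PPA"
  apt_repository:
    repo: "{{ 'ppa:archivematica/externals' if (archivematica_src_production_install | default(false) | bool) else 'ppa:archivematica/externals-dev' }}"
    state: "present"
```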

Broken tasks: pip install fails if lib/ is missing

Two pip tasks use --find-links lib, which breaks if lib/ is missing. This is now occurring in stable/0.9.x and qa/0.x.

- name: "Create virtualenv for archivematica-storage-service, pip install requirements"
  pip:
    chdir: "{{ archivematica_src_dir }}/archivematica-storage-service"
    requirements: "requirements.txt"
    virtualenv: "/usr/share/python/archivematica-storage-service"
    state: "latest"
  tags: "amsrc-ss-pydep"

- name: "Work around to install pip deps commented out in old SS branches"
  pip:
    chdir: "{{ archivematica_src_dir }}/archivematica-storage-service"
    virtualenv: "/usr/share/python/archivematica-storage-service"
    extra_args: "--find-links lib"
    state: "latest"
    name: "{{ item }}"
  with_items:
    - "python-swiftclient"
    - "python-keystoneclient"
    - "sword2"
    - "pyopenssl"
    - "ndg-httpsclient"
    - "pyasn1"
  when: "archivematica_src_ss_pip_missing_deps"
  tags: "amsrc-ss-pydep"

This is the solution I have tested:

- stat:
    path: "{{ archivematica_src_dir }}/archivematica-storage-service/lib"
  register: "ss_lib_dir_check"
- set_fact:
    ss_pip_install_extra_args: ""
- set_fact:
    ss_pip_install_extra_args: "--find-links lib"
  when: "ss_lib_dir_check.stat.isdir is defined and ss_lib_dir_check.stat.isdir"

- name: "Create virtualenv for archivematica-storage-service, pip install requirements"
  pip:
    chdir: "{{ archivematica_src_dir }}/archivematica-storage-service"
    requirements: "requirements.txt"
    virtualenv: "/usr/share/python/archivematica-storage-service"
    extra_args: "{{ ss_pip_install_extra_args }}"
    state: "latest"
  tags: "amsrc-ss-pydep"

- name: "Work around to install pip deps commented out in old SS branches"
  pip:
    chdir: "{{ archivematica_src_dir }}/archivematica-storage-service"
    virtualenv: "/usr/share/python/archivematica-storage-service"
    extra_args: "{{ ss_pip_install_extra_args }}"
    state: "latest"
    name: "{{ item }}"
  with_items:
    - "python-swiftclient"
    - "python-keystoneclient"
    - "sword2"
    - "pyopenssl"
    - "ndg-httpsclient"
    - "pyasn1"
  when: "archivematica_src_ss_pip_missing_deps"
  tags: "amsrc-ss-pydep"

storage.db owner issues

storage.db file ownership is sometimes not correctly set (root:root vs archivematica:archivematica), producing nginx/uwsgi errors.
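
One way to enforce the ownership on every run, as a sketch only. The variable `storage_db_path` is hypothetical and stands for wherever the deployment keeps storage.db.

```yaml
# Hypothetical task: force the expected owner and group on storage.db.
- name: "Ensure storage.db is owned by archivematica"
  file:
    path: "{{ storage_db_path }}"   # hypothetical variable
    state: "file"
    owner: "archivematica"
    group: "archivematica"
```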

error: jinja2 filters not found when using ansible 2.4.0

Having the following error when executing a task in common.yml (used stable/1.6.x branch of the role but should also occur in qa/1.7.x):

TASK [external-roles/artefactual.archivematica-src : Expand archivematica_src_dir] **********************************
task path: /media/sf_vbox-ubuntu14/repos/gitlab.artefactual/ops-deployment/envs/ccarchitecture/external-roles/artefactual.archivematica-src/tasks/common.yml:65
fatal: [VSP-AMSS-01]: FAILED! => {
    "failed": true, 
    "msg": "template error while templating string: no filter named 'expanduser'. String: {{ archivematica_src_dir|expanduser }}"
}

Running ansible from a virtualenv in ubuntu 16, installed using pip. Versions:

$ pip list --format=columns
Package       Version
------------- -------
ansible       2.4.0.0
asn1crypto    0.23.0 
bcrypt        3.1.4  
cffi          1.11.2 
cryptography  2.1.1  
enum34        1.1.6  
idna          2.6    
ipaddress     1.0.18 
Jinja2        2.9.6  
MarkupSafe    1.0    
paramiko      2.3.1  
pip           9.0.1  
pkg-resources 0.0.0  
pyasn1        0.3.7  
pycparser     2.18   
PyNaCl        1.1.2  
PyYAML        3.12   
setuptools    36.6.0 
six           1.11.0 
wheel         0.30.0 

Problem: Running new npm tasks fails using qa/1.7.x

When trying to deploy the new appraisal tab and transfer browser features I found an issue when running the npm tasks. It was trying to run the npm tasks as user root, using /root/.npm directory.

I could run these tasks as the archivematica user, although I had to change the owner of /opt/archivematica beforehand.

The following code was used in the tasks/pipeline-instcode.yml file:

#
# front-end
#

#npm needs to be running as user archivematica
#The owner of archivematica_src_dir has to be changed to archivematica
- name: "Change archivematica-source owner to archivematica"
  file:
    dest: "{{ archivematica_src_dir }}"
    state: "directory"
    owner: "archivematica"
    group: "archivematica"
    recurse: "yes"

- name: "Install front-end dependencies"
  become: "yes"
  become_user: "archivematica"
  npm:
    path: "{{ item }}"
    state: "present"
  with_items:
    - "{{ archivematica_src_dir }}/archivematica/src/dashboard/frontend/appraisal-tab"
    - "{{ archivematica_src_dir }}/archivematica/src/dashboard/frontend/transfer-browser"
  when:
    - "ansible_env.USER != 'vagrant'"

I don't know if it is better to change the archivematica_src_dir owner at another point of the deployment.

To branch or not to branch?

This is a question related to #104, which is a big PR that introduces several changes to the role. Would it be a good thing to have multiple stable branches in this repository, like the archivematica repository? For example, creating a stable/1.5.x and a stable/1.6.x branch, so that the ansible role is paired to the target archivematica branch.
The reason for this is that we may be reaching a point where it may no longer be possible or practical to assure that changes introduced in the role will be backward compatible with older branches of archivematica.
(I found this approach is used for example in https://github.com/elastic/ansible-elasticsearch, there is a stable/2.x branch to deploy ES 2.x and a stable/5.x branch to deploy ES 5.x)

Convert this repo into normal instead of fork

I wonder if we can ask GitHub to convert this repo into a normal repository instead of a fork (hakamine's repo is the parent). That would allow us to search the code, which is currently not possible because forks are not searchable.

Copying storage service files fails when rerunning scripts

After the changes in a51f1a8, it's no longer possible to rerun this on a machine that's already been deployed. This occurs when trying to copy storage.ini, like so:

TASK: [archivematica-src | copy archivematica-storage-service source files] ***
failed: [am-local] => (item={'dest': '/etc/uwsgi/apps-available/storage.ini', 'src': '/srv/archivematica-storage-service/install/storage.ini'}) => {"failed": true, "gid": 0, "group": "root", "item": {"dest": "/etc/uwsgi/apps-available/storage.ini", "src": "/srv/archivematica-storage-service/install/storage.ini"}, "mode": "0644", "owner": "root", "path": "/etc/uwsgi/apps-available/storage.ini", "size": 969, "state": "file", "uid": 0}
msg: refusing to convert between file and link for /etc/uwsgi/apps-available/storage.ini
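
The message suggests the task now tries to create a symlink where an earlier run left a regular file. Under that assumption, a sketch of a workaround is to let the file module replace the existing file; the paths are the ones from the log above.

```yaml
# Sketch: with state=link, force=yes lets Ansible replace an existing
# regular file at dest with the symlink instead of refusing.
- name: "Link storage.ini into uwsgi apps-available"
  file:
    src: "/srv/archivematica-storage-service/install/storage.ini"
    dest: "/etc/uwsgi/apps-available/storage.ini"
    state: "link"
    force: "yes"
```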

npm installation failure when trying to install appraisal tab

The npm package provided by default by Ubuntu trusty has broken dependencies and can't be installed:

# apt-get install npm
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 npm : Depends: nodejs but it is not going to be installed
       Depends: node-abbrev (>= 1.0.4) but it is not going to be installed
       Depends: node-ansi but it is not going to be installed
       Depends: node-archy but it is not going to be installed
       Depends: node-block-stream but it is not going to be installed
       Depends: node-fstream (>= 0.1.22) but it is not going to be installed
       Depends: node-fstream-ignore but it is not going to be installed
       Depends: node-github-url-from-git but it is not going to be installed
       Depends: node-glob (>= 3.1.21) but it is not going to be installed
       Depends: node-graceful-fs (>= 2.0.0) but it is not going to be installed
       Depends: node-inherits but it is not going to be installed
       Depends: node-ini (>= 1.1.0) but it is not going to be installed
       Depends: node-lockfile but it is not going to be installed
       Depends: node-lru-cache (>= 2.3.0) but it is not going to be installed
       Depends: node-minimatch (>= 0.2.11) but it is not going to be installed
       Depends: node-mkdirp (>= 0.3.3) but it is not going to be installed
       Depends: node-gyp (>= 0.10.9) but it is not going to be installed
       Depends: node-nopt (>= 2.1.1) but it is not going to be installed
       Depends: node-npmlog but it is not going to be installed
       Depends: node-once but it is not going to be installed
       Depends: node-osenv but it is not going to be installed
       Depends: node-read but it is not going to be installed
       Depends: node-read-package-json (>= 1.1.0) but it is not going to be installed
       Depends: node-request (>= 2.25.0) but it is not going to be installed
       Depends: node-retry but it is not going to be installed
       Depends: node-rimraf (>= 2.2.2) but it is not going to be installed
       Depends: node-semver (>= 2.1.0) but it is not going to be installed
       Depends: node-sha but it is not going to be installed
       Depends: node-slide but it is not going to be installed
       Depends: node-tar (>= 0.1.18) but it is not going to be installed
       Depends: node-which but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

I suggest updating the role to install npm using the recommended procedure at https://nodejs.org/en/download/package-manager/#debian-and-ubuntu-based-linux-distributions instead.
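
A sketch of what that could look like in the role, assuming the NodeSource apt repository is the procedure meant by that page. The variable `nodejs_major` is hypothetical; the issue does not say which Node.js release line to use.

```yaml
# Hypothetical tasks: install nodejs (which bundles npm) from NodeSource
# instead of the distribution's npm package.
- name: "Install HTTPS transport for apt"
  apt:
    name: "apt-transport-https"
    state: "present"

- name: "Add NodeSource signing key"
  apt_key:
    url: "https://deb.nodesource.com/gpgkey/nodesource.gpg.key"
    state: "present"

- name: "Add NodeSource repository"
  apt_repository:
    # nodejs_major is a hypothetical variable, e.g. the major version number
    repo: "deb https://deb.nodesource.com/node_{{ nodejs_major }}.x {{ ansible_distribution_release }} main"
    state: "present"

- name: "Install nodejs"
  apt:
    name: "nodejs"
    state: "present"
    update_cache: "yes"
```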

File tasks break when archivematica_src_install_ss set to false

In dashboard.conf (Nginx) /media should serve contents inside /static

I think that we should run collectstatic in the dashboard to put all the assets under /static.

This is basically the change I am suggesting:

   location /media {
-    alias /usr/share/archivematica/dashboard/media;
+    alias /usr/share/archivematica/dashboard/static;
   }

The problem with making this change is that we'd need an extra location for development environments pointing to /usr/share/archivematica/dashboard/media. That method, however, has some disadvantages, such as not being able to locate assets in more than one subdirectory (Django apps). Ideally, we would serve the static assets from Django during development, but that would require a couple of changes in the code. This is the approach I'd prefer, but I'm not sure it's feasible to make such a change across all our active code branches.

Basically we would be making two changes:

  1. Add STATIC_URL = "/static/" to settings/local.py and,
  2. Add a set of extra urls to urls.py only when settings.DEBUG = True

Log configuration and file missing for mcp-client and server

In the qa/1.7 branch, log files are no longer written to:
/var/log/archivematica/MCPServer/
/var/log/archivematica/MCPClient/

For the dashboard this could be fixed via:
archivematica_src_am_dashboard_environment:
  SS_GUNICORN_ACCESSLOG: "/var/log/archivematica/storage-service/gunicorn.access_log"
  SS_GUNICORN_ERRORLOG: "/var/log/archivematica/storage-service/gunicorn.error_log"
  SS_GUNICORN_LOGLEVEL: "info"

But how to proceed for the MCP server and client?

gunicorn config: add option to disable sendfile

Have experienced errors when downloading big SIPs from the Storage Service when sendfile is enabled in gunicorn (default behaviour). Adding the --no-sendfile option when running Storage Service's gunicorn fixes it.
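
Until the role has a dedicated option, one possible sketch is to pass the flag through the `GUNICORN_CMD_ARGS` environment variable, which gunicorn reads from version 19.7 on. The variable name `archivematica_src_ss_environment` is an assumption modelled on the dashboard environment variable used elsewhere in this tracker, and it only helps if the role exports it to the Storage Service process.

```yaml
# Hypothetical vars entry: extra gunicorn command-line arguments for the
# Storage Service (requires gunicorn >= 19.7 for GUNICORN_CMD_ARGS).
archivematica_src_ss_environment:   # assumed variable name
  GUNICORN_CMD_ARGS: "--no-sendfile"
```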

allow mysql password to be reset for archivematica user

Archivematica connects to mysql using credentials stored in /etc/archivematica/archivematicaCommon/dbsettings .

This ansible role should have a way to change the username and/or password used, and update the config file used by archivematica.
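
A sketch of how this could work, using the mysql_user and ini_file modules. The variables `am_db_user`, `am_db_password`, and `am_db_name` are hypothetical, and the `client` section name is an assumption about the layout of the dbsettings file.

```yaml
# Hypothetical tasks: set the database credentials and keep the
# Archivematica config file in sync with them.
- name: "Set credentials for the Archivematica MySQL user"
  mysql_user:
    name: "{{ am_db_user }}"
    password: "{{ am_db_password }}"
    priv: "{{ am_db_name }}.*:ALL"
    state: "present"

- name: "Write the same credentials to dbsettings"
  ini_file:
    path: "/etc/archivematica/archivematicaCommon/dbsettings"
    section: "client"   # assumption about the file layout
    option: "{{ item.option }}"
    value: "{{ item.value }}"
  with_items:
    - { option: "user", value: "{{ am_db_user }}" }
    - { option: "password", value: "{{ am_db_password }}" }
```

The Archivematica services would need a restart afterwards to pick up the new credentials.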

service files are not installed in ubuntu trusty

Using ansible 2.4.1 and branch stable/1.6.x, init scripts are not installed.

This is because the condition "ansible_service_manager = upstart" evaluates to false: ansible reports "ansible_service_mgr": "service".
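
A possible workaround, sketched under the assumption that Ubuntu trusty should always get the upstart jobs regardless of what the fact reports. The fact name `am_use_upstart` is hypothetical.

```yaml
# Hypothetical task: fall back to the distribution release when the
# service manager fact is not reported as "upstart".
- name: "Decide whether to install upstart jobs"
  set_fact:
    am_use_upstart: "{{ ansible_service_mgr == 'upstart' or (ansible_distribution == 'Ubuntu' and ansible_distribution_release == 'trusty') }}"
```

Tasks that install the init scripts could then use `when: "am_use_upstart | bool"`.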

Archivematica source checkout hangs forever if target path already exists

If the archivematica_src_dir variable points to a location outside the vagrant VM, for example, /vagrant/src, and that location already exists and already has an Archivematica source checkout, Vagrant will hang forever while performing the checkout. This may only happen if the checkout in /vagrant/src has local changes.

change source of os dependencies based on am version

https://github.com/artefactual-labs/ansible-role-archivematica-src/blob/master/tasks/tear-up.yml#L17

In that line, the externals-dev ppa is added to the list of ubuntu trusted repos.

For qa releases, this is good, but for stable/1.4.x AM, the 1.4 ppa should be added instead.

For stable/1.5.x, the packages.archivematica.org repo for 1.5 should be added, instead of a ppa.

This line should probably change its behaviour depending on which version of AM is being installed.

Problem: no environment file installed with the role

In the rpm and deb packages, a /etc/default/archivematica- file is installed, with the environment vars needed by archivematica to boot.

We should do the same in the ansible role, using /etc/default/ for Debian-based systems and /etc/sysconfig for RHEL/CentOS.
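
A sketch of such a task for one service. The template name is hypothetical, and the file name archivematica-mcp-client is taken from the clamav issue in this tracker.

```yaml
# Hypothetical task: install the environment file in the directory that
# matches the OS family.
- name: "Install environment file for the MCPClient"
  template:
    src: "archivematica-mcp-client.env.j2"   # hypothetical template
    dest: "{{ '/etc/sysconfig' if ansible_os_family == 'RedHat' else '/etc/default' }}/archivematica-mcp-client"
    owner: "root"
    group: "root"
    mode: "0644"
```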

update installed mediainfo versions

The versions of mediainfo provided by the default Ubuntu repositories are old. Instead, install updated versions from ppa:djcj/mediainfo (which provides newer versions) or download packages from https://mediaarea.net/en/MediaInfo/Download/Ubuntu (which has the latest). In either case, be sure to install not only the mediainfo package but also the required dependencies libmediainfo0 and libzen0 (failing to do so causes mediainfo errors).
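
The PPA option could be sketched as follows; it uses only the PPA and package names mentioned above.

```yaml
# Sketch: add the PPA, then install mediainfo together with the
# libraries it depends on.
- name: "Add mediainfo PPA"
  apt_repository:
    repo: "ppa:djcj/mediainfo"
    state: "present"

- name: "Install mediainfo and its libraries"
  apt:
    name: "{{ item }}"
    state: "latest"
    update_cache: "yes"
  with_items:
    - "mediainfo"
    - "libmediainfo0"
    - "libzen0"
```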

Problem: no env vars for clamav config

On latest AM qa/1.x there are some settings for clamav:

ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_SERVER
ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_PASS_BY_REFERENCE
ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_TIMEOUT
ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_CLIENT_BACKEND
ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_MAX_FILE_SIZE
ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_MAX_SCAN_SIZE

These variables should be added to the templates so that they can be set from the vars.yml file.

For example, for RH/CentOS, this could be added to the /etc/sysconfig/archivematica-mcp-client template.
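
As an illustration, the vars.yml side might look like this. The variable name `archivematica_src_am_mcpclient_environment` is an assumption modelled on the dashboard environment variable, and every value is a placeholder rather than a recommended setting.

```yaml
# Hypothetical vars.yml entry; all values below are placeholders.
archivematica_src_am_mcpclient_environment:   # assumed variable name
  ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_SERVER: "/var/run/clamav/clamd.ctl"
  ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_PASS_BY_REFERENCE: "False"
  ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_TIMEOUT: "86400"
  ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_CLIENT_BACKEND: "clamdscanner"
  ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_MAX_FILE_SIZE: "42"
  ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CLAMAV_MAX_SCAN_SIZE: "42"
```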

gunicorn: use async workers

In worker_class, use any of the following options: eventlet, gevent, tornado, gthread. Not considering asyncio because it's Py3 only.
I think it's preferable to use one of the first two options for various reasons. Between the two, I'm inclined to prefer eventlet because it carries simpler dependencies (eventlet). Ultimately the user could decide which worker class to use.

The reasons for moving to async workers are obvious. One example that makes the underlying issue easier to understand: say you deploy SS with two workers. If one of the workers is busy serving a file to the user, you have just compromised 50% of the availability of the server. If you deployed the application with a single worker, it becomes unavailable to all other users. In general, our applications (dashboard and SS) should be able to make use of these worker classes with no changes, but some testing is recommended.
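
On the role side, a sketch of the dependency step for the Storage Service could be the task below. It only installs the package; the gunicorn configuration would still need worker_class set to "eventlet".

```yaml
# Sketch: install eventlet into the Storage Service virtualenv so that
# gunicorn can load the eventlet worker class.
- name: "Install eventlet for gunicorn async workers"
  pip:
    name: "eventlet"
    virtualenv: "/usr/share/python/archivematica-storage-service"
```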

problem: qa/1.7.x branch error on role dependency (geerlingguy.nodejs)

Getting an error like this when the role dependency is executing:

TASK [geerlingguy.nodejs : Create npm global directory] ********************************************************
...
    "mode": "0755",
    "msg": "chown failed: failed to look up user hector",
    "owner": "root",
    "path": "/usr/local/lib/npm",
    "size": 4096,
    "state": "directory",
    "uid": 0
}
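
One plausible cause is that geerlingguy.nodejs derives the npm user from the connecting or local user name, which may not exist on the target host. If so, a sketch of a workaround is to override the role's `nodejs_install_npm_user` variable with a user that does exist there; "root" below is only an example.

```yaml
# Sketch: point geerlingguy.nodejs at a user that exists on the target.
nodejs_install_npm_user: "root"   # example value
```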

Problem: clean install fails using stable/1.6.x

When using the stable/1.6.x ansible branch, I get this error:
virtualbox-ovf: TASK [archivematica-src : include variables from retrieved dependencies files in namespace storage_service] ***
virtualbox-ovf: fatal: [default]: FAILED! => {"failed": true, "msg": "No filename was specified to include.\n\nThe error appears to have been in '/home/username/workspace/deployment/packer/templates/vagrant-box-archivematica/provisioning/roles/archivematica-src/tasks/ss-osdeps.yml': line 35, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n# create a namespace for osdeps file variables\n- name: \"include variables from retrieved dependencies files in namespace storage_service\"\n ^ here\n"}

AM branch is stable/1.6.x and ss is stable/0.10.x. I'm using ansible 2.1.4.0

Problem: Upgrading CentOS using latest qa/1.7.x

I found this error when upgrading CentOS using qa/1.7.x, commit faefe63:

fatal: [arch01]: FAILED! => {"changed": false, "cmd": "./manage.py collectstatic --noinput --pythonpath=/usr/lib/archivematica/archivematicaCommon", "failed": true, "msg": "\n:stderr: Traceback (most recent call last):\n  File \"./manage.py\", line 10, in <module>\n    execute_from_command_line(sys.argv)\n  File \"/usr/share/python/archivematica-dashboard/lib/python2.7/site-packages/django/core/management/__init__.py\", line 354, in execute_from_command_line\n    utility.execute()\n  File \"/usr/share/python/archivematica-dashboard/lib/python2.7/site-packages/django/core/management/__init__.py\", line 303, in execute\n    settings.INSTALLED_APPS\n  File \"/usr/share/python/archivematica-dashboard/lib/python2.7/site-packages/django/conf/__init__.py\", line 48, in __getattr__\n    self._setup(name)\n  File \"/usr/share/python/archivematica-dashboard/lib/python2.7/site-packages/django/conf/__init__.py\", line 44, in _setup\n    self._wrapped = Settings(settings_module)\n  File \"/usr/share/python/archivematica-dashboard/lib/python2.7/site-packages/django/conf/__init__.py\", line 92, in __init__\n    mod = importlib.import_module(self.SETTINGS_MODULE)\n  File \"/usr/lib64/python2.7/importlib/__init__.py\", line 37, in import_module\n    __import__(name)\nImportError: No module named production\n", "path": "/usr/share/python/archivematica-dashboard/bin:/sbin:/bin:/usr/sbin:/usr/bin", "state": "absent", "syspath": ["/tmp/ansible_jtQwyY", "/tmp/ansible_jtQwyY/ansible_modlib.zip", "/tmp/ansible_jtQwyY/ansible_modlib.zip", "/usr/lib64/python27.zip", "/usr/lib64/python2.7", "/usr/lib64/python2.7/plat-linux2", "/usr/lib64/python2.7/lib-tk", "/usr/lib64/python2.7/lib-old", "/usr/lib64/python2.7/lib-dynload", "/usr/lib64/python2.7/site-packages", "/usr/lib/python2.7/site-packages"]}
    to retry, use: --limit @/home/maml/artefactual/deploymentlast4/deployment/envs/denver/arch01.retry

PLAY RECAP *********************************************************************
arch01                     : ok=73   changed=9    unreachable=0    failed=1

The deploy worked using the commit 79f8a4cab2f0ae53fd5698b739a0338d6b3deb04
