E-ARK Python Information Package Validator

Core package and command line utility for E-ARK Information Package validation.

The validation core component implements validation rules defined by E-ARK specifications which can be found on the website of the Digital Information LifeCycle Interoperability Standards Board (DILCIS Board):

https://dilcis.eu/specifications/

Quick Start

Pre-requisites

You must be running either a Debian/Ubuntu Linux distribution or Windows Subsystem for Linux on Windows to follow these commands.

If you are running a different Linux distribution, replace the apt commands with the equivalents for your package manager.

To get Windows Subsystem for Linux up and running, please follow the guide "Installing Windows Subsystem for Linux (WSL)" further down and then come back to this step.

Getting up and running with the E-ARK Python Information Package Validator

Setting up the environment

It is recommended that you create a directory for your E-ARK work. Create it with the following command:

mkdir EARK

To enter the directory use the following command:

cd EARK/

To retrieve the source code from GitHub use the following command:

git clone https://github.com/E-ARK-Software/eark-validator.git

To enter the new directory containing the source code do:

cd eark-validator/

It is recommended that you create a virtual environment for Python. By doing that you avoid "polluting" the host operating system with dynamically fetched dependencies and at the same time it creates a reproducible environment for your validator.

To create a virtual environment we need virtualenv (not to be confused with the venv package). We also need python3-pip to handle our Python packages, so install that first by issuing the following command:

sudo apt install python3-pip

It will list a number of dependencies. Confirm that you wish to install python3-pip by pressing Y followed by ENTER.

Now we can install the virtual environment with the following command:

sudo apt install python3-virtualenv

It will list a number of dependencies. Confirm that you wish to install python3-virtualenv by pressing Y followed by ENTER.

Finally we will need unzip. Install that by doing:

sudo apt install unzip

It may list a number of dependencies. Confirm that you wish to install unzip by pressing Y followed by ENTER.
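
Before moving on, it can help to confirm that the tools installed above are actually on the PATH. The following is a minimal sketch rather than part of the validator: the function name missing_tools and the tool list are assumptions drawn from the commands used in this guide (git, pip3, virtualenv, unzip, wget), and the executable names may differ on your distribution.

```python
import shutil

# Commands this guide relies on. The list is an assumption based on the
# steps in this README, not something the validator itself defines.
REQUIRED_TOOLS = ["git", "pip3", "virtualenv", "unzip", "wget"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

Any tool reported as missing can be installed with sudo apt install followed by the relevant package name.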

Installing the application

Set up a local virtual environment by issuing the following commands (one line at a time):

virtualenv -p python3 venv
source venv/bin/activate
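
If you are unsure whether the environment was activated, the interpreter itself can tell you. The sketch below is illustrative only; the function name in_virtual_environment is hypothetical. It relies on the fact that current virtualenv and venv environments make sys.prefix differ from sys.base_prefix, while legacy virtualenv releases set sys.real_prefix instead.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a virtual environment."""
    # Legacy virtualenv releases set sys.real_prefix; venv and current
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtual_environment())
```

Run it with the python command of the shell in which you issued source venv/bin/activate.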

Update pip to ensure you have the latest version, then install the validator and all the packages it requires:

pip install -U pip
pip install .

You are now able to run the application "ip-check". It will validate an Information Package for you.

Testing a valid package

You can test a valid package by first retrieving it from the test corpus:

wget https://github.com/DILCISBoard/eark-ip-test-corpus/raw/integration/corpora/csip/metadata/metshdr/CSIP12/valid/mets-xml_metsHdr_agent_TYPE_exist.zip

Unzip the package:

unzip mets-xml_metsHdr_agent_TYPE_exist.zip

Delete the .zip-file you just downloaded:

rm mets-xml_metsHdr_agent_TYPE_exist.zip
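
The unzip and rm steps above can also be scripted, which is convenient when working through several packages from the test corpus. This is a minimal sketch using only the Python standard library; the function name extract_and_remove is hypothetical and not part of the validator.

```python
import zipfile
from pathlib import Path


def extract_and_remove(archive, target_dir="."):
    """Extract a .zip archive into target_dir, then delete the archive."""
    archive = Path(archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
    # Only reached if extraction succeeded, so a broken archive is kept.
    archive.unlink()
    return Path(target_dir)


if __name__ == "__main__":
    name = "mets-xml_metsHdr_agent_TYPE_exist.zip"
    if Path(name).is_file():
        extract_and_remove(name)
```

The archive is deleted only after extraction has finished, so a corrupt download is left in place for inspection.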

Run the ip-check:

ip-check mets-xml_metsHdr_agent_TYPE_exist/

Result:

('Path mets-xml_metsHdr_agent_TYPE_exist/ is dir, struct result is: '
 'StructureStatus.WellFormed')
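
If you want to act on the result in a script, one option is to run ip-check as a subprocess and look for the status text. The sketch below assumes the output format shown above stays as it is; the helper names is_well_formed and check_package are hypothetical, and the validator may offer a better programmatic interface than parsing its console output.

```python
import subprocess


def is_well_formed(output):
    """Return True if the ip-check output reports a well-formed structure."""
    # Assumption: the status appears as the literal text shown in this README.
    return "StructureStatus.WellFormed" in output


def check_package(path):
    """Run ip-check on `path` and report whether the structure is well formed."""
    completed = subprocess.run(
        ["ip-check", path],
        capture_output=True,
        text=True,
        check=False,
    )
    # Look at both streams, since it is not documented here which one is used.
    return is_well_formed(completed.stdout + completed.stderr)
```

For example, check_package("mets-xml_metsHdr_agent_TYPE_exist/") can be called from the activated virtual environment, where the ip-check command is available.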

A note on testing a directory

If the path passed is a directory, it must contain a single folder which contains the information package (and no other files or folders):

user@machine:~$ tree input
<path to directory>
  ├── documentation
  ├── metadata
  ├── METS.ipxml
  ├── representations
  │   └── rep1
  │       ├── data
  │       ├── metadata
  │       └── METS.ipxml
  └── schemas
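
The rule stated above (a single folder and nothing else) can be checked before the validator is run. This is a minimal sketch of that rule only, not the validator's own logic; the function name find_package_folder is hypothetical.

```python
from pathlib import Path


def find_package_folder(directory):
    """Return the single sub-folder of `directory`, or raise ValueError."""
    entries = list(Path(directory).iterdir())
    # Exactly one entry is allowed, and it has to be a folder.
    if len(entries) != 1 or not entries[0].is_dir():
        raise ValueError(
            f"{directory} must contain exactly one folder and nothing else"
        )
    return entries[0]
```

Calling find_package_folder("input") returns the path of the folder holding the information package, or raises ValueError if stray files or extra folders are present.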

Installing Windows Subsystem for Linux (WSL)

If you do not have Linux and have not previously used WSL please perform the following steps. You must either be logged in as Administrator on the machine or as a user with Administrator rights on the machine.

Start a command prompt (cmd.exe) and then enter the following command:

wsl --install

Confirm that the app is allowed to make changes to your device. Installation begins.

Confirm once more that an app is allowed to make changes to your device.

Retrieving and installing the necessary components takes a while. Please do not reboot or shut down your computer during this process. Even if it seems stalled, it is working.

Installation concludes with the message: "The requested operation is successful. Changes will not be effective until the system is rebooted."

Please reboot your computer.

After reboot

You will be prompted to create a new "UNIX username". By convention this is often an all-lowercase username of fewer than nine characters. It does not need to match your Windows username.

You will be prompted to set a password.

You are now logged into Ubuntu (the default Linux distribution used by Windows Subsystem for Linux).

Update the system

No matter how fresh the install, there will almost always be updates available. To fetch the list of available updates, enter the following:

sudo apt update

And to install them:

sudo apt upgrade

Confirm that you wish to upgrade your packages by pressing Y followed by ENTER.

Please resume the guide above.

For Developers

Developers should also install the testing dependencies (e.g. pytest) and use the --editable flag, so that changes to the source code take effect without reinstalling:

pip install -U pip
pip install --editable ".[testing]"

Running tests

You can run unit tests from the project root:

pytest ./tests/

To generate test coverage figures:

pytest --cov=ip_validation ./tests/

To see which parts of your code aren't tested, generate an HTML report:

pytest --cov=ip_validation --cov-report=html ./tests/

After this you can open the file <projectRoot>/htmlcov/index.html in your browser and survey the gory details.

Contributors

carlwilson, shsdev, aebkmd, scfkmd
