FAIRY

A Java command line tool to make datasets findable on a global scale in a FAIR way

This tool can be used to make datasets findable on the World Wide Web.
FAIRY creates landing pages for the provided datasets. These landing pages contain Schema.org markup with properties proposed by the Bioschemas.org dataset profile. The landing pages and the dataset markup can then be found by search engines.

The website hosting all the landing pages is the FAIRY Website.
All dataset landing pages are accessible from this page: Dataset Navigation Page

How to run FAIRY

Download

FAIRY can be run from the compiled, executable Java archive (JAR) that is available in this repository. The JAR file is called FAIRY-1.0.0-jar-with-dependencies.jar and is executed via the command line. The command either has to be run from the directory that contains the JAR file, or the path to the JAR file has to be given in the command. With this JAR file, FAIRY can be used with a few commands and options.

It is also possible to use FAIRY by downloading the source code and building it locally on your computer, for example with an IDE.

Requirements

To use the tool, you need permission to connect to the server that hosts the landing pages. This server can only be reached from within the VPN of the University of Tübingen.

If you are allowed to transfer files to the server, FAIRY authenticates you with SSH key authentication, so you only need to provide your username. For this to work, your private and public keys need to be stored in the default locations ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub.

Before running FAIRY for the first time, the host's fingerprint needs to be added to your known_hosts file. Here, the host is fair.qbic.uni-tuebingen.de. To add this host to your known_hosts file, run the following line in the command line:

ssh-keyscan -t rsa fair.qbic.uni-tuebingen.de >> ~/.ssh/known_hosts

Additionally, you need to have Java JRE or JDK installed to run the tool.
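
The requirements above can be checked before the first run. The following is a minimal sketch, not part of FAIRY: the function names (has_default_keys, host_is_known, has_java, preflight) are hypothetical helpers, and the sketch assumes that ssh-keygen from OpenSSH is installed. It cannot check VPN access or server permissions.

```shell
#!/bin/sh
# Pre-flight checks for the requirements listed above.
# All function names are hypothetical helpers, not part of FAIRY.

FAIRY_HOST="fair.qbic.uni-tuebingen.de"

# Succeeds if both default RSA key files exist in the given directory
# (defaults to ~/.ssh).
has_default_keys() {
    ssh_dir="${1:-$HOME/.ssh}"
    [ -f "$ssh_dir/id_rsa" ] && [ -f "$ssh_dir/id_rsa.pub" ]
}

# Succeeds if the host ($1) has an entry in the known_hosts file ($2,
# defaults to ~/.ssh/known_hosts). ssh-keygen -F also finds hashed entries.
host_is_known() {
    known_hosts="${2:-$HOME/.ssh/known_hosts}"
    [ -f "$known_hosts" ] && ssh-keygen -F "$1" -f "$known_hosts" >/dev/null 2>&1
}

# Succeeds if a java executable is on the PATH.
has_java() {
    command -v java >/dev/null 2>&1
}

# Prints one line per requirement that is not met; prints nothing otherwise.
preflight() {
    has_default_keys || echo "missing ~/.ssh/id_rsa or ~/.ssh/id_rsa.pub"
    host_is_known "$FAIRY_HOST" || echo "$FAIRY_HOST is not in known_hosts"
    has_java || echo "java not found on PATH"
}
```

After sourcing this file, calling preflight lists any requirement that still needs attention.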

How to use FAIRY

Since FAIRY is a command line tool, it is run by passing commands and options to the FAIRY JAR on the command line. To get an overview of all available commands and options, run java -jar FAIRY-1.0.0-jar-with-dependencies.jar -h. This produces the following output:

Usage: [COMMAND] -u=<username> -f=<tsvFile> [-hV]
-f, --file=<tsvFile>    The path to a tsv file describing dataset metadata
-h, --help              Show this help message and exit.
-u, --username=<user>   Username for connecting to the server
-V, --version           Print version information and exit.
Commands:
create  Create a new landing page for datasets

The Create Command

To create landing pages for datasets, use the create command. This command requires two options: the path to the metadata file, given with -f or --file, and a username, given with -u or --username.

An example for a full create command would be:

java -jar FAIRY-1.0.0-jar-with-dependencies.jar create -f ..\DatasetMetadataFile.tsv -u username

With this command, FAIRY will create landing pages with the information specified in the DatasetMetadataFile.tsv file for each dataset represented in this file.
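
On Linux or macOS the path to the metadata file uses forward slashes, for example ../DatasetMetadataFile.tsv. The call can also be wrapped in a small function that checks its arguments first. The sketch below is not part of FAIRY: the function name fairy_create and the variables FAIRY_JAR and FAIRY_DRY_RUN are assumptions made for illustration. Only the create command and the -f and -u options come from the help output above.

```shell
#!/bin/sh
# Hypothetical wrapper around the create command; not part of FAIRY.
# FAIRY_JAR may point to the JAR file if it is not in the current directory.
FAIRY_JAR="${FAIRY_JAR:-FAIRY-1.0.0-jar-with-dependencies.jar}"

# Usage: fairy_create <tsvFile> <username>
# Fails early if the TSV file is not readable or the username is empty.
# If FAIRY_DRY_RUN is set, the command is only printed, not executed.
fairy_create() {
    tsv="$1"
    user="$2"
    if [ ! -r "$tsv" ]; then
        echo "cannot read TSV file: $tsv" >&2
        return 1
    fi
    if [ -z "$user" ]; then
        echo "username is required" >&2
        return 1
    fi
    if [ -n "${FAIRY_DRY_RUN:-}" ]; then
        echo "java -jar $FAIRY_JAR create -f $tsv -u $user"
    else
        java -jar "$FAIRY_JAR" create -f "$tsv" -u "$user"
    fi
}
```

Setting FAIRY_DRY_RUN=1 before calling fairy_create ../DatasetMetadataFile.tsv username prints the command without contacting the server.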

Input file

The file provided with the -f or --file option currently has to be in TSV format. The only property that is required for the tool to work is identifier. This property should uniquely identify the corresponding dataset in this context. To provide rich metadata for the datasets, the following properties, which are currently supported by FAIRY, should be added as well:

  • description : Text, at least 50 characters
  • name : Text
  • license : URL
  • keywords : Text, comma-separated
  • creator : Text
  • measurementTechnique : Text
  • dateCreated : Date

These properties will be represented in the landing page markup in their expected types. Other properties describing datasets can also be provided, but they will only be represented as type text in the markup.
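
To give an impression of what typed Schema.org markup for these properties can look like, the sketch below prints a hand-written JSON-LD description of a dataset. It is an illustration only: the function name print_dataset_markup_sketch and all values are placeholders, and the use of JSON-LD and of a Person object for creator are assumptions. The markup that FAIRY actually generates may use a different syntax and structure.

```shell
#!/bin/sh
# Prints a hypothetical JSON-LD description of a dataset.
# All values are placeholders; the markup written by FAIRY may differ.
print_dataset_markup_sketch() {
    cat <<'EOF'
{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "identifier": "EXAMPLE-0001",
  "name": "Example dataset",
  "description": "A placeholder description that is longer than fifty characters in total.",
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "keywords": "example, placeholder",
  "creator": { "@type": "Person", "name": "Jane Doe" },
  "measurementTechnique": "mass spectrometry",
  "dateCreated": "2021-01-01"
}
EOF
}
```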

The TSV file needs to have the Schema.org property names in its first line. Every following line contains the metadata for one dataset. An example of such a file can be found here: Example for a TSV metadata file
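
A metadata file can be checked against the rules above before it is passed to FAIRY. The following sketch is not part of FAIRY. The function names write_example_tsv and check_tsv are hypothetical, and all values in the example file are made-up placeholders. Whether FAIRY itself rejects files that break these rules is not stated here, so the check simply follows the description of the input file.

```shell
#!/bin/sh
# Writes a small example TSV file to the path given as $1.
# All values are made-up placeholders.
write_example_tsv() {
    printf 'identifier\tname\tdescription\tlicense\tkeywords\tcreator\tmeasurementTechnique\tdateCreated\n' > "$1"
    printf 'EXAMPLE-0001\tExample dataset\tA placeholder description that is longer than fifty characters in total.\thttps://creativecommons.org/licenses/by/4.0/\texample, placeholder\tJane Doe\tmass spectrometry\t2021-01-01\n' >> "$1"
}

# Checks the TSV file given as $1:
#   - the header must contain an identifier column
#   - every row needs a non-empty, unique identifier
#   - a description, if the column exists, needs at least 50 characters
# Prints one message per problem and returns non-zero if any was found.
check_tsv() {
    awk -F '\t' '
        NR == 1 {
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("identifier" in col)) {
                print "header lacks an identifier column"
                bad = 1
                exit
            }
            next
        }
        $(col["identifier"]) == "" {
            print "line " NR ": empty identifier"; bad = 1
        }
        seen[$(col["identifier"])]++ {
            print "line " NR ": duplicate identifier"; bad = 1
        }
        ("description" in col) && length($(col["description"])) < 50 {
            print "line " NR ": description shorter than 50 characters"; bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

Calling write_example_tsv example.tsv followed by check_tsv example.tsv prints nothing and returns successfully, because the example file satisfies all three rules.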

Additional Information

Proof-of-concept Implementation

This tool is currently a prototype that was created as part of a bachelor thesis. It therefore only has the functionality needed to make the tool work and to prove the concept developed in the thesis.

License

FAIRY can be used under the MIT license.
Other frameworks used for this tool have the following licenses:

Contributors

aline-9, sven1103
