Managing datasets in Omeka-S

This howto describes how to manage dataset descriptions in Omeka-S as specified in Requirements for datasets. It also documents the implementation of datasets for the Gouda Timemachine.

Configuration

The dataset descriptions use the schema.org vocabulary. Importing the Vocabulary Definition Files from schema.org is a prerequisite.

Some properties of a dataset are based on a custom vocabulary, which requires the Custom Vocab module:

The resource templates necessary for managing datasets are:

Data entry

Because the resources are linked as Omeka items, the order of data entry is bottom-up.

First, add the organizations that are to be used in the dataset(s).
Then, per dataset, make entries for the distributions.
Next, describe the datasets, linking them to distributions (Items) and organizations (Items).
Finally, create a data catalog, which includes links to all datasets (Items).
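
The bottom-up order can be illustrated with a small sketch. It assumes a hypothetical map of which resource type links to which (the labels are illustrative, not Omeka-S identifiers) and derives the order of data entry with a topological sort, so that every linked-to Item exists before the Item that links to it.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each resource type lists the types it links to.
# The labels are illustrative only; they are not Omeka-S identifiers.
LINKS = {
    "organization": set(),
    "distribution": set(),
    "dataset": {"distribution", "organization"},
    "data catalog": {"dataset"},
}


def entry_order(links):
    """Return resource types so that linked-to types come before their linkers."""
    return list(TopologicalSorter(links).static_order())


order = entry_order(LINKS)
print(order)
```

Organizations and distributions do not depend on each other, so their relative position in the result may vary; the data catalog always comes last.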

Datadump

To make a datadump of the data catalog, including all datasets, distributions and organizations, a simple crawler is provided. This PHP-based crawler uses the EasyRdf library to collect all resources via the Omeka-S API. Besides fetching all (sub-)resources, the crawler removes all Omeka-S properties and classes.
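
The idea behind the crawler can be sketched in a few lines. This is not the provided PHP crawler but a minimal Python illustration under stated assumptions: Omeka-S terms are assumed to appear in compact form with the `o:` prefix, `fetch` is a hypothetical caller-supplied function that returns the JSON-LD of one resource, and `api_base` is a hypothetical URI prefix used to tell API resources apart from external URIs. The example data is made up and served from memory instead of a live API.

```python
OMEKA_PREFIX = "o:"  # assumption: Omeka-S terms are compacted with this prefix


def strip_omeka(node):
    """Recursively drop Omeka-S properties and classes from JSON-LD-like data."""
    if isinstance(node, list):
        return [strip_omeka(item) for item in node]
    if not isinstance(node, dict):
        return node
    cleaned = {}
    for key, value in node.items():
        if key.startswith(OMEKA_PREFIX):
            continue  # Omeka-S property such as "o:id"
        if key == "@type":
            types = [value] if isinstance(value, str) else list(value)
            kept = [t for t in types if not t.startswith(OMEKA_PREFIX)]
            if kept:
                # Preserve the original shape: string stays string, list stays list.
                cleaned[key] = kept[0] if isinstance(value, str) else kept
            continue
        cleaned[key] = strip_omeka(value)
    return cleaned


def linked_uris(node, api_base):
    """Yield every "@id" value below `node` that points into the API."""
    if isinstance(node, list):
        for item in node:
            yield from linked_uris(item, api_base)
    elif isinstance(node, dict):
        for key, value in node.items():
            if key == "@id" and isinstance(value, str) and value.startswith(api_base):
                yield value
            else:
                yield from linked_uris(value, api_base)


def crawl(start_uri, fetch, api_base):
    """Collect the start resource and all (sub-)resources it links to."""
    resources, todo = {}, [start_uri]
    while todo:
        uri = todo.pop()
        if uri in resources:
            continue  # already collected
        resources[uri] = strip_omeka(fetch(uri))
        todo.extend(linked_uris(resources[uri], api_base))
    return resources


# Made-up example data standing in for API responses.
API = "https://example.org/api/items/"
FAKE_API = {
    API + "1": {"@id": API + "1", "@type": ["o:Item", "schema:Dataset"], "o:id": 1,
                "schema:publisher": [{"@id": API + "2"}]},
    API + "2": {"@id": API + "2", "@type": ["o:Item", "schema:Organization"], "o:id": 2,
                "schema:name": [{"@value": "Example organization"}]},
}
result = crawl(API + "1", FAKE_API.__getitem__, API)
```

Because the Omeka-S keys are stripped before links are followed, only resources referenced from the remaining properties are collected.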

Publishing

The datadumps (in Turtle, N-Triples, JSON-LD and RDF/XML) are stored in the files directory of Omeka-S, which makes them available for download.

Findability

One way to make the dataset description findable is to use the /.well-known/datacatalog Well-Known Path Prefix.

This can be configured at the web server level (see Apache example for Gouda Timemachine), including content negotiation. In a web browser, the result is a redirect from /.well-known/datacatalog to the Item page of the data catalog.

When an Accept header is provided, a Turtle, N-Triples, JSON-LD or RDF/XML serialization can be retrieved instead. Example:

curl -L -H "Accept: text/turtle" https://www.goudatijdmachine.nl/.well-known/datacatalog
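
The content negotiation itself is done by the web server; the decision it makes can be sketched as follows. The media types are the registered ones for the four serializations, while the file names and the `item-page` placeholder are hypothetical. For brevity the sketch picks the first supported media type in the Accept header and ignores q-values, which a real server would take into account.

```python
# Registered media types for the four serializations.
# The file names are hypothetical placeholders for the stored datadumps.
DUMPS = {
    "text/turtle": "datacatalog.ttl",
    "application/n-triples": "datacatalog.nt",
    "application/ld+json": "datacatalog.jsonld",
    "application/rdf+xml": "datacatalog.rdf",
}


def negotiate(accept_header, html_target="item-page"):
    """Return the datadump for the first supported media type, else html_target.

    Simplification: q-values and wildcards in the Accept header are ignored.
    """
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type in DUMPS:
            return DUMPS[media_type]
    return html_target


print(negotiate("text/turtle"))
print(negotiate("text/html,application/xhtml+xml"))
```

A request without a supported media type, such as one from a web browser, falls through to the Item page of the data catalog.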

What's missing

An Omeka-S item HTML page with a complete dataset description (in a JSON-LD script block), instead of the default Omeka-S "shallow" version (no sub-resources).
Use of an organizational URI strategy or clean URL module; currently only "Omeka-S API URIs" are used.
