sams-fedora-messaging

Data pipelines driven by Fedora's asynchronous messages

Installation

The messaging pipeline assumes that Fedora and Fuseki are both installed, that Fedora's Java Message Service (JMS) messaging is enabled, and that Fuseki is configured to provide a SPARQL 1.1 Graph Store Protocol endpoint at http://localhost:8080/fuseki/prov/data and a SPARQL Query Protocol endpoint at http://localhost:8080/fuseki/prov/query.
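Before going further, it may help to confirm that both Fuseki endpoints respond. The following is a minimal sketch, assuming curl is installed and Fuseki is running at the URLs given above. The function names and the FUSEKI_DATASET variable are illustrative only and are not part of this repository.

```shell
#!/bin/sh
# Base URL of the Fuseki dataset named in the installation notes.
# FUSEKI_DATASET is a hypothetical override; the handler does not read it.
FUSEKI_DATASET="${FUSEKI_DATASET:-http://localhost:8080/fuseki/prov}"

# Print the URL of the SPARQL Query Protocol endpoint.
query_endpoint() { printf '%s\n' "$FUSEKI_DATASET/query"; }

# Print the URL of the SPARQL 1.1 Graph Store Protocol endpoint.
graph_store_endpoint() { printf '%s\n' "$FUSEKI_DATASET/data"; }

# Send a trivial ASK query (URL-encoded POST) and print the HTTP status code.
check_query_endpoint() {
  curl -s -o /dev/null -w '%{http_code}\n' \
    --data-urlencode 'query=ASK {}' "$(query_endpoint)"
}

# Request the default graph as Turtle and print the HTTP status code.
check_graph_store_endpoint() {
  curl -s -o /dev/null -w '%{http_code}\n' \
    -H 'Accept: text/turtle' "$(graph_store_endpoint)?default"
}
```

After sourcing the file, running check_query_endpoint and check_graph_store_endpoint prints one status code each. A status in the 2xx range would suggest the endpoint is reachable, while 000 usually means curl could not connect at all.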

Check out the repository

$ git clone https://github.com/PublicRecordOfficeVictoria/sams-fedora-messaging.git

Install the csv2rdf Ruby gem, and verify it's installed

$ gem install specific_install
$ sudo gem specific_install -l https://github.com/theodi/csv2rdf
$ csv2rdf help
Usage:
  csv2rdf myfile.csv OR csv2json http://example.com/myfile.csv

Options:
  d, [--dump-errors], [--no-dump-errors]  # Pretty print error and warning objects.
  s, [--schema=FILENAME OR URL]           # Schema file
  v, [--validate], [--no-validate]        # Validate as well as transform
  f, [--full], [--no-full]                # Get full output rather than minimal output

Supports converting CSV files to JSON

Configure the system to run the script sams-fedora-messaging/start-update-handler.sh at boot time.
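How to run the script at boot depends on the host. One possible approach, sketched here under assumptions, is a cron @reboot entry. This assumes a cron implementation that supports @reboot, that the script should be started from the repository directory, and that the checkout path contains no spaces. The function names and the example path are hypothetical.

```shell
#!/bin/sh
# Print a crontab entry that starts the handler at boot.
# $1 is the directory the repository was cloned into (hypothetical path).
reboot_cron_line() {
  printf '@reboot cd %s && ./start-update-handler.sh\n' "$1"
}

# Append the entry to the current user's crontab, keeping existing entries.
install_reboot_entry() {
  { crontab -l 2>/dev/null; reboot_cron_line "$1"; } | crontab -
}
```

For example, install_reboot_entry /opt/sams-fedora-messaging would add the entry for a checkout in /opt. A systemd unit or an init script would serve equally well on systems that prefer them.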

Testing

Edit the files sample/create-container.sh and sample/ingest-csv-and-metadata.sh to set the correct domain name for the server. Run the first script to create the "sources/trains" folder, and the second script to upload the sample CSV file and its associated CSV Metadata file. Each time the second script is run, the CSV file should be reconverted to RDF and stored in the SPARQL Graph Store.
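To see whether the conversion produced anything, one option is to ask Fuseki which named graphs exist and how many triples each holds. This is a sketch only, assuming curl is installed and the query endpoint from the installation notes. The variable and function names are illustrative and do not come from the repository.

```shell
#!/bin/sh
# SPARQL Query Protocol endpoint from the installation notes.
QUERY_ENDPOINT="${QUERY_ENDPOINT:-http://localhost:8080/fuseki/prov/query}"

# SPARQL query listing each named graph with its triple count.
GRAPH_COUNT_QUERY='SELECT ?g (COUNT(*) AS ?triples) WHERE { GRAPH ?g { ?s ?p ?o } } GROUP BY ?g ORDER BY ?g'

# Run the query and print the result set as CSV.
list_named_graphs() {
  curl -s -H 'Accept: text/csv' \
    --data-urlencode "query=$GRAPH_COUNT_QUERY" "$QUERY_ENDPOINT"
}
```

Running list_named_graphs before and after the second sample script should make it apparent which graphs were created or refreshed.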

Contributors

conal-tuohy, danielwilksch

sams-fedora-messaging's Issues

Properly delete graphs when resources are deleted from Fedora

Currently, when a resource is created, the handler script is notified, and it creates or updates one or more corresponding resources in the graph store:

  • If the Fedora resource was RDF, it is retrieved and stored in the graph store with the same URI.
  • If the Fedora resource was a binary resource, then its metadata resource (whose URI is the same as the URI of the resource, with '/fcr:metadata' appended) is retrieved and stored in the graph store with the same URI (i.e. ending in '/fcr:metadata').
    • If the binary resource is CSV then the graph store is queried to find a linked CSV Metadata resource, then RDF is generated using that CSVM resource and the result is stored in the graph store under the CSVM resource's URI.
    • If the binary resource is CSV Metadata then RDF is generated using that CSVM resource and the result is stored in the graph store under the CSVM resource's URI.
    • Other binary resources (JPG, PDF, etc) don't currently produce an RDF graph in the graph store.
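The mapping described in the list above can be summarised as a small function. This is a sketch of the behaviour as described here, not the handler's actual code. The kind labels (rdf, csv, csvm, binary) and the function name are hypothetical, and the CSV Metadata URI for a CSV file is passed in as an argument rather than looked up in the graph store.

```shell
#!/bin/sh
# Print, one per line, the graph names the handler is described as writing.
#   $1  kind of Fedora resource: rdf, csv, csvm or binary (hypothetical labels)
#   $2  URI of the Fedora resource
#   $3  for csv only: URI of the linked CSV Metadata resource, if one was found
graphs_for_resource() {
  kind=$1
  uri=$2
  csvm_uri=${3:-}
  case "$kind" in
    rdf)
      # RDF resources are stored under their own URI.
      printf '%s\n' "$uri"
      ;;
    csv)
      # Metadata graph, plus the generated RDF under the CSVM resource's URI.
      printf '%s\n' "$uri/fcr:metadata"
      if [ -n "$csvm_uri" ]; then printf '%s\n' "$csvm_uri"; fi
      ;;
    csvm)
      # Metadata graph, plus the generated RDF under the CSVM resource's own URI.
      printf '%s\n' "$uri/fcr:metadata" "$uri"
      ;;
    binary)
      # Other binaries only get their metadata graph.
      printf '%s\n' "$uri/fcr:metadata"
      ;;
    *)
      echo "unknown resource kind: $kind" >&2
      return 1
      ;;
  esac
}
```

For instance, graphs_for_resource binary followed by the URI of a PDF prints only that URI with /fcr:metadata appended, which reflects the point that such files have no graph under their own URI.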

Currently, when the handler script receives notification from Fedora that a resource has been deleted, it always asks Fuseki to delete the Named Graph whose name is the URI of the deleted resource. However,

  • there may not be a corresponding graph to delete (e.g. for a PDF file)
  • the fcr:metadata resources are not deleted
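One way the two points above might be addressed is sketched below. It is not an agreed fix. It assumes the Graph Store Protocol endpoint from the installation notes, that curl is installed, and that a 404 response simply means there was no graph to delete. All function names and the GRAPH_STORE variable are hypothetical.

```shell
#!/bin/sh
# Graph Store Protocol endpoint from the installation notes.
GRAPH_STORE="${GRAPH_STORE:-http://localhost:8080/fuseki/prov/data}"

# Print the graph names to remove for a deleted Fedora resource:
# the resource URI itself and its fcr:metadata companion.
graph_names_to_delete() {
  printf '%s\n' "$1" "$1/fcr:metadata"
}

# Succeed for statuses meaning "deleted", "accepted" or "nothing to delete".
delete_status_ok() {
  case "$1" in
    200|202|204|404) return 0 ;;
    *) return 1 ;;
  esac
}

# Send a DELETE for each graph name, reporting only unexpected statuses.
delete_graphs_for_resource() {
  graph_names_to_delete "$1" | while IFS= read -r graph; do
    status=$(curl -s -o /dev/null -w '%{http_code}' -X DELETE -G \
      --data-urlencode "graph=$graph" "$GRAPH_STORE")
    delete_status_ok "$status" || echo "failed to delete $graph: HTTP $status" >&2
  done
}
```

Treating 404 as acceptable covers the PDF case, where no graph exists under the resource's own URI, while the second graph name covers the fcr:metadata graph that is currently left behind.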
