anuzzolese / pyrml

License: Apache License 2.0

pyrml's Introduction

pyRML

pyRML is a Python based engine for processing RML files. The RDF Mapping Language (RML) is a mapping language defined to express customized mapping rules from heterogeneous data structures and serializations to the RDF data model. RML is defined as a superset of the W3C-standardized mapping language R2RML, aiming to extend its applicability and broaden its scope, adding support for data in other structured formats.

Installation

pyRML requires Python 3. Once the source code has been downloaded, the Python package can be installed with pip. For example:

pip install .

Alternatively, it is possible to install the pyRML package directly from GitHub in the following way:

pip install git+https://github.com/anuzzolese/pyrml

Usage

It is possible to use pyRML either by means of its API or the command line tool that is provided along with the source package.

API

RMLConverter is the key class of pyRML. It accepts the path to an RML file as input and returns an RDF graph as output. The output graph is an instance of the class Graph provided by RDFLib.

from pyrml import RMLConverter
import os

# Create an instance of the class RMLConverter.
rml_converter = RMLConverter()

'''
Invoke the method convert on the instance of class RMLConverter by:
 - using the file examples/artists/artist-map.ttl (see the examples in this repo);
 - obtaining an RDF graph as output.
'''
rml_file_path = os.path.join('examples', 'artists', 'artist-map.ttl')
rdf_graph = rml_converter.convert(rml_file_path)

# Print the triples contained in the RDF graph.
for s,p,o in rdf_graph:
    print(s, p, o)

Command line tool

The command line tool is implemented by the script converter.py. The script can be used in the following way:

python converter.py [-o RDF out file] [-f RDF syntax] [-m] input

where:

  • the positional argument input is the RML mapping file that drives the RDF conversion;
  • the optional argument -o filename is the file in which to store the resulting RDF graph. If it is omitted, the graph is written to standard output;
  • the optional argument -f rdf-syntax specifies the syntax used to serialise the RDF graph. Possible values are n3, nquads, nt, pretty-xml, trig, trix, turtle, and xml. If it is omitted, N-Triples is used;
  • the optional flag -m enables multiprocessing to speed up the transformation process.
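
As a rough illustration of the option list above, the following sketch declares the same arguments with argparse. It is not the actual converter.py: the function name build_parser, the destination names and the default value nt (taken here to be the value that corresponds to N-Triples) are assumptions made for the example.

```python
import argparse

# Serialisation names listed in the option description above.
RDF_FORMATS = ["n3", "nquads", "nt", "pretty-xml", "trig", "trix", "turtle", "xml"]


def build_parser():
    """Hypothetical parser mirroring the documented options (not the real converter.py)."""
    parser = argparse.ArgumentParser(prog="converter.py")
    parser.add_argument("-o", dest="output", metavar="FILE", default=None,
                        help="file to store the RDF graph (default: standard output)")
    parser.add_argument("-f", dest="format", metavar="SYNTAX", choices=RDF_FORMATS,
                        default="nt", help="serialisation syntax (assumed default: nt)")
    parser.add_argument("-m", dest="multiprocessing", action="store_true",
                        help="enable multiprocessing")
    parser.add_argument("input", help="input RML mapping file")
    return parser


# Parse the same arguments as the example invocation in this README.
args = build_parser().parse_args(
    ["-o", "artists_places.ttl", "-f", "turtle", "examples/artists/artist-map.ttl"]
)
print(args.output, args.format, args.multiprocessing, args.input)
```

Restricting -f with choices makes an unsupported syntax fail at parse time rather than during serialisation.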

The following example shows how to use the command line tool to process the RML file available in examples/artists/artist-map.ttl, converting the CSV files examples/artists/Artist.csv and examples/artists/Place.csv into an RDF graph serialised as Turtle and stored in the file artists_places.ttl.

python converter.py -o artists_places.ttl -f turtle examples/artists/artist-map.ttl

pyrml's People

Contributors

anuzzolese, rjgladish, sgiulio70

pyrml's Issues

Install issue

Hey !

A lightweight RML mapper in Python, that ought to be tested 😃

But the installation fails, apparently because of the recent deprecation of soft_unicode.

Issue description

Can't import RMLConverter in a fresh conda env with Python 3.10

Steps to reproduce the issue

  1. conda create --name pyrml310 python=3.10
  2. conda activate pyrml310
  3. pip install git+https://github.com/anuzzolese/pyrml
  4. In a new .py file:
from pyrml import RMLConverter
import os

# Create an instance of the class RMLConverter.
rml_converter = RMLConverter()

What's the actual result?

Error output :

Traceback (most recent call last):
  File "c:\_Archi\Etudes\21ASTRA_\4b-Database-masonry\0-Essais\rml.py", line 1, in <module>
    from pyrml import RMLConverter
  File "C:\Users\antgr\anaconda3\envs\pyrml310\lib\site-packages\pyrml\__init__.py", line 1, in <module>
    from pyrml.pyrml import *
  File "C:\Users\antgr\anaconda3\envs\pyrml310\lib\site-packages\pyrml\pyrml.py", line 34, in <module>
    from jinja2 import Environment, FileSystemLoader, Template
  File "C:\Users\antgr\anaconda3\envs\pyrml310\lib\site-packages\jinja2\__init__.py", line 12, in <module>
    from .environment import Environment
  File "C:\Users\antgr\anaconda3\envs\pyrml310\lib\site-packages\jinja2\environment.py", line 25, in <module>
    from .defaults import BLOCK_END_STRING
  File "C:\Users\antgr\anaconda3\envs\pyrml310\lib\site-packages\jinja2\defaults.py", line 3, in <module>
    from .filters import FILTERS as DEFAULT_FILTERS  # noqa: F401
  File "C:\Users\antgr\anaconda3\envs\pyrml310\lib\site-packages\jinja2\filters.py", line 13, in <module>
    from markupsafe import soft_unicode
ImportError: cannot import name 'soft_unicode' from 'markupsafe' (C:\Users\antgr\anaconda3\envs\pyrml310\lib\site-packages\markupsafe\__init__.py)

Cheers !
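
The traceback is consistent with a Jinja2 release older than 3.0, which imports soft_unicode from markupsafe, installed alongside MarkupSafe 2.1 or later, where that name was removed. The sketch below checks an environment for that combination; the helper names are made up for the example, and it only diagnoses the pairing rather than fixing it.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Turn '2.11.3' into (2, 11, 3), keeping only the leading digits of each part."""
    numbers = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def soft_unicode_conflict(jinja2_version, markupsafe_version):
    """True when an old Jinja2 (< 3.0) is paired with MarkupSafe >= 2.1."""
    return (version_tuple(jinja2_version) < (3, 0)
            and version_tuple(markupsafe_version) >= (2, 1))


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


jinja2_v = installed_version("jinja2")
markupsafe_v = installed_version("markupsafe")
if jinja2_v and markupsafe_v:
    print("jinja2", jinja2_v, "markupsafe", markupsafe_v,
          "conflict:", soft_unicode_conflict(jinja2_v, markupsafe_v))
else:
    print("jinja2 and/or markupsafe not installed in this environment")
```

If the check reports a conflict, two commonly used remedies are upgrading Jinja2 to a 3.x release or pinning markupsafe to 2.0.1. Whether pyRML itself works unchanged with Jinja2 3.x is not confirmed in this thread.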

Error: 'NoneType' object is not iterable

  File "/Users/user_name/path/to/pyrml/pyrml_core.py", line 1280, in apply
    return np.array([Function(row).evaluate() for row in pom_matrix], dtype=Function)
TypeError: 'NoneType' object is not iterable

Input file path and type not abstracted from rml mapping

Currently the input file can't be parameterised via the CLI or the API. It is hardcoded in the mapping file. E.g.:

rml:logicalSource [ 
    rml:source "./examples/artists/Artist.csv" ;
    rml:referenceFormulation ql:CSV
  ]

It would be more flexible to be able to provide this as a parameter.
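
Until such a parameter exists, one possible workaround is to rewrite the rml:source literals in a copy of the mapping before handing it to the converter. The sketch below does this with a regular expression; override_sources and write_patched_mapping are hypothetical helpers, not part of pyRML, and the pattern assumes the source is written as a plain double-quoted string as in the snippet above.

```python
import re
import tempfile
from pathlib import Path

# Matches: rml:source "<path>"  (plain double-quoted literals only).
SOURCE_PATTERN = re.compile(r'(rml:source\s+)"([^"]*)"')


def override_sources(mapping_text, overrides):
    """Replace each rml:source literal found in overrides; leave the others untouched."""
    def swap(match):
        original = match.group(2)
        return '{}"{}"'.format(match.group(1), overrides.get(original, original))
    return SOURCE_PATTERN.sub(swap, mapping_text)


def write_patched_mapping(mapping_path, overrides):
    """Write a patched copy of the mapping to a temporary .ttl file and return its path."""
    text = Path(mapping_path).read_text(encoding="utf-8")
    with tempfile.NamedTemporaryFile("w", suffix=".ttl", delete=False,
                                     encoding="utf-8") as handle:
        handle.write(override_sources(text, overrides))
        return handle.name


example = '''rml:logicalSource [
    rml:source "./examples/artists/Artist.csv" ;
    rml:referenceFormulation ql:CSV
  ]'''
# "/data/Artist.csv" is a made-up replacement path.
print(override_sources(example, {"./examples/artists/Artist.csv": "/data/Artist.csv"}))
```

The path returned by write_patched_mapping could then be passed to RMLConverter.convert in place of the original mapping file.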

Change the name of the package and make a release

Hi, thanks for your work. I would like to use it in a project, because this repo is the way to work with RML from Python.

But there is no release and no PyPI upload, so one can only install an unstable branch directly from GitHub.
Also, there cannot be a PyPI upload, because there is already a package with this name:
https://pypi.org/project/pyrml/

So I would suggest you consider a name change prior to a release.

Best regards

Error in Importing RMLConverter

There is an error on import, which makes it impossible to run anything.

Traceback (most recent call last):
  File "converter.py", line 2, in <module>
    from pyrml import RMLConverter
  File "/pyrml/pyrml/__init__.py", line 1, in <module>
    from pyrml.pyrml import *
  File "/pyrml/pyrml/pyrml.py", line 52
    def __init__(self, map_id: URIRef = None):
                             ^
SyntaxError: invalid syntax

Can you explain how to fix it?
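
A SyntaxError pointing at the colon of a parameter annotation such as map_id: URIRef = None is what a Python 2 interpreter reports, since function annotations exist only in Python 3, which the README lists as a requirement. That is the most likely cause here, although the traceback does not show the interpreter version. The small check below (supports_annotations is a name made up for the example) is written without annotations so that it also runs on Python 2.

```python
import sys


def supports_annotations(version_info=sys.version_info):
    """Function annotations such as `map_id: URIRef = None` need Python 3."""
    return version_info[0] >= 3


if supports_annotations():
    print("Interpreter OK: Python %d.%d" % tuple(sys.version_info[:2]))
else:
    print("pyRML requires Python 3; this interpreter is %d.%d"
          % tuple(sys.version_info[:2]))
```

If the check reports Python 2, running the script with a Python 3 interpreter (on many systems the command is python3 rather than python) should get past this error.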

Foreign key resolution with CSV

I noticed an issue with mapping CSVs, where if different CSV files use the same column name as primary key, then the foreign key resolution fails.

To reproduce, simply rename the CODE column in the artist example to ID, in both the CSV and artist-map.ttl.

The output becomes:

ns1:rna29 a ns1:Person ;
    ns1:birth_date "1929-12-06" ;
    ns1:birth_place ns1:rna29 ;
    ns1:fullName "Ronald Anderson" .

ns1:rtm19 a ns1:Person ;
    ns1:birth_date "1919-12-23" ;
    ns1:birth_place ns1:rtm19 ;
    ns1:fullName "Robert Theodore McCall" .

As you can see, the birth_place objects are incorrect, as they refer to the artists themselves, instead of the places.
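
One plausible way for such a collision to arise, offered as a guess and not taken from pyRML's code, is that the child row and the joined parent row end up in a single flat lookup, so that the child's ID shadows the parent's. The sketch below reproduces that effect with made-up rows, column names and an ns1:{ID} template, and contrasts it with evaluating the object template against the parent row only.

```python
# Hypothetical rows: column names and values are invented for illustration.
artist = {"ID": "rna29", "NAME": "Ronald Anderson", "BIRTH_PLACE": "p1"}
places = [{"ID": "p1", "NAME": "Place One"}, {"ID": "p2", "NAME": "Place Two"}]


def join(child, parents, child_key, parent_key):
    """Return the parent rows whose parent_key equals the child's child_key."""
    return [parent for parent in parents if parent[parent_key] == child[child_key]]


def naive_object(child, parent, template):
    """Flatten both rows into one dict: the child's ID shadows the parent's ID."""
    merged = dict(parent)
    merged.update(child)
    return template.format(**merged)


def scoped_object(child, parent, template):
    """Evaluate the object template against the parent row only."""
    return template.format(**parent)


for place in join(artist, places, "BIRTH_PLACE", "ID"):
    print("naive: ", naive_object(artist, place, "ns1:{ID}"))
    print("scoped:", scoped_object(artist, place, "ns1:{ID}"))
```

With distinct column names (CODE in one file, ID in the other) the flattened lookup has no clash, which would match the observation that the original example works.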
