maastrichtu-ids / into-the-graph

🧭 Resolve, visualize and browse the content of any SPARQL endpoint

Home Page: https://maastrichtu-ids.github.io/into-the-graph

License: MIT License

Dockerfile 0.34% JavaScript 0.10% CSS 0.55% TypeScript 99.01%

sparql-endpoints knowledge-graph knowledge-discovery linked-data linked-open-data

into-the-graph's Introduction

🧭 Lightweight linked data browser

A lightweight, fast RDF browser that needs only a SPARQL endpoint URL to provide a comfortable experience when exploring different graphs.

Browse an RDF triplestore by providing the SPARQL endpoint URL directly in the browser.

See an example deployment of into-the-graph at https://maastrichtu-ids.github.io/into-the-graph.

This RDF linked data browser features:

- A web-based UI to easily browse the content of SPARQL endpoints. A stateful URL that resolves a specific URI in a specific SPARQL endpoint can be defined using the uri and endpoint parameters. Tested with RDF4J (Ontotext GraphDB) and Virtuoso SPARQL endpoints.
- Easy search for concepts in the triplestore. The SPARQL query used for search can be changed in the settings to use the search index of a specific triplestore (Ontotext GraphDB and Virtuoso are documented).
- Work in progress: insights about the content of the triplestore and its different graphs, using precomputed HCLS descriptive statistics.
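
The stateful URL described above can be sketched as follows. This is a minimal illustration, not the project's own code: the helper names `buildDescribeUrl` and `parseDescribeUrl` are hypothetical, and the `/describe` path and the `uri` and `endpoint` parameter names are taken from the example links in this README.

```typescript
// Hypothetical helpers (not part of into-the-graph) for the stateful describe URL.
// The base URL is the public deployment mentioned in this README.
const DEFAULT_BASE = "https://maastrichtu-ids.github.io/into-the-graph";

// Build a link that resolves `uri` in the SPARQL endpoint `endpoint`.
function buildDescribeUrl(uri: string, endpoint: string, base: string = DEFAULT_BASE): string {
  const url = new URL(`${base.replace(/\/+$/, "")}/describe`);
  // searchParams.set percent-encodes the values, so nested URLs stay unambiguous.
  url.searchParams.set("uri", uri);
  url.searchParams.set("endpoint", endpoint);
  return url.toString();
}

// Read the two parameters back from an existing link (null when absent).
function parseDescribeUrl(link: string): { uri: string | null; endpoint: string | null } {
  const params = new URL(link).searchParams;
  return { uri: params.get("uri"), endpoint: params.get("endpoint") };
}
```

Percent-encoding the parameter values is a design choice of this sketch; the example links in this README pass them unencoded, and `parseDescribeUrl` reads both forms.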

HCLS descriptive statistics for a graph can be computed and inserted by running a single docker run command. Follow those instructions to run it, or check the fair-metadata Python library, which is in development.

Into-the-graph is built with TypeScript, ReactJS, and Material-UI to serve and explore RDF data from any SPARQL endpoint.

This service has been developed and used as part of the Data2Services framework. Data2Services provides tools and guidelines to easily integrate multiple structured data sources (CSV, RDB, XML) into an RDF knowledge graph, complying with a defined data model. Check out the documentation at d2s.semanticscience.org

👨‍💻 Contribute

Contributions are welcome! See the guidelines to contribute.

🎬 Into-the-graph in action

⚠️ Querying an HTTP SPARQL endpoint from an into-the-graph deployment served over HTTPS might cause issues, because browsers restrict requests from HTTPS pages to HTTP resources.
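
A minimal sketch of how such a conflict could be detected before sending a query. The function name `isBlockedMixedContent` is hypothetical, and the loopback exemption is an assumption about browser behavior (most current browsers treat localhost as trustworthy), not something into-the-graph is known to implement.

```typescript
// Hypothetical check: would a browser likely block this request as mixed content?
function isBlockedMixedContent(pageUrl: string, endpointUrl: string): boolean {
  const page = new URL(pageUrl);
  const endpoint = new URL(endpointUrl);
  // Only an HTTPS page calling an HTTP endpoint is a mixed-content situation.
  if (page.protocol !== "https:" || endpoint.protocol !== "http:") {
    return false;
  }
  // Assumption: loopback hosts are exempt, as in most current browsers.
  const loopbackHosts = ["localhost", "127.0.0.1", "[::1]"];
  return loopbackHosts.indexOf(endpoint.hostname) === -1;
}
```

For instance, `isBlockedMixedContent("https://trek.semanticscience.org/sparql", "http://dbpedia.org/sparql")` returns `true`, while the same call with `http://localhost:8890/sparql` as the endpoint returns `false`.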

Start browsing various SPARQL endpoints using into-the-graph in one click:

A Clinical trial in Bio2RDF:
- http://maastrichtu-ids.github.io/into-the-graph/describe?uri=http://bio2rdf.org/clinicaltrials:NCT00209495&endpoint=https://bio2rdf.org/sparql
A Pathway in PathwayCommons:
- http://maastrichtu-ids.github.io/into-the-graph/describe?uri=http://identifiers.org/reactome/R-HSA-8852135&endpoint=http://rdf.pathwaycommons.org/sparql/
A Protein in NextProt:
- http://maastrichtu-ids.github.io/into-the-graph/describe?uri=http://nextprot.org/rdf/entry/NX_Q96Q91&endpoint=https://sparql.nextprot.org
A Gene-Disease association in the DisGeNET SPARQL endpoint:
- http://maastrichtu-ids.github.io/into-the-graph/describe?uri=http://rdf.disgenet.org/resource/gda/DGN06012220986003d9ecac664f0865140b&endpoint=http://rdf.disgenet.org/sparql/
A Protein (UniProt) in AgroLD:
- http://maastrichtu-ids.github.io/into-the-graph/describe?uri=http://purl.uniprot.org/uniprot/M7Y493&endpoint=http://sparql.southgreen.fr
A City in DBpedia
- http://maastrichtu-ids.github.io/into-the-graph/describe?uri=http://dbpedia.org/resource/Menton&endpoint=http://dbpedia.org/sparql
A Citation in the EU Law OpenCitation corpus
- http://maastrichtu-ids.github.io/into-the-graph/describe?uri=https://w3id.org/oc/index/coci/ci/020010000073609070863036303010963090209070963084905-02001000007362800000401006300010363000806006334&endpoint=http://opencitations.net/index/sparql
A Publication in the EU Cellar Law dataset:
- http://maastrichtu-ids.github.io/into-the-graph/describe?uri=http://publications.europa.eu/resource/cellar/c721f802-9ce7-11e7-b92d-01aa75ed71a1&endpoint=http://publications.europa.eu/webapi/rdf/sparql
- Browsing does not work well here because graphs are used for entities.
A City in the LOD SPARQL endpoint
- http://maastrichtu-ids.github.io/into-the-graph/describe?uri=http://www.wikidata.org/entity/Q180083&endpoint=http://lod.openlinksw.com/sparql
- Browsing does not work well here because graphs are used for entities.
A Dataset in OpenEuropa Joinup SPARQL
- http://maastrichtu-ids.github.io/into-the-graph/describe?uri=http://data.europa.eu/w21/dfba1169-806f-4c9e-a42e-a5c5830a2221&endpoint=https://joinup.ec.europa.eu/sparql/
A Statistic in the EU Open Data Portal
- http://maastrichtu-ids.github.io/into-the-graph/describe?uri=http://data.lod2.eu/scoreboard/indicators/FOA_cit_Country__of_pub_serv_for_citizen&endpoint=http://data.europa.eu/euodp/sparqlep

You can even use http://maastrichtu-ids.github.io/into-the-graph directly to browse a locally deployed endpoint, e.g. http://localhost:8890/sparql

🏗️ Run in development

Requirements: npm and yarn installed.

Clone the repository:

git clone https://github.com/MaastrichtU-IDS/into-the-graph
cd into-the-graph

Install dependencies 📥

yarn

Run the web app in development at http://localhost:19006

yarn dev

The website should reload automatically at each change to the code 🔃

Upgrade the package versions in yarn.lock

yarn upgrade perfect-graph unitx-ui unitx

🚀 Run in production

This website is automatically deployed by a GitHub Actions workflow to GitHub Pages at https://maastrichtu-ids.github.io/into-the-graph

You can build locally in the /web-build folder and serve it on http://localhost:5000 📦

yarn build
yarn serve

Or run directly using Docker 🐳 (requires docker installed)

docker-compose up

Check out the docker-compose.yml file to see how we run the Docker image.

🔎 Search queries

Optimized SPARQL queries to perform full-text search in different triplestores.

Generic

Supports the BioLink model (gets rdfs:label and bl:name)

SELECT ?foundUri ?foundLabel WHERE {?foundUri ?p ?foundLabel . VALUES ?p {<http://www.w3.org/2000/01/rdf-schema#label> <https://w3id.org/biolink/vocab/name>} . FILTER(isLiteral(?foundLabel)) FILTER contains(?foundLabel, '$TEXT_TO_SEARCH')} LIMIT 5
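
The `$TEXT_TO_SEARCH` placeholder has to be replaced by the user's input before the query is sent. Below is a minimal sketch of one way to do that safely; the function names are hypothetical and the escaping step is an assumption of this example, not a description of how into-the-graph handles it.

```typescript
// The generic search template from this README, with its $TEXT_TO_SEARCH placeholder.
const GENERIC_SEARCH_TEMPLATE =
  "SELECT ?foundUri ?foundLabel WHERE {?foundUri ?p ?foundLabel . " +
  "VALUES ?p {<http://www.w3.org/2000/01/rdf-schema#label> <https://w3id.org/biolink/vocab/name>} . " +
  "FILTER(isLiteral(?foundLabel)) FILTER contains(?foundLabel, '$TEXT_TO_SEARCH')} LIMIT 5";

// Escape characters that would break out of a single-quoted SPARQL string literal.
function escapeSparqlLiteral(text: string): string {
  return text
    .replace(/\\/g, "\\\\")
    .replace(/'/g, "\\'")
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r");
}

// Substitute every placeholder occurrence. split/join is used instead of
// String.replace so that "$" sequences in the input are never interpreted.
function buildSearchQuery(template: string, text: string): string {
  return template.split("$TEXT_TO_SEARCH").join(escapeSparqlLiteral(text));
}
```

Calling `buildSearchQuery(GENERIC_SEARCH_TEMPLATE, "aspirin")` yields the query above with `'aspirin'` as the search literal.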

GraphDB

See GraphDB full text search documentation to create a search index (uses Apache Lucene under the hood).

PREFIX luc: <http://www.ontotext.com/owlim/lucene#> SELECT ?foundUri ?foundLabel { ?foundLabel luc:searchIndex '$TEXT_TO_SEARCH*' . ?foundUri ?p ?foundLabel . } LIMIT 100

Order by Lucene score

PREFIX luc: <http://www.ontotext.com/owlim/lucene#> SELECT ?foundUri ?foundLabel { ?foundLabel luc:searchIndex '$TEXT_TO_SEARCH*' ; luc:score ?score . ?foundUri ?p ?foundLabel . } ORDER BY ?score LIMIT 100

DBpedia Virtuoso

SELECT ?foundUri ?foundLabel WHERE {?foundUri <http://www.w3.org/2000/01/rdf-schema#label> ?foundLabel . ?foundLabel bif:contains '$TEXT_TO_SEARCH' . } LIMIT 200

OpenCitation

SELECT ?foundUri ?foundLabel WHERE {?foundUri ?p ?foundLabel . VALUES ?p {<http://purl.org/spar/cito/hasCitationCreationDate> <http://purl.org/spar/cito/hasCitationTimeSpan>} . FILTER(str(?foundLabel) =  '$TEXT_TO_SEARCH')} LIMIT 5

Text operations are really slow on the OpenCitation SPARQL endpoint, so we use an exact match (mainly for dates).

into-the-graph's Issues

Change HCLS overview SPARQL queries from settings.json

To allow adapting the graph query in particular (which is cached in Virtuoso)

Look into Loupe Linked Data

Explore metrics about RDF datasets in web UI: http://loupe.linkeddata.es/loupe/index.jsp

Use Virtuoso and ElasticSearch

Ontology github: https://github.com/nandana/loupe-ontology
Paper: http://ceur-ws.org/Vol-1824/ldq_paper_1.pdf
API github: https://github.com/oeg-upm/loupe-api

The API is down: http://api.loupe.linkeddata.es/

API down + last commit in 2017

Try contacting [email protected] for the website source code

Improve tabs display

At the moment we use: https://github.com/reactjs/react-tabs
Should we improve its CSS?

Or switch to material-ui tabs?

Example: https://bit.dev/mui-org/material-ui/tabs

Much heavier to implement

Use React Native UI

https://callstack.github.io/react-native-paper

https://react-native-elements.github.io/react-native-elements

Add full text search

Open a paper with list of results in overlay:

https://stackoverflow.com/questions/42130822/display-a-overlay-when-input-is-clicked-in-react

Using Graph not supported by Wikidata endpoint

into-the-graph fails when querying the Wikidata endpoint because of the GRAPH ?graph { part, which is not supported in Blazegraph triple mode.

See https://query.wikidata.org/sparql

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE {
  graph ?g {
  <http://www.wikidata.org/entity/Q45> ?pred ?obj .
  } } LIMIT 10
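
One possible workaround is to generate the query with or without the GRAPH clause depending on the endpoint. The sketch below is a hypothetical helper, not the project's current behavior; the `useGraph` flag is assumed to come from a per-endpoint setting.

```typescript
// Hypothetical helper: build the statement query with or without a GRAPH clause.
function buildDescribeQuery(uri: string, useGraph: boolean, limit: number = 10): string {
  // Reject characters that are not allowed inside a SPARQL IRI reference.
  if (/[<>"{}|^`\\\s]/.test(uri)) {
    throw new Error(`Invalid URI for a SPARQL query: ${uri}`);
  }
  const pattern = `<${uri}> ?pred ?obj .`;
  // Endpoints in triple mode (such as Wikidata) need the plain pattern.
  const where = useGraph ? `GRAPH ?g { ${pattern} }` : pattern;
  return `SELECT * WHERE { ${where} } LIMIT ${limit}`;
}
```

With `useGraph` set to `false`, the Wikidata example becomes `SELECT * WHERE { <http://www.wikidata.org/entity/Q45> ?pred ?obj . } LIMIT 10`.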

Add Comunica Widget

For temporal SPARQL query over HDT (Memento) and GraphQL-LD

http://query.linkeddatafragments.org/
https://github.com/comunica/Tutorial-Comunica-Querying-Data/wiki/Comunica-tutorial:-Querying-Data#5-setting-up-a-web-client

Improve datatables

The current data table library is quite good, but it is not optimized for React and the display overflows the grid cell

MD bootstrap data tables do multi sort and are easy to deploy:

https://mdbootstrap.com/docs/react/tables/datatables/

Explore LDP Graphs

Virtuoso LDP generates a lot of graphs

See http://trek.semanticscience.org/describe?uri=http://w3id.org/FAIR_Training_LDP/DAV/home/LDP/Training/Murcia/marks_resource.ttl&endpoint=http://w3id.org/FAIR_COVID19/sparql/

A good way to navigate LDP for this use-case:

PREFIX ldp: <http://www.w3.org/ns/ldp#> 
PREFIX schema: <http://schema.org/> 
SELECT ?crateContainer WHERE { 
  GRAPH ?crateContainer { 
    ?crateContainer ldp:contains ?crate .
  } 
  GRAPH ?crate { ?s schema:studyDesign "Cohort" .} 
}

Should we have a switch to get all triples in graphs instead of just classes?

Split the SPARQL queries

We need to split the SPARQL query calls to get 500 statements where the URI is the subject, 500 where it is the predicate and 500 where it is the object

issue with http://localhost:3000/describe?uri=http://bio2rdf.org/mesh:D001172
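
A minimal sketch of the split, assuming one query per position with its own limit. The function name, the variable names and the use of a GRAPH clause are illustrative choices, not the project's actual queries.

```typescript
type Position = "subject" | "predicate" | "object";

// Hypothetical helper: one query per position of the URI in a statement,
// each with its own LIMIT, instead of a single combined query.
function buildSplitQueries(uri: string, limit: number = 500): Record<Position, string> {
  const node = `<${uri}>`;
  return {
    subject: `SELECT ?p ?o ?g WHERE { GRAPH ?g { ${node} ?p ?o } } LIMIT ${limit}`,
    predicate: `SELECT ?s ?o ?g WHERE { GRAPH ?g { ?s ${node} ?o } } LIMIT ${limit}`,
    object: `SELECT ?s ?p ?g WHERE { GRAPH ?g { ?s ?p ${node} } } LIMIT ${limit}`,
  };
}
```

Each of the three queries can then be sent independently, so a URI that appears in many statements as an object does not crowd out the statements where it is the subject.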

Add admin page

With lock icon in navbar and password access

Allow to change:

SPARQL endpoint URL
SPARQL query for full text search
Prefix/namespaces defined. The user can choose between:
- directly add in the file (easy to implement)
- use a central service, such as PrefixCommons (to implement later)

Those variables are stored in the React Context API store

Fix Describe tabs

https://github.com/reactjs/react-tabs
https://reactcommunity.org/react-tabs/

Properly exchange data between components

Tutorial to use the new React Context API store: https://www.toptal.com/react/react-context-api

Most complete tutorial, with all solutions (classic, redux, context API): https://towardsdatascience.com/passing-data-between-react-components-parent-children-siblings-a64f89e24ecf

Clean tutorial without using Redux: https://medium.com/@ruthmpardee/passing-data-between-react-components-103ad82ebd17

Fix issue with describe not reloading when URI param changes at the same time

YASGUI results should redirect to our resolution service on into-the-graph

So that users can browse those URIs directly in the browser

The Yasr plugin might be interesting: https://triply.cc/docs/yasgui-api#writing-a-yasr-plugin

Store data as RDF

The data pulled could be stored as RDF, then the visualization would be built directly using rdflib

See documentation: https://zazuko.com/get-started/developers/

Describe page: allow to display described statement in a datatable

We should be able to click on a switch button to switch between the graphical representation of triples (current) and just a datatable with 4 columns: subject, predicate, object, graph

The user can then filter and sort the triples as they want
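
A minimal sketch of the data model such a table could use, with the four columns named above. The `TripleRow` type and both helpers are hypothetical and independent of any data table library.

```typescript
// One row per statement, matching the four proposed columns.
interface TripleRow {
  subject: string;
  predicate: string;
  object: string;
  graph: string;
}

type Column = keyof TripleRow;

// Keep rows where any column contains the search text (case-insensitive).
function filterRows(rows: TripleRow[], text: string): TripleRow[] {
  const needle = text.toLowerCase();
  return rows.filter((row) =>
    [row.subject, row.predicate, row.object, row.graph].some(
      (value) => value.toLowerCase().indexOf(needle) !== -1
    )
  );
}

// Return a sorted copy so the original order can be restored by the UI.
function sortRows(rows: TripleRow[], column: Column, ascending: boolean = true): TripleRow[] {
  const direction = ascending ? 1 : -1;
  return rows.slice().sort((a, b) => direction * a[column].localeCompare(b[column]));
}
```

Both functions return new arrays, which keeps them easy to use from React state without mutating the statements already fetched.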

Need to fix CORS error?

On http://ld-browser.137.120.31.101.nip.io/
CORS issues only on icons

axios/axios#853

Fix show more statements button

Subjects:
http://localhost:3000/describe?uri=http://bio2rdf.org/drugbank:DB00001

Object:
http://localhost:3000/describe?uri=http://bio2rdf.org/drugbank_resource:FASTA

Query to get all graphs not cached on GraphDB

In GraphsOverviews.js

SELECT DISTINCT ?g WHERE { GRAPH ?g {?s ?p ?o} }

Works really fast on Virtuoso, but not GraphDB

Improve display to change SPARQL endpoint

This was an easy way to fix it, but could be improved later

I was thinking about putting the current SPARQL endpoint on the right of the top navbar

Hum, and an Autocomplete directly at this place
So the top bar would be:

which URI you describe
In which SPARQL endpoint

If it is too packed, I can just put it under the URI in the current UI

And add the dropdown for graph query in graph overview?

It will be better to refactor a bit to define a specific component for our different Autocomplete

Better SPARQL: improve resolution of graphs URI

At the moment we try to get only the classes of the graph

A few issues when resolving a graph URI:

Not all classes are always retrieved on big graphs: the query needs to be optimized. This kind of query can be quite expensive to run, so we use a LIMIT at the moment
If the graph has no classes then the UI shows nothing (add a fallback query which gets any statement?)

Where to store HCLS metadata

Use HCLS metadata from the SPARQL endpoint when present

Also give the user the option to provide HCLS metadata by uploading an RDF Turtle file

https://maastrichtu-ids.github.io/into-the-graph/describe?uri=http://bio2rdf.org/clinicaltrials:NCT00209495

npm WARN deprecated [email protected]: Deprecated. Please use https://github.com/webpack-contrib/mini-css-extract-plugin
npm WARN deprecated [email protected]: Way too old
npm WARN deprecated [email protected]: core-js@<3.0 is no longer maintained and not recommended for usage due to the number of issues. Please, upgrade your dependencies to the actual version of core-js@3.
npm WARN deprecated [email protected]: cross-spawn no longer requires a build toolchain, use it instead
npm WARN deprecated [email protected]: Browserslist 2 could fail on reading Browserslist >3.0 config used in other tools.
npm WARN deprecated [email protected]: This module moved to @hapi/hawk. Please make sure to switch over as this distribution is no longer supported and may contain bugs and critical security issues.
npm WARN deprecated [email protected]: CircularJSON is in maintenance only, flatted is its successor.
npm WARN deprecated [email protected]: One of your dependencies needs to upgrade to fsevents v2: 1) Proper nodejs v10+ support 2) No more fetching binaries from AWS, smaller package size
npm WARN deprecated [email protected]: This version has been deprecated in accordance with the hapi support policy (hapi.im/support). Please upgrade to the latest version to get the best features, bug fixes, and security patches. If you are unable to upgrade at this time, paid support is available for older versions (hapi.im/commercial).
npm WARN deprecated [email protected]: This version has been deprecated in accordance with the hapi support policy (hapi.im/support). Please upgrade to the latest version to get the best features, bug fixes, and security patches. If you are unable to upgrade at this time, paid support is available for older versions (hapi.im/commercial).
npm WARN deprecated [email protected]: This module moved to @hapi/sntp. Please make sure to switch over as this distribution is no longer supported and may contain bugs and critical security issues.
npm WARN deprecated [email protected]: This version has been deprecated in accordance with the hapi support policy (hapi.im/support). Please upgrade to the latest version to get the best features, bug fixes, and security patches. If you are unable to upgrade at this time, paid support is available for older versions (hapi.im/commercial).

Check and add more SPARQL endpoints

List here: https://github.com/OpenLinkSoftware/general-turtle-doc-collection/blob/master/LODCloud_SPARQL_Endpoints.ttl

We could also parse it automatically

Biomedical SPARQL: https://yummydata.org/

Improve the Search feature

See the search query

At the moment we only do a simple search via SPARQL using contains on the label, which is still really slow for big datasets such as ncats-red-kg
A custom search query has been written for GraphDB using its built-in search index
Virtuoso and Blazegraph also allow indexing statements

We should be able to easily switch from one search query to another, through an Admin UI?
See #8
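
One way to make the search query switchable is a small registry of templates keyed by triplestore. The sketch below reuses the queries documented in the README; the registry shape, the key names and the `getSearchTemplate` helper are hypothetical.

```typescript
type Triplestore = "generic" | "graphdb" | "virtuoso";

// Search templates taken from the README; $TEXT_TO_SEARCH is the placeholder.
const SEARCH_TEMPLATES: Record<Triplestore, string> = {
  generic:
    "SELECT ?foundUri ?foundLabel WHERE {?foundUri ?p ?foundLabel . " +
    "VALUES ?p {<http://www.w3.org/2000/01/rdf-schema#label> <https://w3id.org/biolink/vocab/name>} . " +
    "FILTER(isLiteral(?foundLabel)) FILTER contains(?foundLabel, '$TEXT_TO_SEARCH')} LIMIT 5",
  graphdb:
    "PREFIX luc: <http://www.ontotext.com/owlim/lucene#> SELECT ?foundUri ?foundLabel " +
    "{ ?foundLabel luc:searchIndex '$TEXT_TO_SEARCH*' . ?foundUri ?p ?foundLabel . } LIMIT 100",
  virtuoso:
    "SELECT ?foundUri ?foundLabel WHERE {?foundUri <http://www.w3.org/2000/01/rdf-schema#label> ?foundLabel . " +
    "?foundLabel bif:contains '$TEXT_TO_SEARCH' . } LIMIT 200",
};

// Fall back to the generic query when the triplestore is not in the registry.
function getSearchTemplate(store: string): string {
  return Object.prototype.hasOwnProperty.call(SEARCH_TEMPLATES, store)
    ? SEARCH_TEMPLATES[store as Triplestore]
    : SEARCH_TEMPLATES.generic;
}
```

An admin UI would then only need to store the selected key, or a custom template string, in the settings.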

Alternative Linked Data browsers

LODEstar

SPARQL query and URI resolution, available through DockerHub.

docker run --rm -d --name lodestar -p 8082:8080 \
  -e ENDPOINT_URL=https://graphdb.dumontierlab.com/repositories/ncats-red-kg \
  -e TOP_RELATIONSHIP=http://w3id.org/biolink/vocab/id,http://w3id.org/biolink/vocab/name,http://w3id.org/biolink/vocab/description \
  -e LABEL=http://w3id.org/biolink/vocab/label \
  -e DESCRIPTION=http://w3id.org/biolink/vocab/description \
  -e MAX_OBJECTS=10 \
  -e SERVICE_BASE_URI=http://localhost:8080/ncats-red-kg netresearch/lodestar

Access at http://localhost:8082/lodestar/sparql

Does not support graphs 🚫

Trifid

Linked Data Server: URI dereferencing, custom HTML render, YASGUI SPARQL endpoint.

git clone https://github.com/vemonet/trifid.git
docker build -t trifid ./trifid

docker run --rm -ti --name trifid -p 8080:8080 trifid --sparql-endpoint-url=https://graphdb.dumontierlab.com/repositories/test --dataset-base-url=https://w3id.org/d2s/

docker run --rm -ti --name trifid -v /home/vemonet/sandbox/trifid:/data -p 8080:8080 trifid --config=/data/config-ncats-red-kg.json

Go to http://localhost:8080/dataset/huri/ to resolve https://w3id.org/d2s/dataset/huri/

Modified version on GitHub.

Original project available on DockerHub, but the config is not working.

docker run -ti -p 8080:8080 zazuko/trifid
# Not working, provide env config file?
docker run -ti -p 8080:8080 -e TRIFID_CONFIG=config-ncats-red-kg.json zazuko/trifid
docker run -ti -p 8080:8080 -e SPARQL_ENDPOINT_URL=https://graphdb.dumontierlab.com/repositories/test -e DATASET_BASE_URL=https://w3id.org/d2s/ zazuko/trifid

Access default example on http://localhost:8080/data/person/mary-cooper to resolve URI.

Does not support graphs 🚫

brwsr

Lightweight Linked Data Browser.

git clone https://github.com/Data2Semantics/brwsr.git
docker-compose up

Go to http://localhost:5000.

Change the SPARQL endpoint in the docker-compose.yml.

Does not support graphs 🚫

RhizomerEye

RhizomerEye is a tool to expose a SPARQL endpoint as REST API and deploy a Web UI to browse the triplestore.

See the source code for the RhizomerAPI and RhizomerEye.

The Web UI has been deployed publicly for a few triplestores:

TriplyDB

See the official documentation. It allows deploying the following services over a triplestore:

YASGUI SPARQL endpoint
Search index using ElasticSearch.
Web UI to resolve and browse a triplestore
Supports graphs in the TPF table browser (not in the browser)

TriplyDB is hosted centrally and cannot be deployed locally 🚫

into-the-graph: add more overview tables for HCLS RDF stats

Add more tables to describe a dataset:

Dataset-Property-Dataset List
Property and Object Type
Same Type-Property-Type List as theirs (with unique and not unique)

Propose tables similar to what is proposed in Bio2RDF: http://download.bio2rdf.org/files/release/3/affymetrix/affymetrix.html

Deploying as HTTPS prevents querying HTTP endpoints

Using YASGUI widget to query http://dbpedia.org/sparql:

You are trying to query an HTTP endpoint (http://dbpedia.org/sparql) from an HTTPS website (https://trek.semanticscience.org/sparql).
This is not allowed in modern browsers, see https://developer.mozilla.org/en-US/docs/Web/Security
