clariah / grlc

grlc builds Web APIs using shared SPARQL queries

License: MIT License

Python 80.23% HTML 5.57% Shell 3.73% CSS 1.92% JavaScript 2.56% Dockerfile 1.03% TeX 4.96%

swagger-ui sparql linked-data semantic-web api linked-data-api open-api

grlc's Introduction

grlc, the git repository linked data API constructor, automatically builds Web APIs using shared SPARQL queries. http://grlc.io/

If you use grlc in your work, please cite it as:

@InProceedings{merono2016grlc,
 author = {Mero{\~{n}}o-Pe{\~{n}}uela, Albert and Hoekstra, Rinke},
 title = {{grlc Makes GitHub Taste Like Linked Data APIs}},
 booktitle = {The Semantic Web: ESWC 2016 Satellite Events, Heraklion, Crete, Greece, May 29 -- June 2,  2016},
 year = {2016},
 publisher = {Springer},
 pages = {342--353},
 isbn = {978-3-319-47602-5},
 doi = {10.1007/978-3-319-47602-5_48}
}

What is grlc?

grlc is a lightweight server that takes SPARQL queries (stored in a GitHub or GitLab repository, in your local filesystem, or listed in a URL), and translates them to Linked Data Web APIs. This enables universal access to Linked Data. Users are not required to know SPARQL to query their data, but instead can access a web API.

Quick tutorial

For a quick usage tutorial check out our wiki walkthrough and list of features.

Usage

grlc assumes that you have a collection of SPARQL queries as .rq files (like this). grlc will create one API operation for each SPARQL query/.rq file in the collection.

Your queries can add API parameters to each operation by using the parameter mapping syntax. This allows your query to define query variables which will be mapped to API parameters for your API operation (see here for an example).

Your queries can include special decorators to add extra functionality to your API.

Query location

grlc can load your query collection from different locations: from a GitHub repository (api-git), from a GitLab repository (api-gitlab), from local storage (api-local), and from a specification file (api-url). Each type of location has specific features and is accessible via a different path. However, all location types produce the same beautiful APIs.

From a GitHub repository

API path: http://grlc-server/api-git/<user>/<repo>

grlc can build an API from any GitHub repository, specified by the GitHub user name of the owner (<user>) and repository name (<repo>).

For example, assuming your queries are stored in the GitHub repo https://github.com/CLARIAH/grlc-queries/, point your browser to http://grlc.io/api-git/CLARIAH/grlc-queries/

grlc can make use of git's version control mechanism to generate an API based on a specific version of queries in the repository. This can be done by including the commit sha in the URL path (http://grlc-server/api-git/<user>/<repo>/commit/<sha>), for example: http://grlc.io/api-git/CLARIAH/grlc-queries/commit/79ceef2ee814a12e2ec572ffaa2f8212a22bae23

grlc can also use a subdirectory inside your GitHub repo. This can be done by including a subdirectory in the URL path (http://grlc-server/api-git/<user>/<repo>/subdir/<subdir>).
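The URL patterns above can also be assembled programmatically. The following is a minimal sketch: `github_api_path` is a hypothetical helper (not part of grlc), and `http://grlc-server` is the placeholder host used throughout this README. Combining a commit and a subdirectory is not described above, so the sketch refuses that case rather than guessing.

```python
from typing import Optional


def github_api_path(user: str, repo: str,
                    sha: Optional[str] = None,
                    subdir: Optional[str] = None,
                    server: str = "http://grlc-server") -> str:
    """Build an api-git path following the patterns described above.

    Hypothetical helper for illustration; it only concatenates strings.
    """
    if sha is not None and subdir is not None:
        # The README does not document this combination.
        raise ValueError("pass either a commit sha or a subdir, not both")
    path = f"{server}/api-git/{user}/{repo}"
    if sha is not None:
        return f"{path}/commit/{sha}"
    if subdir is not None:
        return f"{path}/subdir/{subdir}"
    return path


print(github_api_path("CLARIAH", "grlc-queries", server="http://grlc.io"))
```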

From a GitLab repository

API path: http://grlc-server/api-gitlab/<user>/<repo>

grlc can build an API from any GitLab repository, specified by the GitLab user name of the owner (<user>) and repository name (<repo>).

For example, assuming your queries are stored on a GitLab repo: https://gitlab.com/c-martinez/grlc-queries, point your browser to the following location http://grlc.io/api-gitlab/c-martinez/grlc-queries/

grlc can make use of git's version control mechanism to generate an API based on a specific version of queries in the repository. This can be done by including the name of a branch in the URL path (http://grlc-server/api-gitlab/<user>/<repo>/branch/<branch>), for example: http://grlc.io/api-gitlab/c-martinez/grlc-queries/branch/master

grlc can also use a subdirectory inside your GitLab repo. This can be done by including a subdirectory in the URL path (http://grlc-server/api-gitlab/<user>/<repo>/subdir/<subdir>), for example: http://grlc-server/api-gitlab/c-martinez/grlc-queries/subdir/subdir.

From local storage

API path: http://grlc-server/api-local/

grlc can generate an API from a local directory in the computer where your grlc server runs. You can configure the location of this directory in your grlc server configuration file. See also how to install and run your own grlc instance.

When the API is generated from a local directory, API information can be loaded from a configuration file in that folder. This file must be called local-api-config.ini and it has the following format:

[repo_info]
repo_title = Some title
api_description = Description of my API
contact_name = My name
contact_url = https://mypage/
licence_url = https://mylicence/
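Before deploying, it can be handy to check that the file parses as expected. The sketch below uses Python's standard `configparser` on the sample shown above; `read_repo_info` is a hypothetical helper and does not claim to mirror how grlc itself reads the file.

```python
import configparser

# Same content as the sample local-api-config.ini above.
SAMPLE = """\
[repo_info]
repo_title = Some title
api_description = Description of my API
contact_name = My name
contact_url = https://mypage/
licence_url = https://mylicence/
"""


def read_repo_info(text: str) -> dict:
    """Parse the [repo_info] section into a plain dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["repo_info"])


info = read_repo_info(SAMPLE)
print(info["repo_title"], "|", info["contact_url"])
```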

From a specification file

API path: http://grlc-server/api-url/?specUrl=<specUrl>

grlc can generate an API from a yaml specification file accessible on the web.

For example, assuming your queries are listed in the spec file https://raw.githubusercontent.com/CLARIAH/grlc-queries/master/urls.yml, point your browser to http://grlc.io/api-url?specUrl=https://raw.githubusercontent.com/CLARIAH/grlc-queries/master/urls.yml
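When building this path from code, the spec URL can be percent-encoded so that any characters it contains survive as a single `specUrl` value. This is a minimal sketch: `spec_api_path` is a hypothetical helper, and it assumes the server applies ordinary query-string decoding (the example above passes the URL unencoded).

```python
from urllib.parse import urlencode


def spec_api_path(spec_url: str, server: str = "http://grlc-server") -> str:
    """Build an api-url path with the spec URL as an encoded query parameter."""
    return f"{server}/api-url/?{urlencode({'specUrl': spec_url})}"


print(spec_api_path(
    "https://raw.githubusercontent.com/CLARIAH/grlc-queries/master/urls.yml",
    server="http://grlc.io",
))
```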

Specification file syntax

A grlc API specification file is a YAML file which includes the necessary information to create a grlc API, most importantly a list of URLs to decorated and HTTP-dereferenceable SPARQL queries. This file should contain the following fields

title: Title of my API
description: API description
contact: Contact details of the API owner. This should include the name and url properties.
licence: A URL pointing to the licence file for the API.
queries: A list of URLs of SPARQL queries (with header decorators). Alternatively a query can be defined as a dictionary with a name and a url.

For example:

title: Title of my API
description: Description of my API
contact:
  name: Contact Name
  url: https://www.mywebsite.org
licence: http://example.org/licence.html
queries:
  - https://www.mywebsite.org/query1.rq
  - https://www.mywebsite.org/query2.rq
  - https://www.otherwebsite.org/query3.rq
  - name: QueryFour
    url: https://www.mywebsite.org/query4.rq
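Because a query entry can be either a plain URL or a dictionary with a name and a url, code that consumes a specification file has to handle both shapes. The sketch below works on a dictionary as it might look after loading the YAML with a parser of your choice; `normalise_queries` and the strict field check are illustrative assumptions, not grlc's own validation.

```python
REQUIRED_FIELDS = ("title", "description", "contact", "licence", "queries")


def normalise_queries(spec: dict) -> list:
    """Return (name, url) pairs for every entry in spec['queries'].

    Plain URL entries get the name None; dictionary entries must carry
    both 'name' and 'url', as described above.
    """
    missing = [field for field in REQUIRED_FIELDS if field not in spec]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    pairs = []
    for entry in spec["queries"]:
        if isinstance(entry, str):
            pairs.append((None, entry))
        elif isinstance(entry, dict) and {"name", "url"} <= entry.keys():
            pairs.append((entry["name"], entry["url"]))
        else:
            raise ValueError(f"unsupported query entry: {entry!r}")
    return pairs


# The example specification above, written as a Python dictionary.
spec = {
    "title": "Title of my API",
    "description": "Description of my API",
    "contact": {"name": "Contact Name", "url": "https://www.mywebsite.org"},
    "licence": "http://example.org/licence.html",
    "queries": [
        "https://www.mywebsite.org/query1.rq",
        {"name": "QueryFour", "url": "https://www.mywebsite.org/query4.rq"},
    ],
}
for name, url in normalise_queries(spec):
    print(name, url)
```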

grlc generated API

The API paths of all location types point to the generated swagger-ui style API documentation. On the API documentation page, you can explore available API calls and execute individual API calls.

You can also view the swagger spec of your API, by visiting <API-path>/swagger, for example: http://grlc.io/api-git/CLARIAH/grlc-queries/swagger
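Once downloaded, the swagger spec is ordinary JSON and can be inspected from a script. The sketch below assumes the usual Swagger layout with a top-level `paths` object keyed by path and then by HTTP method; `list_operations` is a hypothetical helper and the sample dictionary is made up for illustration. Fetching the document (for example with `urllib.request`) is left out.

```python
HTTP_METHODS = {"get", "post", "put", "delete", "patch", "head", "options"}


def list_operations(spec: dict) -> list:
    """Return sorted (METHOD, path) pairs found under the spec's 'paths' key."""
    found = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Skip non-method keys such as a shared 'parameters' list.
            if key.lower() in HTTP_METHODS:
                found.append((key.upper(), path))
    return sorted(found)


# Made-up fragment standing in for a downloaded spec.
sample_spec = {
    "paths": {
        "/defaults": {"get": {"summary": "..."}},
        "/enumerate": {"get": {"summary": "..."}, "parameters": []},
    }
}
print(list_operations(sample_spec))
```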

grlc query execution

When you call an API endpoint, grlc executes the SPARQL query for that endpoint by combining supplied parameters and decorators.

There are 4 options to specify your own endpoint:

Add a sparql_endpoint to your config.ini
Add an endpoint parameter to your request: 'http://grlc.io/user/repo/query?endpoint=http://sparql-endpoint/'. You can add a #+ endpoint_in_url: False decorator if you DO NOT want to see the endpoint parameter in the swagger-ui of your API.
Add the #+ endpoint: decorator.
Add the URL of the endpoint on a single line in an endpoint.txt file within the GitHub repository that contains the queries.

The endpoint call will return the result of executing the query as a json representation of rdflib.query.QueryResult (for other result formats, you can use content negotiation via HTTP Accept headers). For json responses, the schema of the response can be modified by using the #+ transform: decorator.
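Content negotiation and the endpoint parameter can both be set when preparing the request. The sketch below only constructs a `urllib.request.Request` and does not send it; `build_request` is a hypothetical helper, the URL is the placeholder used above, and whether a given media type such as `text/csv` is available depends on the SPARQL endpoint behind the API.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(operation_url: str, accept: str = "application/json",
                  **params: str) -> Request:
    """Prepare a GET request with an Accept header and optional parameters."""
    query = f"?{urlencode(params)}" if params else ""
    return Request(operation_url + query, headers={"Accept": accept})


req = build_request(
    "http://grlc.io/user/repo/query",  # placeholder URL from the text above
    accept="text/csv",
    endpoint="http://sparql-endpoint/",
)
print(req.full_url)
print(req.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would then perform the call.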

Decorator syntax

Special decorators are available to make your swagger-ui look nicer and to increase functionality. These are provided as comments at the start of your query file, making it still syntactically valid SPARQL. All decorators start with #+ , for example:

#+ decorator_1: decorator value
#+ decorator_2: decorator value

SELECT * WHERE {
  ?s ?p ?o .
}
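Since decorators are ordinary SPARQL comments with a `#+` marker, they can be separated from the query with a few lines of code. This is a minimal sketch, not grlc's own parser: `split_decorators` is a hypothetical helper that returns the decorator text (which looks like YAML and could be handed to a YAML parser) and the remaining query.

```python
def split_decorators(query_text: str):
    """Separate '#+' decorator lines from the rest of a query file."""
    decorator_lines, query_lines = [], []
    for line in query_text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("#+"):
            # Drop the marker and at most one following space, keeping any
            # further indentation so nested list items stay nested.
            body = stripped[2:]
            decorator_lines.append(body[1:] if body.startswith(" ") else body)
        else:
            query_lines.append(line)
    return "\n".join(decorator_lines), "\n".join(query_lines).strip()


example = """#+ summary: This is the summary of my query/operation
#+ tags:
#+   - firstTag

SELECT * WHERE {
  ?s ?p ?o .
}
"""
decorators, query = split_decorators(example)
print(decorators)
print(query)
```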

The following is a list of available decorators and their function:

`summary`

Creates a summary of your query/operation. This is shown next to your operation name in the swagger-ui.

Syntax:

#+ summary: This is the summary of my query/operation

Example query and the equivalent API operation.

`description`

Creates a description of your query/operation. This is shown as the description of your operation in the swagger-ui.

Syntax:

#+ description: Extended description of my query/operation.

Example query and the equivalent API operation.

`endpoint`

Specifies a query-specific endpoint.

Syntax:

#+ endpoint: http://example.com/sparql

Example query and the equivalent API operation.

`pagination`

Paginates the results in groups of (for example) 100. Links to the previous, next, first, and last result pages are provided as HTTP response headers to avoid polluting the payload (see details here).

Syntax:

#+ pagination: 100

Example query and the equivalent API operation.
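A client can follow these page links by reading the response headers. The sketch below assumes the links arrive in a single `Link` header in the common `<url>; rel="name"` form (the style used by the GitHub API); that header name, the `page` parameter in the sample, and the `parse_link_header` helper are assumptions for illustration.

```python
import re


def parse_link_header(value: str) -> dict:
    """Map each rel name (for example 'next') to its URL."""
    links = {}
    for url, rel in re.findall(r'<([^>]*)>\s*;\s*rel="?([^",;]+)"?', value):
        links[rel.strip()] = url
    return links


# Made-up header value; real values come from the HTTP response.
sample = (
    '<http://grlc-server/api-git/user/repo/query?page=2>; rel="next", '
    '<http://grlc-server/api-git/user/repo/query?page=5>; rel="last"'
)
print(parse_link_header(sample))
```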

`method`

Indicates the HTTP request method (GET and POST are supported).

Syntax:

#+ method: GET

Example query and the equivalent API operation.

`tags`

Assign tags to your query/operation. Query/operations with the same tag are grouped together in the swagger-ui.

Syntax:

#+ tags:
#+   - firstTag
#+   - secondTag

Example query and the equivalent API operation.

`defaults`

Set the default value in the swagger-ui for a specific parameter in the query.

Syntax:

#+ defaults:
#+   - param_name: default_value

Example query and the equivalent API operation.

`enumerate`

Indicates which parameters of your query/operation should get enumerations (and get dropdown menus in the swagger-ui) using the given values from the SPARQL endpoint. The values for each enumeration variable can also be specified into the query decorators to save endpoint requests and speed up the API generation.

Syntax:

#+ enumerate:
#+   - var1:
#+     - value1
#+     - value2

Example query and the equivalent API operation.

Notice that these should be plain variable names without SPARQL/BASIL conventions (so var1 instead of ?_var1_iri)

`endpoint_in_url`

Allows/disallows the endpoint parameter from being provided as a URL parameter (allowed by default).

Syntax:

#+ endpoint_in_url: False

Example query and the equivalent API operation.

`transform`

Allows query results to be converted to the specified JSON structure, by using SPARQLTransformer syntax. Notice that the response content type must be set to application/json for the transformation to take effect.

Syntax:

#+ transform: {
#+     "key": "?p",
#+     "value": "?o",
#+     "$anchor": "key"
#+   }

Example query and the equivalent API operation.

`endpoint-method`

Allows the query to be sent from the grlc server to the SPARQL endpoint using either the GET or the POST HTTP method. (Default: POST)

Syntax:

#+ endpoint-method: GET

Example query and the equivalent API operation.

Example APIs

Check these out:

You'll find the sources of these and many more in GitHub

Use this GitHub search to see examples from other grlc users.

Install and run

You can use grlc in different ways:

Via grlc.io: you can use the grlc.io service
Via Docker: you can use the grlc docker image and start your own grlc server.
Via pip: you can install the grlc Python package and start your own grlc server or use grlc as a Python library.

More details for each of these options are given below.

grlc.io

The easiest way to use grlc is by visiting grlc.io and using this service to convert SPARQL queries into a RESTful API. Your queries can be stored on a github repo or can be listed on a specification file.

Docker

To run grlc via docker, you'll need a working installation of docker. To deploy grlc, just pull the latest image from Docker Hub:

docker run -it --rm -p 8088:80 clariah/grlc

The docker image allows you to set up several environment variables, such as GRLC_SERVER_NAME, GRLC_GITHUB_ACCESS_TOKEN, GRLC_GITLAB_ACCESS_TOKEN and GRLC_SPARQL_ENDPOINT:

docker run -it --rm -p 8088:80 -e GRLC_SERVER_NAME=grlc.io -e GRLC_GITHUB_ACCESS_TOKEN=xxx -e GRLC_GITLAB_ACCESS_TOKEN=yyy -e GRLC_SPARQL_ENDPOINT=http://dbpedia.org/sparql -e DEBUG=true clariah/grlc

Pip

If you want to run grlc locally or use it as a library, you can install grlc on your machine. grlc is registered on PyPI, so you can install it using pip.

Prerequisites

grlc has the following requirements:

Python3
development files (depending on your OS):

sudo apt-get install libevent-dev python-all-dev

pip install

Once the base requirements are satisfied, you can install grlc like this:

pip install grlc

Once grlc is installed, you have several options:

Standalone server
Using a WSGI server
As a python library

Standalone server

grlc includes a command line tool which you can use to start your own grlc server:

grlc-server

Using a WSGI server

You can run grlc using a WSGI server such as gunicorn as follows:

gunicorn grlc.server:app

If you want to use your own gunicorn configuration, for example gunicorn_config.py:

workers = 5
worker_class = 'gevent'
bind = '0.0.0.0:8088'

Then you can run it as:

gunicorn -c gunicorn_config.py grlc.server:app

Note: Since gunicorn does not work under Windows, you can use waitress instead:

waitress-serve --port=8088 grlc.server:app

If you want to run grlc at system boot as a service, you can find example upstart scripts at upstart/

grlc library

You can use grlc as a library directly from your own python script. See the usage example to find out more.

grlc server configuration

Regardless of how you are running your grlc server, you will need to configure it using the config.ini file. Have a look at the example config file to see how this file is structured.

The configuration file contains the following variables:

github_access_token: access token to communicate with the GitHub API.
gitlab_access_token: access token to communicate with the GitLab API.
local_sparql_dir: local storage directory where local queries are located.
server_name: name of the server (e.g. grlc.io).
sparql_endpoint: default SPARQL endpoint.
user and password: default authentication for the SPARQL endpoint (specify 'none' if not required).
debug: enable debug-level logging.
gitlab_url: base URL of your GitLab instance.

Git access token

In order for grlc to communicate with GitHub and/or GitLab, you'll need to tell grlc what your access token is:

Get a GitHub personal access token or GitLab personal access token.
You'll get an access token string, copy it and save it somewhere safe.
Edit your config.ini (github_access_token and gitlab_access_token respectively) and/or docker-compose.yml (GRLC_GITHUB_ACCESS_TOKEN and GRLC_GITLAB_ACCESS_TOKEN environment variables).
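A missing or empty token tends to surface only later as a failed request, so it can help to check it up front. The sketch below reads the token from config text and builds the request header; the `[auth]` section name and the `token <value>` header scheme are taken from an issue report and traceback for an older grlc version quoted further down this page, so treat both as assumptions, and note that `github_auth_headers` is a hypothetical helper.

```python
import configparser


def github_auth_headers(config_text: str) -> dict:
    """Return an Authorization header built from github_access_token."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Assumed location of the option: the [auth] section.
    token = parser.get("auth", "github_access_token", fallback="").strip()
    if not token:
        raise ValueError("github_access_token is empty; set it in config.ini")
    return {"Authorization": f"token {token}"}


print(github_auth_headers("[auth]\ngithub_access_token = abc123\n"))
```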

Contribute!

grlc needs you to continue bringing Semantic Web content to developers, applications and users. No matter if you are just a curious user, a developer, or a researcher; there are many ways in which you can contribute:

File bug reports
Request new features
Set up your own environment and start hacking

Check our contributing guidelines for these and more, and join us today!

If you cannot code, that's no problem! There's still plenty you can contribute:

Share your experience of using grlc on Twitter (mention the handle @grlcldapi)
If you are good with HTML/CSS, let us know

Related tools

SPARQL2Git is a Web interface for editing SPARQL queries and saving them in GitHub as grlc APIs.
grlcR is a package for R that brings Linked Data into your R environment easily through grlc.
Hay's tools lists grlc as a Wikimedia-related tool :-)

This is what grlc users are saying

Flavour your Linked Data with grlc, by Carlos Martinez
Converting any SPARQL endpoint to an OpenAPI by Egon Willighagen

Quotes from grlc users:

A cool project that can convert a random SPARQL endpoint into an OpenAPI endpoint

It enables us to quickly integrate any new API requirements in a matter of seconds, without having to worry about configuration or deployment of the system

You can store your SPARQL queries on GitHub and then you can run your queries on your favourite programming language (Python, Javascript, etc.) using a Web API (including swagger documentation) just as easily as loading data from a web page

Contributors: Albert Meroño, Rinke Hoekstra, Carlos Martínez

Copyright: Albert Meroño, Rinke Hoekstra, Carlos Martínez
License: MIT License (see LICENSE.txt)

Academic publications

Albert Meroño-Peñuela, Carlos Martinez-Ortiz. “grlc: the git repository linked data API constructor”. Journal of Open Source Software, 6(67), 2731 (2021), https://doi.org/10.21105/joss.02731
Albert Meroño-Peñuela, Pasquale Lisena, Carlos Martínez-Ortiz. “Web Data APIs for Knowledge Graphs: Easing Access to Semantic Data for Application Developers”. Synthesis Lectures on Data, Semantics, and Knowledge, 12(1), pp.1-118 (2021) (Morgan & Claypool) https://doi.org/10.2200/S01114ED1V01Y202107DSK021
Albert Meroño-Peñuela, Rinke Hoekstra. “grlc Makes GitHub Taste Like Linked Data APIs”. The Semantic Web – ESWC 2016 Satellite Events, Heraklion, Crete, Greece, May 29 – June 2, 2016, Revised Selected Papers. LNCS 9989, pp. 342-353 (2016). (PDF)
Albert Meroño-Peñuela, Rinke Hoekstra. “SPARQL2Git: Transparent SPARQL and Linked Data API Curation via Git”. In: Proceedings of the 14th Extended Semantic Web Conference (ESWC 2017), Poster and Demo Track. Portoroz, Slovenia, May 28th – June 1st, 2017 (2017). (PDF)
Albert Meroño-Peñuela, Rinke Hoekstra. “Automatic Query-centric API for Routine Access to Linked Data”. In: The Semantic Web – ISWC 2017, 16th International Semantic Web Conference. Lecture Notes in Computer Science, vol 10587, pp. 334-339 (2017). (PDF)
Pasquale Lisena, Albert Meroño-Peñuela, Tobias Kuhn, Raphaël Troncy. “Easy Web API Development with SPARQL Transformer”. In: The Semantic Web – ISWC 2019, 18th International Semantic Web Conference. Lecture Notes in Computer Science, vol 11779, pp. 454-470 (2019). (PDF)

grlc's Issues

GitHub OAuth support

Users should be able to "log in" using their GitHub OAuth credentials.
This could be used to guide them through the UI in selecting the repos containing the SPARQL queries (instead of the current method of "manually" editing the browser URL).

Simpler enumeration filling

In the get_enumeration function, you're matching against the triple pattern (s) that contain the variable. Two things go ‘wrong’ here: 1) what if the variable occurs in multiple places? and 2) you are very likely to start iterating over waaaaaay too many options (e.g. iterating over one of the positions in ?s ?p ?o). It makes much more sense to just pose the query as-is, and then retrieve all possible values for the variable of interest.

(thanks @RinkeHoekstra !)

Namespace prefixes in parametrized queries

you seem to have to fill in complete IRIs; maybe this can be solved
by having good preflabels where you can just enter the string,
alternatively it would be nice if you could at least use the prefix;
even better would be if you got an option list
- The last is not possible
- We could add a link to prefix.cc to do an educated guess as to what namespace a prefix would expand to.
- AR: Alternatively, could the query file itself contain information on the prefix, much like there are tags in the comments using #+ ? For example: ?dataset sdmx-dimensions:refArea ?_country in the actual query and #? ?_country: <http://.../> in the comments? Most query-writers wouldn't bother, but we could make use of it to make sure some of the more frequently-used query patterns are easy to adapt.

Generic, on-demand processing of query metadata

Currently we're "statically" processing query metadata (e.g. endpoint, method, tags, etc.). An option would be to pass a list through the parameter extraMetadata of build_spec() with the parameters that we want to be processed.

index.html should follow README.md

They should stay in sync

sort parameters in dropdown list

see for example the values of the parameter municipalities

fetch queries that lack summary but do have tags

https://github.com/CLARIAH/wp4-queries/blob/9b3b6dee9e5281dbd72332d6e910fdb79077ccd0/statusbybirtyear_mult.rq would not load while https://github.com/CLARIAH/wp4-queries/blob/3a6bacaca57f6f2c69644546e4a46fdd8622bddc/statusbybirtyear_mult.rq would, even though queries with neither summary nor tags are fetched

Use SPARQLWrapper ?

Would it perhaps be an idea to use SPARQLWrapper ? It already has support for handling different formats (e.g. JSON, XML, TURTLE, etc.)

AttributeError: 'module' object has no attribute 'ACCESS_TOKEN'

Hi all,

My GRLC suddenly stopped working with the stack trace below in the GRLC logs. It requires an ACCESS_TOKEN and I've located the config.ini file that contains the following content:

[auth]
github_access_token =

Apparently, I need to give a github_access_token, but I do not know how to get one. I also wonder why public access requires an Access Token on github.

Regards, Barry

[2017-06-09 13:09:45 +0000] [177] [INFO] Booting worker with pid: 177
2017-06-09 13:39:04,773 [INFO] (server.swagger_spec) -----> Generating swagger spec for ... on commit None
2017-06-09 13:39:06,349 [DEBUG] (prov.init_prov_graph) Ingesting Git2PROV output into RDF graph
2017-06-09 13:39:06,441 [ERROR] (app.log_exception) Exception on /api/.../.../spec [GET]
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/grlc/grlc/src/server.py", line 190, in swagger_spec
    swag = utils.build_swagger_spec(user, repo, sha, static.SERVER_NAME, prov_g)
  File "/home/grlc/grlc/src/utils.py", line 242, in build_swagger_spec
    resp = requests.get(api_repo_uri, headers={'Authorization': 'token {}'.format(static.ACCESS_TOKEN)}).json()
AttributeError: 'module' object has no attribute 'ACCESS_TOKEN'

Enable content negotiation via URL

So that this:

http://localhost:8088/CEDAR-project/Queries/residenceStatus_all.csv

Calls the operation 'residenceStatus' using the HTTP header "Accept: text/csv"
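One way to picture the requested behaviour: strip a known file extension from the path and translate it into the matching Accept header. This is only a sketch of the idea in this issue, not grlc code; `split_extension` and the extension table are hypothetical.

```python
import posixpath

# Hypothetical extension-to-media-type table; extend as needed.
MEDIA_TYPES = {
    ".csv": "text/csv",
    ".json": "application/json",
    ".ttl": "text/turtle",
}


def split_extension(path: str):
    """Return (operation_path, accept_header), or (path, None) if unknown."""
    base, ext = posixpath.splitext(path)
    media_type = MEDIA_TYPES.get(ext.lower())
    if media_type is None:
        return path, None
    return base, media_type


print(split_extension("/user/repo/query.csv"))
```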

Give options on values for the parameters

Currently users have no idea on what options they have about possible values for query parameters.

Support for hash-ed URIs pointing at specific commits of SPARQL queries

Nice to have/consider: would be good to be able to refer to a specific SPARQL query version (via a commit hash).

This touches upon the issue of how to sync SPARQL query versions with API versions.

Cache won't refresh when the GitHub repo is updated

Pagination

As in the GitHub API, see https://developer.github.com/guides/traversing-with-pagination/

Query parsing is odd when OPTIONAL contains parameters

The OPTIONAL clause is not parsed correctly by the parameter matchers (seems to be just discarded from the query included in the spec)

Fix bug on /api/user/repo (no trailing slash) route

Unable to create a working repository

I've created a repository for request here
https://github.com/moissinac/sparqlQueries
then I point to
http://grlc.io/api/moissinac/sparqlQueries
As you can see, it produces an error

The only one request in my repo is a copy of one in your repo (https://github.com/CLARIAH/wp4-queries)
Can you help me?

Refactor call name function in src/server.py

Make max-age cache HTTP header customizable

Anchors to specific operations?

Can anchors be used in swagger-ui to make a URL point to a specific operation?

problems with basic grlc setting

Hello,

First of all, thanks for developing this application which sounds really handy.
I am trying to work with it but have some problems, on both public and private instances of grlc.

grlc public instance:

My test repository contains 2 queries, one custom and the other from the grlc tutorial.

The API seems to work but does not show all queries:
http://grlc.io/api/ehrmama/garzoni-queries

One possible source of the problem: grlc does not seem to catch the last commit on GitHub (API-VERSION refers to an old commit). But even though the old commit contains 2 queries, just 1 is shown.

grlc local instance:

I followed the installation instruction with docker. The instance is running (i.e. I can see the home page), but the api/ehrmann/garzoni-query page shows this error:


500 : <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> <title>500 Internal Server Error</title> <h1>Internal Server Error</h1> <p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p> http://128.178.60.48:8001/api/ehrmama/garzoni-queries/spec

And the running instance outputs this message:

grlc_1 | 2017-07-10 10:29:54,552 [INFO] (server.swagger_spec) -----> Generating swagger spec for /ehrmama/garzoni-queries on commit None
grlc_1 | 2017-07-10 10:29:55,545 [DEBUG] (prov.init_prov_graph) Ingesting Git2PROV output into RDF graph
grlc_1 | 2017-07-10 10:29:55,634 [INFO] (connectionpool._new_conn) Starting new HTTPS connection (1): api.github.com
grlc_1 | 2017-07-10 10:29:56,091 [DEBUG] (connectionpool._make_request) "GET /repos/ehrmama/garzoni-queries HTTP/1.1" 401 83
grlc_1 | 2017-07-10 10:29:56,097 [INFO] (connectionpool._new_conn) Starting new HTTPS connection (1): api.github.com
grlc_1 | 2017-07-10 10:29:56,494 [DEBUG] (connectionpool._make_request) "GET /repos/ehrmama/garzoni-queries/commits HTTP/1.1" 401 83
grlc_1 | 2017-07-10 10:29:56,498 [ERROR] (app.log_exception) Exception on /api/ehrmama/garzoni-queries/spec [GET]
grlc_1 | Traceback (most recent call last):
grlc_1 |   File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1817, in wsgi_app
grlc_1 |     response = self.full_dispatch_request()
grlc_1 |   File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1477, in full_dispatch_request
grlc_1 |     rv = self.handle_user_exception(e)
grlc_1 |   File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1381, in handle_user_exception
grlc_1 |     reraise(exc_type, exc_value, tb)
grlc_1 |   File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1475, in full_dispatch_request
grlc_1 |     rv = self.dispatch_request()
grlc_1 |   File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1461, in dispatch_request
grlc_1 |     return self.view_functions[rule.endpoint](**req.view_args)
grlc_1 |   File "/home/grlc/grlc/src/server.py", line 211, in swagger_spec
grlc_1 |     swag = utils.build_swagger_spec(user, repo, sha, static.SERVER_NAME, prov_g)
grlc_1 |   File "/home/grlc/grlc/src/utils.py", line 248, in build_swagger_spec
grlc_1 |     commit_list = [c['sha'] for c in commits]
grlc_1 | TypeError: string indices must be integers

The on commit None on the first line seems a bit strange.

I tried with another repository, but get the exact same error.

Ideally I have quite some queries which I would like to serve from this local instance.

Thanks a lot for your feedback.

grlc appears to have a memory for broken files

Hi, when making a mistake in an .rq file, grlc won't load your repo, exiting with error:

500 : <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"> <title>500 Internal Server Error</title> <h1>Internal Server Error</h1> <p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p> http://grlc.io/api/rlzijdeman/uncertainty-sparql/spec

That's all fine, but it appears that if you fix the mistake, grlc still won't load the repo. This might also be related to #61. When I tried to fix it, at some point I decided to go with a no-fail option: namely by pasting the code from a working .rq file into the .rq file at hand. The code is from [1]

In the forked repo containing the code from dbpedia_test.rq [2] grlc fails [3]

However in a completely new repo [4] the pasted code works for grlc [5]

I'm experiencing this with other repo's as well, e.g. [6]

@albertmeronyo can you think of what would create such a 'memory'?

Make compatible with Python 3

The python package grlc currently only works in Python 2. It would be nice to also make it compatible with Python 3.

queries won't load

Hi, I'm using the Clariah WP-4 queries [1], however my query doesn't show in grlc (while it is showing up in [1]). The three queries I have been experimenting with are:

hisco_hsn_1.rq
hisco_hsn_q.rq
hisco_hsn_q2.rq

hisco_hsn_1.rq is a copy of one of the canfam files and IS showing in grlc.
hisco_hsn_q and q2 contain queries I made myself that are working in yasgui [2]. The difference between q and q2 is the explicit specification of the endpoint. However, neither of them is showing up in grlc (although they are available in [1]).

Do you have any clue why this is the case?

[1] https://github.com/CLARIAH/wp4-queries
[2] http://yasgui.org/short/4yK2TTf6e

Decorators for request types (GET, PUT, etc.)

Adequate processing of optional parameters

When a parameter is optional, grlc should:

If it has a value, rewrite query to replace variable name with value
If it has no value, don't rewrite anything
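The two cases can be sketched as a small rewrite function. This is an illustration of the behaviour requested in this issue, not grlc's implementation: `bind_iri_parameter` is a hypothetical helper that only handles the `?_name_iri` form mentioned elsewhere on this page, and it performs no validation of the supplied IRI, which real code would need.

```python
import re
from typing import Optional


def bind_iri_parameter(query: str, name: str, value: Optional[str]) -> str:
    """Replace ?_<name>_iri with <value> when a value is given."""
    if value is None:
        # No value supplied: leave the variable unbound.
        return query
    pattern = re.compile(r"\?_" + re.escape(name) + r"_iri\b")
    # A function replacement avoids backslash handling in the value.
    return pattern.sub(lambda _match: f"<{value}>", query)


query = "SELECT ?o WHERE { ?_subject_iri ?p ?o . }"
print(bind_iri_parameter(query, "subject", "http://example.org/a"))
print(bind_iri_parameter(query, "subject", None))
```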

prefix.txt file with common prefixes to all queries in the repo

Option to select 'all values' for parameters

So users can query for all years or all municipalities

Auto-update API spec cache when a repo is pushed via webhooks

grlc could re-generate the API spec in the background, and update it in the cache, as soon as a user pushes to their SPARQL query repo and the POST HTTP request is sent to the grlc server

Parsing errors with syntactically correct queries

For instance in https://github.com/RinkeHoekstra/semanticweb-sparql-queries, for 14_describe.rq and 22_sameAs_dbpedia.rq and 23_sameAs_lodcache.rq

Add examples to index landing page

Logger cleanup

Refresh cache button

Forces the spec to be re-generated

Parameter enumerations don't work if not in object position

Enable the pattern matching for parameter enumerations to recognize the triple patterns when the parameter comes in subject or predicate position

Not able to access the instance

Hi, I think I'm missing something during the setup and am not able to access the grlc instance, I've tried the Docker way as well as the Python setup.

Authenticate grlc requests to GitHub API

Docker image at hub.docker.com

Allow to leave variable unbound in parameterised queries (with dropdown)

In some cases, it is inconvenient to have to select a value for a ?_varname_iri parameter: it is better to leave it unbound. An example is where you want to return values for all years, rather than a specific year.

It would be good to allow users to select a '?' or similar option in the dropdown that makes grlc ignore the specified parameter/variable when rewriting the query.

Perhaps also return the bindings in the SPARQL results.

Check content negotiation in endpoints

Some endpoints have trouble with content negotiation, check e.g. https://data.pdok.nl/sparql
(thanks @reinvantveer)

Bad syntax (expected item in list or ')')

Hi,

I have created a GitHub repository that breaks GRLC because its commit history contains files with invalid characters. Try: http://grlc.io/api/barrynl/uncertainty-sparql

The error in the log can be found at the end of this issue.

I have reduced the repository to a minimal example to debug the error and it is caused by this commit:

barrynl/uncertainty-sparql@5419d3e

The commit contains a filename with parentheses (give-me-all-uncertainty-values-(and-causes)-per-sentences.nq) and these parentheses are directly included in the temp.prov.ttl file, which breaks the turtle parser because URIs cannot contain parentheses.

So, does anyone know a workaround (maybe removing this particular commit)? For now, I've created a new repository with the same files but without the commit history, and this does work correctly.

I think the code of GRLC can be improved by checking the commit files for filenames that contain characters that cannot be used in URIs.

Regards,

Barry

2017-06-16 17:47:49,900 [ERROR] (app.log_exception) Exception on /api/barrynl/uncertainty-sparql/spec [GET]
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/site-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/grlc/grlc/src/server.py", line 188, in swagger_spec
    prov_g = grlcPROV(user, repo)
  File "/home/grlc/grlc/src/prov.py", line 31, in __init__
    self.init_prov_graph()
  File "/home/grlc/grlc/src/prov.py", line 45, in init_prov_graph
    self.prov_g.parse('temp.prov.ttl', format='turtle')
  File "/usr/local/lib/python2.7/site-packages/rdflib/graph.py", line 1037, in parse
    parser.parse(source, self, **args)
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 1870, in parse
    p.loadStream(source.getByteStream())
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 434, in loadStream
    return self.loadBuf(stream.read())    # Not ideal
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 440, in loadBuf
    self.feed(buf)
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 466, in feed
    i = self.directiveOrStatement(s, j)
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 487, in directiveOrStatement
    j = self.statement(argstr, i)
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 725, in statement
    j = self.property_list(argstr, i, r[0])
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 1081, in property_list
    j = self.verb(argstr, i, v)
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 814, in verb
    j = self.prop(argstr, i, r)
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 826, in prop
    return self.item(argstr, i, res)
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 829, in item
    return self.path(argstr, i, res)
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 837, in path
    j = self.nodeOrLiteral(argstr, i, res)
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 1431, in nodeOrLiteral
    j = self.node(argstr, i, res)
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 1027, in node
    "expected item in list or ')'")
  File "/usr/local/lib/python2.7/site-packages/rdflib/plugins/parsers/notation3.py", line 1615, in BadSyntax
    raise BadSyntax(self._thisDoc, self.lines, argstr, i, msg)
BadSyntax: at line 188 of <>:
Bad syntax (expected item in list or ')') at ^ in:
"...b4b6a5c1813b8 .
result:file-give-me-all-uncertainty-values-(^and-causes)-per-sentences-nq   a       prov:Entity ;
            rdfs:label      "g..."
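One way to act on the suggestion in this report is to normalise file names before they are written into the provenance graph. The sketch below is only an illustration: `safe_local_name` is a hypothetical helper, not part of grlc, and it assumes (from the traceback above) that the file name ends up as the local part of a Turtle prefixed name, where the existing code already appears to replace `.` with `-`.

```python
import re

# Anything outside this conservative set is replaced; the set is an assumption
# chosen because these characters are valid inside a Turtle local name.
_UNSAFE = re.compile(r"[^A-Za-z0-9_-]")


def safe_local_name(filename: str) -> str:
    """Replace characters that could break a Turtle prefixed name with '-'."""
    return _UNSAFE.sub("-", filename)


name = "give-me-all-uncertainty-values-(and-causes)-per-sentences.nq"
print("result:file-" + safe_local_name(name))
```

Note that this mapping is lossy: two different file names can map to the same local name, so a real fix might prefer percent-encoding or a full IRI in angle brackets.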

Looking for subfolders named "queries"

Not all projects keep their queries in a separate repository; some keep them in a subfolder of a larger repository.

Basic caching

Store last generated spec in an obj of the form

{repo_uri : { date: date_spec_generated, spec: spec_generated }}

When generating the spec, check whether date_spec_generated is later than the pushed_at value reported by the repo API. If it is, use the stored spec_generated from the obj. Otherwise, generate the spec normally and update the cache.
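The caching rule described above could be sketched as follows. This is a minimal in-memory illustration, not grlc's implementation: `get_spec`, `build_spec` and the module-level `_cache` are hypothetical names, and `pushed_at` is assumed to have already been fetched from the repo API as a timezone-aware `datetime`.

```python
from datetime import datetime, timezone
from typing import Callable, Dict

# {repo_uri: {"date": date_spec_generated, "spec": spec_generated}}
_cache: Dict[str, dict] = {}


def get_spec(repo_uri: str, pushed_at: datetime,
             build_spec: Callable[[str], dict]) -> dict:
    """Return the cached spec unless the repo was pushed to after it was built."""
    entry = _cache.get(repo_uri)
    if entry is not None and entry["date"] > pushed_at:
        # The cached spec is newer than the last push, so it is still valid.
        return entry["spec"]
    # No entry, or the repo changed since the spec was generated: rebuild.
    spec = build_spec(repo_uri)
    _cache[repo_uri] = {"date": datetime.now(timezone.utc), "spec": spec}
    return spec
```

A "refresh cache" action would then only need to delete the entry for the given `repo_uri` so that the next request rebuilds the spec.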

grlc sends the following query (?city=http://dbpedia.org/resource/London):
SELECT * WHERE{ http://dbpedia.org/resource/London dbo:populationTotal ?populationTotal }

instead of
SELECT * WHERE{ <http://dbpedia.org/resource/London> dbo:populationTotal ?populationTotal }
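The expected behaviour can be illustrated with a small sketch that wraps an IRI value in angle brackets before substituting it into the query. The helpers `as_sparql_iri` and `bind_iri` are hypothetical and are not grlc functions; the variable name `_city_iri` is assumed here, following the `?_varname_iri` convention mentioned elsewhere on this page.

```python
import re


def as_sparql_iri(value: str) -> str:
    """Wrap a bare IRI in angle brackets unless it is already wrapped."""
    if value.startswith("<") and value.endswith(">"):
        return value
    return f"<{value}>"


def bind_iri(query: str, variable: str, value: str) -> str:
    """Replace every occurrence of ?variable with the bracketed IRI."""
    # \b stops ?city from also matching a longer name such as ?cityName.
    pattern = re.compile(r"\?" + re.escape(variable) + r"\b")
    iri = as_sparql_iri(value)
    return pattern.sub(lambda _match: iri, query)


template = "SELECT * WHERE { ?_city_iri dbo:populationTotal ?populationTotal }"
print(bind_iri(template, "_city_iri", "http://dbpedia.org/resource/London"))
```

The sketch does not validate the value, so a production version would also need to reject input containing characters such as `>` or whitespace before building the query.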

Local instance landing page

dev.grlc.io

Set up a dev version of grlc.io
