bl_lookup's Introduction

Biolink Model lookup service

Introduction

The Biolink Model defines a set of common concepts for use in Translator (See here for a Translator glossary). These include semantic types for entities, as well as the relations between them. These concepts are organized into an inheritance hierarchy capturing different granularities of description. Furthermore, each concept contains metadata relating the concept to ontologies.

The Biolink Lookup Service provides a computational interface to the model, including access to previous versions.

Use

Most users will not run their own service, but will make use of the publicly provided service. Several functions are provided, including the ability to look up concepts by name or URI, or to look up ancestors (superclasses) or descendants (subclasses) of concepts.

Examples of use can be found on the live apidocs page, or in the demonstration notebook.
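As a rough sketch of how a client might call the public service, the snippet below builds request URLs following the patterns that appear in the issue reports further down this page (`/bl/<concept>`, `/bl/<concept>/ancestors`, `/bl/<concept>/descendants`, each with a `version` query parameter). The helper names `lookup_url` and `fetch` are illustrative and not part of bl_lookup; the live apidocs page remains the authoritative reference for the API.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Public instance referenced elsewhere on this page.
BASE_URL = "https://bl-lookup-sri.renci.org"


def lookup_url(concept, relation=None, version="latest"):
    """Build a lookup URL for a concept (hypothetical helper).

    relation may be None (the concept itself), "ancestors", or "descendants".
    """
    # Percent-encode spaces and similar characters, but keep the CURIE colon.
    path = f"/bl/{quote(concept, safe=':')}"
    if relation is not None:
        if relation not in ("ancestors", "descendants"):
            raise ValueError(f"unsupported relation: {relation!r}")
        path += f"/{relation}"
    return f"{BASE_URL}{path}?{urlencode({'version': version})}"


def fetch(url, timeout=10):
    """GET the URL and decode the JSON body (needs network access)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `fetch(lookup_url("MolecularMixture", "ancestors", "2.1.0"))` would request the same URL as the curl command quoted in one of the issues below.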

Installation

Note: This environment expects Python version 3.8.

Create a virtual environment and activate it.

python -m venv venv
source venv/bin/activate

Install dependencies

pip install -r requirements.txt    

Run web server.

python main.py --host 0.0.0.0 --port 8144

Docker

You can also pull and run the Docker image from the Docker Hub repo renciorg/bl_lookup, or build it locally:

cd <code base>
docker build -t bl_lookup .

Launch

docker run -it \
    -p <port>:8144 \
    bl_lookup

Run Service

http://<host name or IP>:<port>/apidocs

Helm Charts

Example Helm charts for deploying the container in Kubernetes can be found here.

bl_lookup's People

Contributors

cbizon, evandietzmorris, kshefchek, phillipsowen, yaphetkg

bl_lookup's Issues

BL Toolkit and versioning

One goal of bl_lookup is that it will be able to serve older versions of the model. So if you are not ready for the latest release, you can ping bl_lookup with a previous release tag.

The problem is that we use the Biolink toolkit to access the data, and the toolkit has versioning dependencies. For example, the most recent version of the toolkit cannot handle version 1.2.1 of the model.

So there are a few possibilities:

  1. Make biolink toolkit backward compatible to older versions of the model
  2. Keep multiple versions of the toolkit around to handle different model versions. Might be somewhat of a pain in terms of managing the python repos? Also will need some docs on which versions of the toolkit handle which versions of the model.
  3. Abandon the idea of serving old versions, or modify it to only those versions compatible with the latest toolkit
  4. Don't use the toolkit, but I think that then we just end up reproducing this problem in bl-lookup code.

@deepakunni3 @PhillipsOwen thoughts?

API Call Description text

via slack:

Vincent Emonet (Clinical Data Services Provider) 3:20 PM
Thanks for the services!
Would it be possible to have a short text in each API call description about how the called service is implemented? (e.g. "we search in the BioLink YAML file")
And when there are exceptions in the implementation (e.g. for the RO lookup)
So that this can easily be reverse engineered/understood later?

Lookup predicate should include qualifiers

We need to be able to take a qualified edge as input. Right now you can send a biolink predicate and for instance get back an inversion, but you can't send qualifiers.

not getting descendants of increases_amount_or_activity_of

There is a predicate mixin in model versions later than 2.2.2 (including 2.2.3, which is in the service). In the BMT, this works:

>>> tk.get_descendants('biolink:increases_amount_or_activity_of')
['increases amount or activity of', 'increases abundance of', 'increases expression of', 'increases stability of', 'increases uptake of', 'decreases degradation of', 'increases secretion of', 'increases metabolic processing of', 'increases folding of', 'increases localization of', 'increases synthesis of', 'increases response to', 'increases splicing of', 'increases mutation rate of', 'increases transport of', 'increases activity of', 'increases molecular modification of', 'increases molecular interaction']

But it fails in bl_lookup/get_descendants

Consider using official biolink-model python

it looks like you have custom python logic for working with blmod:

https://github.com/TranslatorIIPrototypes/bl_lookup/blob/master/bl_lookup/bl.py

consider using
https://pypi.org/project/biolink-model/

Which is just the autogen python code from the yaml also here: https://github.com/biolink/biolink-model/blob/master/biolink/model.py

(it might be better to use this if you want to switch versions dynamically)

@deepakunni3 can advise, maybe @shefcheck can help

@hsolbrig may also be interested (the URL for the service is here: https://bl-lookup-sri.renci.org/apidocs/). It strikes me we could have a completely generic API in the biolinkml framework that is driven entirely by the yaml... but that is probably overkill for our purposes here

LocalName, SlotDefinition

We return a json representation of an element.

So we do something like as_dict(bmt.get_element(element_name)) and then turn that into JSON.

But there are some values of that dict that contain un-jsonable elements (LocalName, SlotDefinition). At the moment, I've just filtered out those element keys, but it would be pretty straightforward to clean that up and convert those entities into dictionaries as well.

Brought up to BMT here: biolink/biolink-model-toolkit#21

Depending on how that is resolved or not, we might take different approaches.
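One possible shape for the cleanup described above is a recursive converter that turns nested objects into plain dictionaries instead of filtering their keys out. The sketch below assumes the un-jsonable values behave like Python dataclasses; `LocalNameStub` and `ElementStub` are hypothetical stand-ins written for this example, not the toolkit's real classes, and `to_jsonable` is an illustrative name.

```python
import json
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class LocalNameStub:
    """Hypothetical stand-in for a LocalName-like object."""
    local_name_source: str
    local_name_value: str


@dataclass
class ElementStub:
    """Hypothetical stand-in for a model element."""
    name: str
    local_names: dict = field(default_factory=dict)


def to_jsonable(obj):
    """Recursively convert obj into JSON-friendly built-in types."""
    if is_dataclass(obj) and not isinstance(obj, type):
        return {f.name: to_jsonable(getattr(obj, f.name)) for f in fields(obj)}
    if isinstance(obj, dict):
        return {str(k): to_jsonable(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [to_jsonable(v) for v in obj]
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj
    # Last resort for unknown types: fall back to the string form.
    return str(obj)


element = ElementStub("gene", {"ga4gh": LocalNameStub("ga4gh", "Feature")})
payload = json.dumps(to_jsonable(element))
```

The string fallback keeps serialization from failing on unexpected types, at the cost of losing structure for those values.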

Add support for Biolink model versions 1.8.1 and 1.8.2?

I noticed that this service doesn't seem to like Biolink model versions 1.8.1 and 1.8.2 at the moment.

e.g., this request:

curl -X GET "https://bl-lookup-sri.renci.org/bl/affects/descendants?version=1.8.1" -H "accept: application/json"

returns No version '1.8.1' available

would it be possible to add support for these?

using a concept of "related to" results in app exception thrown

this is occurring in bl-lookup app versions 1.1.6, 1.1.7 and 1.1.8.

example:
https://bl-lookup-sri.renci.org/bl/related%20to?version=latest or https://bl-lookup-sri.renci.org/bl/related%20to?version=2.2.5

either example results in a 503 error on the client. each request generates differing stack dumps:

[2022-01-27 16:08:14 +0000] [1] [ERROR] Exception occurred while handling uri: 'http://bl-lookup-sri.renci.org/bl/biolink:related_to?version=latest'
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/sanic/app.py", line 939, in handle_request
    response = await response
  File "/home/nru/bl_lookup/server.py", line 55, in properties
    return response.json(props)
  File "/usr/local/lib/python3.9/site-packages/sanic/response.py", line 210, in json
    dumps(body, **kwargs),
TypeError: Annotation(tag='biolink:canonical_predicate', value='True', extensions={}, annotations={}) is not JSON serializable

[2022-01-27 16:08:51 +0000] [1] [ERROR] Exception occurred while handling uri: 'http://bl-lookup-sri.renci.org/bl/related_to?version=2.2.5'
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/sanic/app.py", line 939, in handle_request
    response = await response
  File "/home/nru/bl_lookup/server.py", line 55, in properties
    return response.json(props)
  File "/usr/local/lib/python3.9/site-packages/sanic/response.py", line 210, in json
    dumps(body, **kwargs),
TypeError: Annotation(tag='biolink:canonical_predicate', value='True', extensions={}, annotations={}) is not JSON serializable
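The TypeError above is the standard failure when a JSON encoder meets an object it does not know how to serialize. One general way to handle this in Python is the `default` hook of the standard library's `json.dumps`, sketched below. `AnnotationStub` is a hypothetical dataclass that only mirrors the fields printed in the traceback, and `encode_unknown` is an illustrative name; whether the service's response layer can be given such a hook is not shown here.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AnnotationStub:
    """Hypothetical stand-in mirroring the fields shown in the traceback."""
    tag: str
    value: str
    extensions: dict = field(default_factory=dict)
    annotations: dict = field(default_factory=dict)


def encode_unknown(obj):
    """Fallback used by json.dumps only for objects it cannot serialize."""
    try:
        # asdict() raises TypeError for anything that is not a dataclass instance.
        return asdict(obj)
    except TypeError:
        return str(obj)


props = {
    "name": "related to",
    "annotations": [AnnotationStub("biolink:canonical_predicate", "True")],
}
body = json.dumps(props, default=encode_unknown)
```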

Not returning ancestors for GenomicEntity

From Luke Veenhuis:

Hello Chris. I had a question about the biolink model lookup service and I saw your name listed on the smart api page. I am trying the following query and am receiving an unexpected issue.
https://bl-lookup-sri.renci.org/bl/biolink:GenomicEntity/ancestors?version=latest
the error i am getting is "No concept 'genomicentity'"
This is especially odd as when i do the same query on biolink:Gene, biolink:GenomicEntity appears as expected.
do you have any information on this?

My suspicion is that it's related to GenomicEntity being a mixin.

SRI Technical Debt: Archive?

Per SRI call 7/9
BL Lookup Service - get rid of? Who is using it now? Anyone dependent? BL Toolkit should be the replacement.

Put on ITRB

I think we do not have instances of bl-lookup running on ITRB environments (at least I don't see them in the registry?)

I hadn't previously worried about it, but the current example of wanting to have a biolink 3 version in dev and a biolink 2 version in prod makes me think we should go ahead and put it everywhere.

Are mixins ancestors?

So we've started looking into upgrading to Biolink 2.1 and came across this: when we ask for ancestors of biolink:SmallMolecule we also get mixins:
https://bl-lookup-sri.renci.org/bl/SmallMolecule/ancestors?version=2.1.0

[
  "biolink:ChemicalEntity",
  "biolink:NamedThing",
  "biolink:Entity",
  "biolink:PhysicalEssence",
  "biolink:PhysicalEssenceOrOccurrent",
  "biolink:ChemicalOrDrugOrTreatment",
  "biolink:ChemicalEntityOrGeneOrGeneProduct",
  "biolink:ChemicalEntityOrProteinOrPolypeptide"
]

Is this intended and/or desirable?
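To make the question concrete, here is a toy sketch of an ancestor traversal that can either follow or ignore mixin links. The `IS_A` and `MIXINS` tables are a made-up miniature hierarchy that only borrows class names from the list above; they are not copied from the Biolink Model YAML, and this is not how bl_lookup or the toolkit implement the lookup.

```python
# Toy hierarchy for illustration only; NOT the real Biolink Model structure.
IS_A = {
    "SmallMolecule": "ChemicalEntity",
    "ChemicalEntity": "NamedThing",
    "NamedThing": "Entity",
    "PhysicalEssence": "PhysicalEssenceOrOccurrent",
}
MIXINS = {
    "ChemicalEntity": ["PhysicalEssence"],
}


def ancestors(name, include_mixins=True):
    """Return the proper ancestors of name in breadth-first order."""
    result = []
    seen = {name}
    queue = [name]
    while queue:
        current = queue.pop(0)
        parents = []
        if current in IS_A:
            parents.append(IS_A[current])
        if include_mixins:
            parents.extend(MIXINS.get(current, []))
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                result.append(parent)
                queue.append(parent)
    return result
```

With `include_mixins=False` the traversal returns only the is_a chain, which is the behavior a caller might expect if mixins are not meant to count as ancestors.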

thanks!

Expand Casing?

Currently, callers need to use snake case to look up concepts. So if you want to look up "chemical substance" you have to look up "chemical_substance". and if you want to look up "negatively regulates, entity to entity" you have to look up "negatively_regulates__entity_to_entity" (note double underscore).

This has a few problems:

  1. it's not obvious
  2. The naming of things in the model yaml is different from how those things should be serialized in e.g. neo4j. Recently, the neo4j guidance changed to using CamelCase for semantic types in neo4j.

So it's not clear what kinds of inputs we should accept, but we should probably be more accepting and make an effort to convert things where we can.
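A minimal sketch of what "more accepting" could look like: normalize several input spellings to the snake case form described above before doing the lookup. The function name `normalize_concept_name` and the exact rules are assumptions made for this example, not the service's actual behavior.

```python
import re


def normalize_concept_name(name):
    """Convert a concept name in various spellings to snake case.

    Accepts space-separated names, snake case, CamelCase, and names
    carrying a "biolink:" prefix (hypothetical normalization rules).
    """
    name = name.strip()
    if name.startswith("biolink:"):
        name = name[len("biolink:"):]
    # Insert a space at lower-to-upper boundaries: "ChemicalSubstance" -> "Chemical Substance".
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Replace each space or comma separately, so ", " becomes a double underscore.
    name = re.sub(r"[ ,]", "_", name)
    return name.lower()
```

This simple boundary rule does not split runs of capitals, so a name such as `RNAProduct` would come out as `rnaproduct`; handling acronyms would need an explicit lookup against the model's element names.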

Add input as an ancestor of itself

I noticed that the Biolink Lookup Service seems to return only proper ancestors (i.e., excluding the input item), but all descendants (i.e., including the input item).

for example, this ancestor query for MolecularMixture:

curl -X GET "https://bl-lookup-sri.renci.org/bl/MolecularMixture/ancestors?version=2.1.0" -H  "accept: application/json"

does not return MolecularMixture:

[
  "biolink:ChemicalMixture",
  "biolink:ChemicalEntity",
  "biolink:NamedThing",
  "biolink:Entity",
  "biolink:PhysicalEssence",
  "biolink:PhysicalEssenceOrOccurrent",
  "biolink:ChemicalOrDrugOrTreatment",
  "biolink:ChemicalEntityOrGeneOrGeneProduct",
  "biolink:ChemicalEntityOrProteinOrPolypeptide"
]

but this descendant query for MolecularMixture:

curl -X GET "https://bl-lookup-sri.renci.org/bl/MolecularMixture/descendants?version=2.1.0" -H  "accept: application/json"

does return MolecularMixture:

[
  "biolink:MolecularMixture",
  "biolink:Drug"
]

is this deliberate? the inconsistency seems a bit odd to me.
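Until the service behaves consistently, a caller can normalize both results on the client side. The helpers below are a small sketch with illustrative names (`with_self`, `without_self`); they operate on the lists returned by the two queries above and make no assumption about how the service computes them.

```python
def with_self(curie, terms):
    """Return terms with curie placed first, without duplicating it."""
    return [curie] + [t for t in terms if t != curie]


def without_self(curie, terms):
    """Return terms with curie removed, giving only proper ancestors or descendants."""
    return [t for t in terms if t != curie]


descendants = ["biolink:MolecularMixture", "biolink:Drug"]
ancestors = ["biolink:ChemicalMixture", "biolink:ChemicalEntity"]

inclusive_ancestors = with_self("biolink:MolecularMixture", ancestors)
proper_descendants = without_self("biolink:MolecularMixture", descendants)
```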

Mixins included in /ancestors, but not /descendants?

here's an example - this query for ancestors of NucleicAcidEntity:

curl -X GET "https://bl-lookup-sri.renci.org/bl/NucleicAcidEntity/ancestors?version=2.1.0" -H  "accept: application/json"

returns ancestors including some mixins, like PhysicalEssence:

[
  "biolink:MolecularEntity",
  "biolink:ChemicalEntity",
  "biolink:NamedThing",
  "biolink:Entity",
  "biolink:GenomicEntity",
  "biolink:ThingWithTaxon",
  "biolink:PhysicalEssence",
  "biolink:PhysicalEssenceOrOccurrent",
  "biolink:OntologyClass",
  "biolink:ChemicalOrDrugOrTreatment",
  "biolink:ChemicalEntityOrGeneOrGeneProduct",
  "biolink:ChemicalEntityOrProteinOrPolypeptide"
]

but a query for descendants of NucleicAcidEntity:

curl -X GET "https://bl-lookup-sri.renci.org/bl/NucleicAcidEntity/descendants?version=2.1.0" -H  "accept: application/json"

does not return any mixins:

[
  "biolink:NucleicAcidEntity",
  "biolink:Transcript",
  "biolink:RNAProduct",
  "biolink:NoncodingRNAProduct",
  "biolink:MicroRNA",
  "biolink:SiRNA",
  "biolink:RNAProductIsoform",
  "biolink:Exon",
  "biolink:CodingSequence"
]

is that how it's supposed to work?

I admit I'm still kind of new to the concept of mixins, so it may just be that my understanding is lacking here. but if they're supposed to be included, I'd think the descendants of NucleicAcidEntity would look something like this?

[
  "biolink:UnclassifiedOntologyClass",
  "biolink:GeneProductIsoformMixin",
  "biolink:CodingSequence",
  "biolink:OntologyClass",
  "biolink:GenomicEntity",
  "biolink:NucleicAcidEntity",
  "biolink:RNAProduct",
  "biolink:Transcript",
  "biolink:Exon",
  "biolink:SiRNA",
  "biolink:RNAProductIsoform",
  "biolink:TaxonomicRank",
  "biolink:PhysicalEssence",
  "biolink:GeneOntologyClass",
  "biolink:GeneProductMixin",
  "biolink:RelationshipType",
  "biolink:MicroRNA",
  "biolink:NoncodingRNAProduct"
]
