rtxteam / rtx
Software repo for Team Expander Agent (Oregon State U., Institute for Systems Biology, and Penn State U.)
Home Page: https://arax.ncats.io/
License: MIT License
From @saramsey on November 27, 2017 18:45
Do not do this until after the demo.
Copied from original issue: dkoslicki/NCATS#117
Hi @saramsey, it looks like QueryNCBIeUtils.get_mesh_terms_for_mesh_uid
is always returning an empty answer:
>>> QueryNCBIeUtils.get_medgen_uid_for_omim_id('614332')
{482154, 814824}
>>> QueryNCBIeUtils.get_mesh_terms_for_mesh_uid({482154, 814824})
[]
and
>>> QueryNCBIeUtils.get_medgen_uid_for_omim_id('600320')
{325371}
>>> QueryNCBIeUtils.get_mesh_terms_for_mesh_uid({325371})
[]
and a bunch of other similar examples.
Yesterday Q2 was working, now Q1 is not working. Maybe attempted fix for Q1 broke Q2?
python3 Q2Solution.py -r 'DOID:1686' -d 'physostigmine'
Traceback (most recent call last):
File "Q2Solution.py", line 187, in <module>
main()
File "Q2Solution.py", line 184, in main
res = answerQ2(drug, disease, k)
File "Q2Solution.py", line 70, in answerQ2
RU.weight_graph_with_google_distance(g)
File "/mnt/data/orangeboard/test/RTX/code/reasoningtool/QuestionAnswering/ReasoningUtilities.py", line 693, in weight_graph_with_google_distance
gd_temp = QueryNCBIeUtils.normalized_google_distance(source_mesh_term, target_mesh_term, mesh1=mesh1, mesh2=mesh2)
File "/mnt/data/orangeboard/test/RTX/code/reasoningtool/QueryNCBIeUtils.py", line 232, in normalized_google_distance
nij = QueryNCBIeUtils.get_pubmed_hits_count('({mesh1}) AND ({mesh2})'.format(mesh1=mesh1_str_decorated,
UnboundLocalError: local variable 'mesh1_str_decorated' referenced before assignment
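The UnboundLocalError above suggests that mesh1_str_decorated is assigned on only one branch of a conditional. A minimal sketch of one way to avoid that follows. It assumes the mesh1/mesh2 flags mean "treat this term as a MeSH term" and that decorating a term means appending the [MeSH Terms] field tag; decorate_term and build_joint_query are hypothetical names, not functions from QueryNCBIeUtils.py.

```python
def decorate_term(term, is_mesh):
    """Return the term ready for a PubMed query (hypothetical helper).

    Assumption: a MeSH term gets the [MeSH Terms] field tag, and any
    other term is passed through unchanged.
    """
    if is_mesh:
        return term + "[MeSH Terms]"
    return term


def build_joint_query(term1, term2, mesh1=True, mesh2=True):
    # Both decorated names are assigned on every code path, so neither
    # can be referenced before assignment.
    term1_decorated = decorate_term(term1, mesh1)
    term2_decorated = decorate_term(term2, mesh2)
    return "({t1}) AND ({t2})".format(t1=term1_decorated, t2=term2_decorated)
```

Calling build_joint_query("malaria", "osteoarthritis", mesh1=False, mesh2=False) then yields a plain-text query instead of raising.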
From @saramsey on November 26, 2017 19:16
This is a placeholder issue for the development of Random Walk with Restart for multi-path relatedness between a "Q1 disease" and a genetic condition (OMIM).
Copied from original issue: dkoslicki/NCATS#98
File "BuildMasterKG.py", line 371, in <module>
running_time = timeit.timeit(lambda: run_function(), number=1)
File "/usr/lib/python3.5/timeit.py", line 213, in timeit
return Timer(stmt, setup, timer, globals).timeit(number)
File "/usr/lib/python3.5/timeit.py", line 178, in timeit
timing = self.inner(it, self.timer)
File "<timeit-src>", line 6, in inner
File "BuildMasterKG.py", line 371, in <lambda>
running_time = timeit.timeit(lambda: run_function(), number=1)
File "BuildMasterKG.py", line 337, in make_master_kg
seed_and_expand_kg_q2(num_expansions=3)
File "BuildMasterKG.py", line 250, in seed_and_expand_kg_q2
bne.expand_all_nodes()
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 343, in expand_all_nodes
self.expand_node(node)
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 331, in expand_node
expand_method(self)
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 82, in expand_ncbigene_microrna
self.orangeboard.add_rel('participates_in', 'gene_ontology', node1, node2)
NameError: name 'node1' is not defined
From @saramsey on January 10, 2018 1:10
Copied from original issue: dkoslicki/NCATS#133
@saramsey Here is a prioritized list of node types and relationship types for which it's desirable to explore what sorts of properties we can pull from various KS's:
Nodes:
(source node type, relationship type, target node type):
If for each of these you let me know what sort of data is being returned by the KS, I can take a look and we can pick/choose which properties to include.
@dkoslicki In order for the restatedQuestion to be properly filled in for query results, the translate() function needs to return that information. It did under the old system, but no longer does.
The relevant code is line 669 or search for FIXME
I can't figure out how to get the restated question into that slot. Would you fix that?
do git pull in NewStdAPI first.
Eg. P08563
For example, disont_disease names are DOIDs while their descriptions are human readable. In contrast, pharos_drugs have human-readable names and CHEMBL IDs as descriptions.
I vote we standardize on identifiers for names and human-readable text for descriptions. Or, perhaps even better, adopt a more uniform node property naming convention such as:
ID: identifier (DOID, CHEMBL, etc.)
name: human readable name
description: pulled from MESH
etc.
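The proposed convention could be sketched as a small record type plus one converter per legacy node type. This is only an illustration: NodeProperties and the two converter functions are hypothetical names, and the converters assume the legacy layout described above (DOID in the name of a disont_disease, CHEMBL ID in the description of a pharos_drug).

```python
from collections import namedtuple

# ID: identifier (DOID, CHEMBL, etc.); name: human-readable name;
# description: free text, e.g. pulled from MeSH.
NodeProperties = namedtuple("NodeProperties", ["id", "name", "description"])


def from_disont_disease(legacy_name, legacy_description):
    # Legacy layout (assumed): name holds the DOID, description holds
    # the human-readable label.
    return NodeProperties(id=legacy_name, name=legacy_description, description="")


def from_pharos_drug(legacy_name, legacy_description):
    # Legacy layout (assumed): name holds the human-readable label,
    # description holds the CHEMBL identifier.
    return NodeProperties(id=legacy_description, name=legacy_name, description="")
```

With this in place every node exposes the identifier under the same key regardless of which knowledge source it came from.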
From @saramsey on January 6, 2018 0:7
Q0 and Q3 are now working, but I can’t get Q1 and Q2 to work. I futzed around with getting Q1 to work. It looks like all the neo4j connection commands were hard-wired to lysine and they weren’t working, so I changed them all to localhost. Should that work? Now I get another connection error.
cd /mnt/data/orangeboard/code/NCATS/code/reasoningtool
grep lysine *.py
(now mostly commented out and replaced with localhost)
python3 Q1Solution.py
…… lots of errors ending with:
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
ideas?
I suspect it can’t connect to neo4j, but not sure…
Copied from original issue: dkoslicki/NCATS#128
Just a note that when you query two MeSH terms, the response gives both the joint count and the marginal counts.
For example, with https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=%28%22malaria%22%5BMeSH%20Terms%5D%29%20AND%20%22osteoarthritis%22%5BMeSH%20Terms%5D&retmode=json&retmax=1000
you get back three counts: the joint count ("count" under "esearchresult") and the two marginal counts (the "count" fields in "translationstack"):
{
"header": {
"type": "esearch",
"version": "0.3"
},
"esearchresult": {
"count": "4",
"retmax": "4",
"retstart": "0",
"idlist": [
"16425715",
"11775318",
"3300734",
"5982767"
],
"translationset": [
],
"translationstack": [
{
"term": "\"malaria\"[MeSH Terms]",
"field": "MeSH Terms",
"count": "60405",
"explode": "Y"
},
{
"term": "\"osteoarthritis\"[MeSH Terms]",
"field": "MeSH Terms",
"count": "54506",
"explode": "Y"
},
"AND"
],
"querytranslation": "\"malaria\"[MeSH Terms] AND \"osteoarthritis\"[MeSH Terms]"
}
}
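A minimal sketch of pulling the three counts out of a response shaped like the one above, so a single esearch call can feed all three terms of a distance calculation. The function name joint_and_marginal_counts is hypothetical, and the sketch assumes the simple two-term layout shown here; more complex queries may produce additional entries in translationstack.

```python
import json

# Trimmed copy of the response shown above.
SAMPLE_RESPONSE = r'''
{
  "esearchresult": {
    "count": "4",
    "translationstack": [
      {"term": "\"malaria\"[MeSH Terms]", "field": "MeSH Terms", "count": "60405", "explode": "Y"},
      {"term": "\"osteoarthritis\"[MeSH Terms]", "field": "MeSH Terms", "count": "54506", "explode": "Y"},
      "AND"
    ]
  }
}
'''


def joint_and_marginal_counts(esearch_json_text):
    """Return (joint_count, {term: marginal_count}) from esearch JSON text."""
    result = json.loads(esearch_json_text)["esearchresult"]
    joint = int(result["count"])
    # translationstack mixes per-term dicts with operator strings such
    # as "AND"; only the dicts carry counts.
    marginals = {
        entry["term"]: int(entry["count"])
        for entry in result["translationstack"]
        if isinstance(entry, dict)
    }
    return joint, marginals


joint, marginals = joint_and_marginal_counts(SAMPLE_RESPONSE)
```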
@edeutsch @saramsey I've noticed that we occasionally get the following printed to the screen when using QueryNCBIeUtils:
HTTP response status code: 502 for URL:
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=%28Van%20Buchem%20disease%20type%202%29%20AND%20%28Osteoporosis%5BMeSH%20Terms%5D%29&retmode=json&retmax=1000
This does not necessarily mean that QXSolution.py will fail (since I will then just put in the max Google distance), but I imagine that having something besides the JSON-formatted response printed may mess up the UI/API. Will this be an issue? I don't know whether requests is using stdout or stderr to print these, as I'm not familiar with the requests package.
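To my knowledge the requests package does not print status lines like this on its own, so the message most likely comes from a print call in the project's code. If so, one option is to route such diagnostics to stderr so that stdout carries only the JSON answer. A minimal sketch, where report_http_problem is a hypothetical helper rather than an existing function:

```python
import sys


def report_http_problem(status_code, url, stream=None):
    """Write an HTTP diagnostic to stderr (or to the given stream)."""
    stream = sys.stderr if stream is None else stream
    print(
        "HTTP response status code: {} for URL:\n{}".format(status_code, url),
        file=stream,
    )
```

A caller that captures only stdout would then see clean JSON even when an upstream service returns 502.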
From @erikyao on January 12, 2018 18:59
When running test_geneprof_id_to_transcription_factor_gene_symbols of tests/QueryGeneProfTestCase, QueryGeneProf.geneprof_id_to_transcription_factor_gene_symbols sometimes returns an empty set due to GeneProf's 500 response.
Test code:
ret_set = QueryGeneProf.geneprof_id_to_transcription_factor_gene_symbols(16269) # 'HMOX1'
Debugging message:
HTTP response status code: 500 for URL:
http://www.geneprof.org/GeneProf/api/gene.info/regulation/binary/by.target/human/16269.json?with-sample-info=true
Cause of 500 reported by Geneprof:
java.lang.OutOfMemoryError: unable to create new native thread
We may need to add exception-handling code in this case.
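One possible shape for that exception handling is a small retry wrapper that retries on server errors and gives up immediately on client errors. This is a sketch under assumptions: get_with_retries is a hypothetical helper, and fetch stands in for whatever function performs the HTTP request and returns a (status_code, payload) pair.

```python
import time


def get_with_retries(fetch, url, max_attempts=3, backoff_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) until it returns status 200 or attempts run out.

    fetch is assumed to return (status_code, payload). Connection
    problems raised as OSError are treated like a server error.
    Returns the payload on success, otherwise None.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            status_code, payload = fetch(url)
        except OSError:
            status_code, payload = None, None
        if status_code == 200:
            return payload
        # Retry only transient failures (no response, or 5xx).
        retryable = status_code is None or status_code >= 500
        if not retryable:
            return None
        if attempt < max_attempts:
            sleep(backoff_seconds * attempt)
    return None
```

Returning None lets the caller distinguish "service unavailable" from a genuinely empty result, which an empty set cannot do.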
Copied from original issue: dkoslicki/NCATS#134
see issue #22 for the back-story
Is it possibly a requests-cache memory leak?
@edeutsch
Hi Eric,
I noticed in RTXQuery.py, line 88 and onwards, there is the following code:
if id == 'Q3':
    targets = qph.query_drug_name_to_targets(terms[0])
    if targets:
        list = '<UL>\n'
        for target in targets:
            list += "<LI> "+target["name"]+"\n"
        list += "</UL>\n"
        codeString = "OK"
        result = [ { "id": 537, "code": 1, "codeString": codeString, "message": "AnswerFound", "result": targets, "text": [ terms[0]+" is known to target: "+list ] } ]
        self.logQuery(id,codeString,terms)
    else:
        codeString = "DrugNotFound"
        result = [ { "id": 537, "code": 11, "codeString": codeString, "message": "DrugNotFound", "text": [ "Unable to find drug '"+terms[0]+"'." ] } ]
        self.logQuery(id,codeString,terms)
    return(result);
Q3 as it stands now can allow for all different node types, so checking query_drug_name_to_targets isn't the best idea here. For example:
QT.find_question_parameters("what phenotype is phenotype associated with malaria")
Out[649]:
{'corpus_index': 3,
'error_code': None,
'error_message': None,
'input_text': 'what phenotype is phenotype associated with malaria',
'terms': {'relationship_type': 'phenotype_assoc_with',
'source_name': 'DOID:12365',
'target_label': 'phenont_phenotype'}}
The node DOID:12365 has been found by QuestionTranslator.py (it checks whether nodes exist in the knowledge graph), so Q3Solution.py would run on these input terms, but RTXQuery isn't letting it, since it assumes it is only looking for drugs. Perhaps an error message like "Unknown term" or, more specifically, "entity not in knowledge graph" would be better here, along with removing the check against query_drug_name_to_targets.
From @saramsey on January 9, 2018 22:32
Copied from original issue: dkoslicki/NCATS#132
I've been looking at the COP for the treatment of osteoarthritis by naproxen, and it appears that one of the reasons why we are not getting the correct answer is that some of the relevant nodes are not expanded. For example, the PTGS1 gene is a ("the relevant") target of naproxen, and yet:
match (n:uniprot_protein{description:"PTGS1"}) return n.expanded
False
And so I cannot connect PTGS1 to any relevant piece of anatomy:
match p=(s:pharos_drug{name:"naproxen"})-[:targets]-(:uniprot_protein{description:"PTGS1"})-[]-(:anatont_anatomy) return p limit 50
(no changes, no records)
Was there an error in the Orangeboard construction that caused this node not to be expanded? I had assumed that we had seeded with all drugs and expanded from there. Yet:
match p=(s:pharos_drug{name:"naproxen"})-[:targets]-(t:uniprot_protein) return t.name, t.description, t.expanded
╒════════╤═══════════════╤════════════╕
│"t.name"│"t.description"│"t.expanded"│
╞════════╪═══════════════╪════════════╡
│"P25101"│"EDNRA" │false │
├────────┼───────────────┼────────────┤
│"P0DJD9"│"PGA5" │false │
├────────┼───────────────┼────────────┤
│"P23219"│"PTGS1" │false │
├────────┼───────────────┼────────────┤
│"Q15722"│"LTB4R" │false │
├────────┼───────────────┼────────────┤
│"P11712"│"CYP2C9" │true │
├────────┼───────────────┼────────────┤
│"P09917"│"ALOX5" │true │
├────────┼───────────────┼────────────┤
│"O14842"│"FFAR1" │false │
├────────┼───────────────┼────────────┤
│"Q07869"│"PPARA" │false │
├────────┼───────────────┼────────────┤
│"O43174"│"CYP26A1" │false │
├────────┼───────────────┼────────────┤
│"P24530"│"EDNRB" │false │
├────────┼───────────────┼────────────┤
│"P37231"│"PPARG" │true │
├────────┼───────────────┼────────────┤
│"Q8TCC7"│"SLC22A8" │false │
├────────┼───────────────┼────────────┤
│"P49286"│"MTNR1B" │false │
├────────┼───────────────┼────────────┤
│"P08183"│"ABCB1" │true │
├────────┼───────────────┼────────────┤
│"P01375"│"TNF" │true │
├────────┼───────────────┼────────────┤
│"P00374"│"DHFR" │true │
├────────┼───────────────┼────────────┤
│"P08684"│"CYP3A4" │true │
├────────┼───────────────┼────────────┤
│"P35354"│"PTGS2" │true │
├────────┼───────────────┼────────────┤
│"P33261"│"CYP2C19" │false │
├────────┼───────────────┼────────────┤
│"P42330"│"AKR1C3" │false │
├────────┼───────────────┼────────────┤
│"P16473"│"TSHR" │false │
├────────┼───────────────┼────────────┤
│"Q16665"│"HIF1A" │true │
├────────┼───────────────┼────────────┤
│"P48039"│"MTNR1A" │false │
└────────┴───────────────┴────────────┘
This was something we had discussed previously: recognizing when a node is absent from the KG and then populating it automatically. For example, it would be nice to have "DOID:9849" in the KG, but it's not currently there. This would require making the KG not read only (which would be a risk) but the tradeoff is probably worth it.
From @saramsey on November 18, 2017 4:25
can use BioLink for this, most likely
Copied from original issue: dkoslicki/NCATS#40
File "BuildMasterKG.py", line 371, in <module>
running_time = timeit.timeit(lambda: run_function(), number=1)
File "/usr/lib/python3.5/timeit.py", line 213, in timeit
return Timer(stmt, setup, timer, globals).timeit(number)
File "/usr/lib/python3.5/timeit.py", line 178, in timeit
timing = self.inner(it, self.timer)
File "<timeit-src>", line 6, in inner
File "BuildMasterKG.py", line 371, in <lambda>
running_time = timeit.timeit(lambda: run_function(), number=1)
File "BuildMasterKG.py", line 337, in make_master_kg
seed_and_expand_kg_q2(num_expansions=3)
File "BuildMasterKG.py", line 250, in seed_and_expand_kg_q2
bne.expand_all_nodes()
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 343, in expand_all_nodes
self.expand_node(node)
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 331, in expand_node
expand_method(self)
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 82, in expand_ncbigene_microrna
self.add_rel('participates_in', 'gene_ontology', node1, node2)
AttributeError: 'BioNetExpander' object has no attribute 'add_rel'
rt@ip-172-31-43-220:/reasoningtool$/reasoningtool$
http://www.geneprof.org/GeneProf/api/gene.info/gp.id/human/C_NAME/UBA52.json
Traceback (most recent call last):
File "BuildMasterKG.py", line 371, in <module>
running_time = timeit.timeit(lambda: run_function(), number=1)
File "/usr/lib/python3.5/timeit.py", line 213, in timeit
return Timer(stmt, setup, timer, globals).timeit(number)
File "/usr/lib/python3.5/timeit.py", line 178, in timeit
timing = self.inner(it, self.timer)
File "<timeit-src>", line 6, in inner
File "BuildMasterKG.py", line 371, in <lambda>
running_time = timeit.timeit(lambda: run_function(), number=1)
File "BuildMasterKG.py", line 337, in make_master_kg
seed_and_expand_kg_q2(num_expansions=3)
File "BuildMasterKG.py", line 250, in seed_and_expand_kg_q2
bne.expand_all_nodes()
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 343, in expand_all_nodes
self.expand_node(node)
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 331, in expand_node
expand_method(self)
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 82, in expand_ncbigene_microrna
self.add_rel('participates_in', 'gene_ontology', node1, node2)
AttributeError: 'BioNetExpander' object has no attribute 'add_rel'
rt@ip-172-31-43-220:
HTTP response status code: 404 for URL:
https://reactome.org/ContentService/data/complexes/UniProt/O95751
HTTP response status code: 404 for URL:
http://www.geneprof.org/GeneProf/api/gene.info/gp.id/human/C_NAME/LDOC1.json
https://api.monarchinitiative.org/api/bioentity/phenotype/HP:0010167/anatomy
Status code 500 for url: https://api.monarchinitiative.org/api/bioentity/phenotype/HP:0010167/anatomy
HTTP response status code: 404 for URL:
http://www.geneprof.org/GeneProf/api/gene.info/gp.id/human/C_NAME/WIPF1.json
HTTP response status code: 404 for URL:
https://reactome.org/ContentService/data/complexes/UniProt/O95865
HTTP response status code: 404 for URL:
http://www.geneprof.org/GeneProf/api/gene.info/gp.id/human/C_NAME/DDAH2.json
HTTP response status code: 404 for URL:
https://reactome.org/ContentService/data/complexes/UniProt/A0A087WUI6
HTTP response status code: 404 for URL:
http://www.geneprof.org/GeneProf/api/gene.info/gp.id/human/C_NAME/PIBF1.json
HTTP response status code: 404 for URL:
https://reactome.org/ContentService/data/complexes/UniProt/Q86YD7
Number of rels: 604000; elapsed time: 14748.08 s
HTTP response status code: 404 for URL:
http://www.geneprof.org/GeneProf/api/gene.info/gp.id/human/C_NAME/UBA52.json
Traceback (most recent call last):
File "BuildMasterKG.py", line 371, in <module>
running_time = timeit.timeit(lambda: run_function(), number=1)
File "/usr/lib/python3.5/timeit.py", line 213, in timeit
return Timer(stmt, setup, timer, globals).timeit(number)
File "/usr/lib/python3.5/timeit.py", line 178, in timeit
timing = self.inner(it, self.timer)
File "<timeit-src>", line 6, in inner
File "BuildMasterKG.py", line 371, in <lambda>
running_time = timeit.timeit(lambda: run_function(), number=1)
File "BuildMasterKG.py", line 337, in make_master_kg
seed_and_expand_kg_q2(num_expansions=3)
File "BuildMasterKG.py", line 250, in seed_and_expand_kg_q2
bne.expand_all_nodes()
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 343, in expand_all_nodes
self.expand_node(node)
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 331, in expand_node
expand_method(self)
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 82, in expand_ncbigene_microrna
self.add_rel('participates_in', 'gene_ontology', node1, node2)
AttributeError: 'BioNetExpander' object has no attribute 'add_rel'
From @saramsey on November 24, 2017 15:40
Copied from original issue: dkoslicki/NCATS#65
Here is a proposed minimal integration:
https://docs.google.com/spreadsheets/d/1zXitcR1QjHyh6WocukgshSR7IoAVg7MJQG-HNh96Jec/edit#gid=3366698
Here is a proposed maximal integration:
https://docs.google.com/spreadsheets/d/1zXitcR1QjHyh6WocukgshSR7IoAVg7MJQG-HNh96Jec/edit#gid=421374962
This is a placeholder issue to remind us to merge NewStdAPI to master after #41 is complete.
From @saramsey on November 17, 2017 20:15
We're getting a huge expansion when we map disease->phenotype. Are some of these super-general phenotypes that are high on the phenotype ontology tree? Not sure if that is a problem.
Copied from original issue: dkoslicki/NCATS#36
From @edeutsch on November 27, 2017 17:31
At the moment, the web service returns text blobs that are human readable and (reasonably) nicely displayable in the UI, but they are not really nicely structured data in a way that 3rd party software would like. We should upgrade to return both.
Copied from original issue: dkoslicki/NCATS#116
do this in the same branch where the rest of the KG development work will be done
python3 Q1Solution.py -j -i 'DOID:12365'
Traceback (most recent call last):
File "Q1Solution.py", line 270, in <module>
main()
File "Q1Solution.py", line 259, in main
res = answerQ1(disease, directed=directed, max_path_len=max_path_len, verbose=verbose, use_json=use_json)
File "Q1Solution.py", line 185, in answerQ1
omims = Q1Utils.refine_omims_well_studied(omims, doid, omim_to_mesh, q1_doid_to_mesh, verbose=verbose)
File "/mnt/data/orangeboard/production/RTX/code/reasoningtool/QuestionAnswering/Q1Utils.py", line 489, in refine_omims_well_studied
res = QueryNCBIeUtils.QueryNCBIeUtils.normalized_google_distance(omim_mesh, q1_doid_to_mesh[doid])
TypeError: unhashable type: 'list'
Originally proposed by @dkoslicki:
Related question: is there any way to convert between clinvarid and meshid/term? I see get_clinvar_uids_for_disease_or_phenotype_string, but I don't see any method to convert clinvar to mesh.
Steve says:
It appears not, but it looks like we can connect from ClinVar to OrphaNet, and from OrphaNet we can get a MeSH ID.
@saramsey @edeutsch To test the robustness of our new integration scheme, I've implemented a new question type: "What diseases are similar to X?" where X is a disease (see commits 5d9ab3e and a5f6e36). I've created this in a new branch (called Q4), and after we've merged NewStdAPI into master, we can see how straightforward it will be to implement this new question type into the UI, and then merge Q4 into master as well. Then we'll be half way to the minimal viable product question count!
As for details of the implementation, after a bit of Cypher tinkering (8a70291 and a72ac90), I am able to count the number of nodes (of a certain type) that are shared between any two nodes in the graph. This allows me to compute the Jaccard index of the phenotype sets of two diseases (and gives an informative error if a disease has no phenotypes but its parent does). I then return all diseases which have a large enough Jaccard index (given by the --threshold parameter, with default Jaccard=0.20).
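The scoring step can be illustrated in plain Python, independent of the Cypher queries. This is a sketch, not the code from the Q4 branch: the function names and the toy disease/phenotype labels are made up, and treating "large enough" as "greater than or equal to the threshold" is an assumption.

```python
def jaccard_index(set_a, set_b):
    """Size of the intersection divided by size of the union."""
    set_a, set_b = set(set_a), set(set_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def similar_diseases(query, disease_to_phenotypes, threshold=0.20):
    """Return (disease, score) pairs scoring at least threshold, best first."""
    query_phenotypes = disease_to_phenotypes[query]
    scored = []
    for disease, phenotypes in disease_to_phenotypes.items():
        if disease == query:
            continue
        score = jaccard_index(query_phenotypes, phenotypes)
        if score >= threshold:
            scored.append((disease, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data with made-up labels.
toy = {
    "disease_A": {"p1", "p2", "p3"},
    "disease_B": {"p2", "p3", "p4"},
    "disease_C": {"p9"},
}
matches = similar_diseases("disease_A", toy)
```

Here disease_B shares two of four distinct phenotypes with disease_A, so it passes the default threshold, while disease_C shares none.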
@edeutsch This question again returns each disease as a separate node (as in #41). I can change this when requested.
From @saramsey on November 27, 2017 16:41
Copied from original issue: dkoslicki/NCATS#115
From @saramsey on December 9, 2017 1:13
Article describing the REST API for JASPAR is on bioRxiv:
https://www.biorxiv.org/content/early/2017/07/06/160184
Copied from original issue: dkoslicki/NCATS#126
@edeutsch I'm trying to get Q2 hooked up with RTXQuery using the new std api. However, even though it gives the same output as Q3 (which works with RTXQuery): printed JSON text, there seems to be an issue with parsing it:
python3 ../../UI/OpenAPI/python-flask-server/RTXQuery.py
python3 Q2Solution.py -r 'physostigmine' -d 'DOID:1686'
Traceback (most recent call last):
File "../../UI/OpenAPI/python-flask-server/RTXQuery.py", line 119, in <module>
if __name__ == "__main__": main()
File "../../UI/OpenAPI/python-flask-server/RTXQuery.py", line 114, in main
result = rtxq.query(query)
File "../../UI/OpenAPI/python-flask-server/RTXQuery.py", line 69, in query
response = json.loads(returnedText)
File "/usr/lib/python3.5/json/__init__.py", line 312, in loads
s.__class__.__name__))
TypeError: the JSON object must be str, not 'CompletedProcess'
Note that python3 Q2Solution.py -r 'physostigmine' -d 'DOID:1686' executes without issue. I assume this is a problem with the return of Q2, but I can't see how this differs from the working Q3.
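CompletedProcess is the object returned by subprocess.run, so the TypeError suggests that the whole return value, rather than its captured stdout, is being passed to json.loads. A minimal sketch of the usual pattern follows; it assumes the Q2 branch launches the solution script with subprocess.run, and run_and_parse_json is a hypothetical name. The demo command is a stand-in for the real script.

```python
import json
import subprocess
import sys


def run_and_parse_json(command):
    """Run a command and parse the JSON it prints on stdout."""
    completed = subprocess.run(command, stdout=subprocess.PIPE, check=True)
    # completed is a CompletedProcess; the text lives in completed.stdout
    # as bytes and must be decoded before json.loads on Python 3.5.
    returned_text = completed.stdout.decode("utf-8")
    return json.loads(returned_text)


# Stand-in for the real solution script: a child process printing JSON.
demo_command = [sys.executable, "-c", "import json; print(json.dumps({'code': 1}))"]
response = run_and_parse_json(demo_command)
```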
Apparently QueryNCBIeUtils doesn't have sufficient error handling:
>>>python3 Q2Solution.py -r mupirocin -d 'DOID:1563'
HTTP timeout in QueryNCBIeUtils.py; URL: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=%28CASP1_HUMAN%7CCaspase-1%20%5BMeSH%20Terms%5D%7CCASP-1%5BMeSH%20Terms%5D%7CEC%203.4.22.36%5BMeSH%20Terms%5D%7CInterleukin-1%20beta%20convertase%5BMeSH%20Terms%5D%7CIL-1BC%5BMeSH%20Terms%5D%7CInterleukin-1%20beta-converting%20enzyme%5BMeSH%20Terms%5D%7CICE%5BMeSH%20Terms%5D%7CIL-1%20beta-converting%20enzyme%5BMeSH%20Terms%5D%7Cp45%7C%20%5BCleaved%20into%3A%20Caspase-1%20subunit%20p20%3B%20Caspase-1%20subunit%20p10%5D%7CCASP1%5BMeSH%20Terms%5D%7CIL1BC%5BMeSH%20Terms%5D%7CIL1BCE%5BMeSH%20Terms%5D%29%20AND%20%28Deubiquitination%29&retmode=json&retmax=1000
Traceback (most recent call last):
File "Q2Solution.py", line 223, in <module>
main()
File "Q2Solution.py", line 218, in main
res = answerQ2(drug, disease, k)
File "Q2Solution.py", line 128, in answerQ2
mesh1=False, mesh2=False)
File "/home/dkoslicki/Dropbox/Repositories/RTX/code/reasoningtool/QueryNCBIeUtils.py", line 243, in normalized_google_distance
numerator = max(math.log(ni), math.log(nj)) - math.log(nij)
TypeError: a float is required
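On Python 3.5, math.log(None) raises exactly this TypeError, so a plausible reading of the traceback is that one of the hit counts came back as None after the HTTP timeout. A minimal sketch of a guarded version of the calculation, using the standard normalized Google distance formula; safe_ngd is a hypothetical name, total (the assumed number of indexed documents) must be supplied by the caller, and the fallback value is left to the caller as well.

```python
import math


def safe_ngd(ni, nj, nij, total, fallback=math.nan):
    """Normalized Google distance that tolerates missing or zero counts.

    ni, nj: marginal hit counts; nij: joint hit count; total: assumed
    number of indexed documents. Returns fallback when any count is
    None or non-positive, or when the denominator is not positive.
    """
    if any(count is None or count <= 0 for count in (ni, nj, nij)):
        return fallback
    log_ni, log_nj = math.log(ni), math.log(nj)
    denominator = math.log(total) - min(log_ni, log_nj)
    if denominator <= 0:
        return fallback
    numerator = max(log_ni, log_nj) - math.log(nij)
    return numerator / denominator
```

Passing the maximum distance as fallback would reproduce the "put in the max Google distance" behavior without letting a None reach math.log.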
I assume the way to go is either:
https://reactome.org/ContentService/interactors/static/molecule/P35346/details
Traceback (most recent call last):
File "BuildMasterKG.py", line 371, in <module>
running_time = timeit.timeit(lambda: run_function(), number=1)
File "/usr/lib/python3.5/timeit.py", line 213, in timeit
return Timer(stmt, setup, timer, globals).timeit(number)
File "/usr/lib/python3.5/timeit.py", line 178, in timeit
timing = self.inner(it, self.timer)
File "<timeit-src>", line 6, in inner
File "BuildMasterKG.py", line 371, in <lambda>
running_time = timeit.timeit(lambda: run_function(), number=1)
File "BuildMasterKG.py", line 337, in make_master_kg
seed_and_expand_kg_q2(num_expansions=3)
File "BuildMasterKG.py", line 250, in seed_and_expand_kg_q2
bne.expand_all_nodes()
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 343, in expand_all_nodes
self.expand_node(node)
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 331, in expand_node
expand_method(self)
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/BioNetExpander.py", line 220, in expand_uniprot_protein
int_dict = QueryReactome.query_uniprot_id_to_interacting_uniprot_ids_desc(uniprot_id_str)
File "/mnt/data/orangeboard/code/NCATS/code/reasoningtool/QueryReactome.py", line 154, in query_uniprot_id_to_interacting_uniprot_ids_desc
res = QueryReactome.send_query_get('interactors/static/molecule', uniprot_id + '/details').json()
AttributeError: 'NoneType' object has no attribute 'json'
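The AttributeError indicates that send_query_get returned None (presumably after a failed request) and .json() was then called on it. A minimal sketch of a guard follows. json_or_none and query_details are hypothetical helpers, send_query_get is passed in as a callable taking the same two arguments as in the traceback, and returning an empty dict on failure is an assumption about what the caller expects.

```python
def json_or_none(response):
    """Return the decoded JSON body, or None if there is no usable response."""
    if response is None:
        return None
    try:
        return response.json()
    except ValueError:
        # Body was not valid JSON.
        return None


def query_details(send_query_get, uniprot_id):
    response = send_query_get('interactors/static/molecule', uniprot_id + '/details')
    payload = json_or_none(response)
    # Assumption: an empty dict means "no interactors found".
    return {} if payload is None else payload
```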
I am trying to improve our answers to the Q2 COP's. To do this, I need the Google distance to work on all nodes in our KG. Currently, in QueryNCBIeUtils.py, there is a way to get mesh term ID's from OMIM ID's and from disease names, but I also need to get mesh term ID's from:
I don't think I need it for phenotypes since the KG descriptions are all showing up in MeSH as far as I can tell.
The need stems from the fact that currently the Google distance doesn't work on "P23219" and "Prostaglandin G/H synthase 1" (one of its synonyms), but does work for another synonym "Cyclooxygenase-1".
I tried poking around on the uniprot API, and the only way I could see to do this is to download:
https://www.uniprot.org/uniprot/P23219.xml (or https://www.uniprot.org/uniprot/P23219.txt)
and then pull off all the names and hit them with QueryNCBIeUtils.is_mesh_term. But I assume there's a much better way to do this via API call.
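The download-and-filter approach described above could look roughly like this. It is a sketch under assumptions: it reads protein names from the DE lines of the UniProt flat-text format, the sample text is a two-line illustrative excerpt using only the names mentioned above, and is_mesh_term is passed in as a callable because the exact signature of QueryNCBIeUtils.is_mesh_term is not shown here.

```python
import re

# Illustrative excerpt in UniProt flat-text style (not a full entry).
SAMPLE_FLAT_TEXT = """\
DE   RecName: Full=Prostaglandin G/H synthase 1;
DE   AltName: Full=Cyclooxygenase-1;
"""


def protein_names_from_flat_text(flat_text):
    """Collect Full= and Short= names from the DE (description) lines."""
    names = []
    for line in flat_text.splitlines():
        if not line.startswith("DE "):
            continue
        for match in re.finditer(r"(?:Full|Short)=([^;{]+)", line):
            names.append(match.group(1).strip())
    return names


def mesh_synonyms(flat_text, is_mesh_term):
    """Keep only the names that the supplied predicate accepts."""
    return [name for name in protein_names_from_flat_text(flat_text) if is_mesh_term(name)]


names = protein_names_from_flat_text(SAMPLE_FLAT_TEXT)
```

In real use the predicate would call the MeSH lookup, and the first accepted synonym could serve as the term for the Google distance.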
From @dkoslicki on November 26, 2017 23:9
Since I don't have any "ground truth" for these, those with more bio knowledge, please make sure they look plausible/are correct.
You can run it yourself on all Q2 drugs/diseases via:
git pull
python3 Q2Solution.py -a
Copied from original issue: dkoslicki/NCATS#107
The new QuestionTranslator gets wedged on this question:
What is the clinical outcome pathway of dicumarol for treatment of coagulation?
n.name and n.description are switched for pharos_drugs, i.e., we want the CHEMBL ID in the name and the common name in the description (as it is with the other nodes).
Hi Yao & Zheng,
You know how each of the “QueryXXXXX.py” modules has test code? Can you update all these modules so that the test methods can be run (all at once) as a set of regression tests, perhaps using the “test.regrtest” or “unittest” package?
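With unittest, each module's test code becomes a TestCase class and a loader collects them into one suite. A minimal sketch, where QueryExampleTestCase and its placeholder assertion are hypothetical stand-ins for a real QueryXXXXX test:

```python
import unittest


class QueryExampleTestCase(unittest.TestCase):
    def test_returns_a_set(self):
        # Stand-in for a call into one of the QueryXXXXX modules.
        result = {"placeholder"}
        self.assertIsInstance(result, set)


def run_all_tests():
    """Load the test cases into one suite and run them together."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(QueryExampleTestCase))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

If the test cases live in their own files, running python3 -m unittest discover with the -s (start directory) and -p (file name pattern) options picks them all up without a hand-written suite.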
read the long thread about it here:
dkoslicki/NCATS#88
QueryNCBIeUtils is printing out a list of ints, causing problems with the JSON parsing of QXSolution.