suihuanglab / improving-agent
An Autonomous Reasoning Agent in the NCATS Biomedical Translator Network that uses EHR and multi-omic cohorts to rank results from user queries
License: MIT License
Write tests to cover existing code and define a framework and standards that PRs must meet in re: testing
Per @cbizon
Please move production to ITRB prod asap
Also we will want ITRB testing and staging soon
Surfaced here: NCATSTranslator/minihackathons#41
Earlier, when we wanted to expose as many results as possible, we wrote our query logic to treat all edges as undirected. This was rarely an issue given a predicate and two different node types; however, this query shows a case in which edge direction matters for two nodes of the same type.
Supporting this requires some additional query logic to be written in the TRAPI to Cypher translation. Alternatively, we could leave that logic alone and filter results that don't match the intended edge direction. The former case would be cleaner and more computationally efficient, but could lead to issues where Biolink modeling does not agree with SPOKE modeling. The latter may be the simpler, though less efficient approach.
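The second option (post-filtering) could look roughly like the sketch below. The dictionary shapes and the names `edge_matches_direction` and `filter_by_direction` are assumptions made for illustration; they are not the agent's actual data model or code.

```python
from typing import Dict, List


def edge_matches_direction(qedge: Dict[str, str],
                           kedge: Dict[str, str],
                           node_bindings: Dict[str, str]) -> bool:
    """Return True if the knowledge-graph edge points the same way as the query edge.

    qedge         -- hypothetical query edge: {"subject": qnode_key, "object": qnode_key}
    kedge         -- hypothetical result edge: {"subject": node_id, "object": node_id}
    node_bindings -- hypothetical mapping of qnode_key -> bound node_id
    """
    return (
        kedge["subject"] == node_bindings[qedge["subject"]]
        and kedge["object"] == node_bindings[qedge["object"]]
    )


def filter_by_direction(qedge: Dict[str, str],
                        candidates: List[dict]) -> List[dict]:
    """Keep only the candidates whose edge direction agrees with the query edge."""
    return [
        c for c in candidates
        if edge_matches_direction(qedge, c["kedge"], c["node_bindings"])
    ]


# Two gene nodes bound the same way, with edges running in opposite directions.
qedge = {"subject": "n0", "object": "n1"}
bindings = {"n0": "NCBIGene:1", "n1": "NCBIGene:2"}
candidates = [
    {"kedge": {"subject": "NCBIGene:1", "object": "NCBIGene:2"}, "node_bindings": bindings},
    {"kedge": {"subject": "NCBIGene:2", "object": "NCBIGene:1"}, "node_bindings": bindings},
]
kept = filter_by_direction(qedge, candidates)
```

Only the first candidate survives. The cost is that every undirected match is still fetched from the database before being discarded, which is the inefficiency noted above.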
Currently, we only accept CURIEs that are native to SPOKE. We need to set up our client to request from SRI resources so that we can resolve and map CURIEs from other databases.
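Once equivalent identifiers have been fetched for an incoming CURIE, the mapping step might be sketched as below. The prefix list, the function name `to_native_curie`, and the example identifiers are all hypothetical; the real list of SPOKE-native prefixes and the shape of the normalization response would need to be checked against the actual services.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical, ordered list of prefixes treated as "native"; earlier entries win.
SPOKE_NATIVE_PREFIXES: Tuple[str, ...] = ("NCBIGene", "DOID", "CHEMBL.COMPOUND")


def to_native_curie(equivalents: Iterable[str],
                    native_prefixes: Tuple[str, ...] = SPOKE_NATIVE_PREFIXES
                    ) -> Optional[str]:
    """Pick the first equivalent CURIE whose prefix is native, or None if there is none."""
    equivalents = list(equivalents)
    for prefix in native_prefixes:
        for curie in equivalents:
            # Split on the first colon only, so local IDs containing ":" stay intact.
            if curie.split(":", 1)[0] == prefix:
                return curie
    return None


# Illustrative identifiers only; assumed to have been returned as equivalents.
native = to_native_curie(["MONDO:0000001", "DOID:4"])
missing = to_native_curie(["MONDO:0000001"])
```

Returning `None` rather than raising leaves it to the caller to decide whether an unmappable CURIE should fail the query or simply be skipped.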
As surfaced in NCATSTranslator/minihackathons#11, the imProving Agent does not support qNodes that have qNode IDs but no categories.
This was an intentional design decision that was used for retrieving relevant node-specific identifier regexes to prevent Cypher injection and to skip querying the node normalizer if the identifier is recognized. However, this could be updated to always query the node normalizer regardless and the resulting node category from the node normalizer would allow the retrieval of the relevant regex.
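The proposed flow (obtain categories from the node normalizer, then use them to look up the identifier regex) could be sketched as follows. The pattern table and the function name `validate_identifier` are illustrative assumptions, not the agent's real regexes, and the categories are passed in directly rather than fetched from the service.

```python
import re
from typing import Dict, Iterable, Pattern

# Hypothetical per-category identifier patterns used to reject unsafe input
# before it is interpolated into a Cypher query.
IDENTIFIER_PATTERNS: Dict[str, Pattern[str]] = {
    "biolink:Gene": re.compile(r"NCBIGene:\d+"),
    "biolink:Disease": re.compile(r"(DOID|MONDO):\d+"),
}


def validate_identifier(curie: str, categories: Iterable[str]) -> str:
    """Return the first category whose pattern fully matches the CURIE.

    `categories` stands in for whatever the node normalizer reports for this
    CURIE. Raises ValueError if no known pattern matches.
    """
    for category in categories:
        pattern = IDENTIFIER_PATTERNS.get(category)
        if pattern is not None and pattern.fullmatch(curie):
            return category
    raise ValueError(f"identifier {curie!r} does not match any known pattern")


category = validate_identifier("NCBIGene:154", ["biolink:Gene"])

try:
    validate_identifier("NCBIGene:154' OR 1=1", ["biolink:Gene"])
    rejected = False
except ValueError:
    rejected = True
```

Using `fullmatch` rather than `search` matters here: a partial match would let trailing injected text through.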
The ARS has noticed that the imProving Agent is sending results with a score value of 0.
This is the parent PK where we encountered the issue:
https://ars.test.transltr.io/ars/api/messages/d3a8e46d-56c4-42bd-a76e-d8a073fc5b88?trace=y
If I query: What disease { cond_assoc_w_gene | gene_assoc_w_cond | genetic_association } with gene MUC5B (727897), then all ARA's return an important answer: IPF, Mondo ID 0008345. However, ImProving returns a level below IPF, which is ILD2, Mondo ID 0800029. Is this as expected? It cannot be de-duped with all the other ARA results of IPF.
Improving Agent does not currently respond to queries sent through the ARS. I believe this may be an issue related to TRAPI versioning
https://ars.transltr.io/ars/api/messages/9d330568-6e65-4e23-8465-b66fb321c955
I am getting SSL certificate errors when sending queries to https://spokekp.healthdatascience.cloud/api/v1.4/query
From Sarah:
it looks up in Test, i see scores and subgraphs for MVP2/genes but I'm not seeing scores for MVP1, is that expected?
here is the MVP1/Disease 28af66c5-0292-4d9a-9273-f92938ffe052
Bring imProving Agent to TRAPI 1.0 compliance
The COHD KP offers easily ingestible data on clinical co-occurrence for concepts in the EHR. The data are encoded with OMOP, which should make for an easy mapping to SPOKE with existing code or the SRI node-normalization service.
fully Dockerize the (im)PROVE agent
-- SPOKE is currently running in Docker, but we need to get nginx, uwsgi, and the actual evidARA Python code into images as well
decide and implement a CI/CD framework
-- evidARA is currently running in a VPC in AWS, so the most likely choice is alongside other services in CodePipeline
-- this could likely exist in the free tier of Github's Actions or CircleCI, so those options will also be evaluated
As surfaced here: NCATSTranslator/minihackathons#71 (comment) we do not query data from the valuable and unique ICEES APIs that provide exposures and other EHR data that could be integrated into our results and ranking algorithm.
Beyond that, there are many KPs that we do not interact with. This issue will track progress on further integration.
The current value (as of the morning of August 4th, 2022) of the info.x-trapi.test_data_location property for the improving-agent ARA entry in the Translator SmartAPI Registry is set to a non-existent GitHub target. Please set the URL to a single JSON file resource, preferably at a https://raw.githubusercontent.com/ endpoint if it is hosted on GitHub. It can actually be anywhere on the internet, as long as it is a REST-accessible JSON file resource.
From Mark:
We're seeing results come back like this:
"node_bindings": {
  "on": [
    {
      "id": "NCBIGene:154",
      "query_id": null,
      "attributes": null
    }
  ],
  "sn": [
    {
      "id": "PUBCHEM.COMPOUND:5311065",
      "query_id": null,
      "attributes": null
    }
  ],
  "intermediate_gene_96062": [
    {
      "id": "NCBIGene:348",
      "query_id": null,
      "attributes": null
    }
  ]
}
with the extra "intermediate_gene" node. Is this intentional? At the moment, it throws off the ARS merge behavior because we determine whether two results are the same by looking for the set of nodes in the node bindings to match. So, this one doesn't match with the results returned by other ARAs that are the same, except for lacking the intermediary node.
This is in CI with PK = 21791a56-966b-477b-9f6e-7fb3450387cf
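The merge behavior described above (two results are the same when their sets of bound nodes match) can be illustrated with a short sketch. This is not the ARS's actual code; the function name `binding_key` and the idea of ignoring keys by prefix are hypothetical, shown only to make the mismatch concrete.

```python
from typing import Dict, FrozenSet, List, Tuple


def binding_key(node_bindings: Dict[str, List[dict]],
                ignore_prefixes: Tuple[str, ...] = ()) -> FrozenSet[str]:
    """Collapse a result's node bindings into the set of bound node IDs.

    Query-node keys starting with any of `ignore_prefixes` are skipped.
    """
    return frozenset(
        binding["id"]
        for qnode_key, bindings in node_bindings.items()
        if not qnode_key.startswith(ignore_prefixes)
        for binding in bindings
    )


# Result as returned above, including the extra intermediate node.
with_intermediate = {
    "on": [{"id": "NCBIGene:154"}],
    "sn": [{"id": "PUBCHEM.COMPOUND:5311065"}],
    "intermediate_gene_96062": [{"id": "NCBIGene:348"}],
}
# The same answer as another ARA might return it, without the intermediate node.
without_intermediate = {
    "on": [{"id": "NCBIGene:154"}],
    "sn": [{"id": "PUBCHEM.COMPOUND:5311065"}],
}

same_by_default = binding_key(with_intermediate) == binding_key(without_intermediate)
same_when_ignored = (
    binding_key(with_intermediate, ignore_prefixes=("intermediate_",))
    == binding_key(without_intermediate)
)
```

With the plain set comparison the two results differ, so they are not merged. They only coincide if the intermediate binding is left out, either by the comparison or by the agent not emitting it in `node_bindings` in the first place.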