suihuanglab / improving-agent


An Autonomous Reasoning Agent in the NCATS Biomedical Translator Network that uses EHR and multi-omic cohorts to rank results from user queries

License: MIT License

Dockerfile 0.08% Python 90.61% CSS 0.34% JavaScript 5.46% HTML 3.44% Shell 0.07%

ncats-translator

improving-agent's Issues

tests

Write tests to cover existing code, and define a testing framework and the standards that PRs must meet.

Move to ITRB

Per @cbizon

Please move production to ITRB prod asap
Also we will want ITRB testing and staging soon

Support for directed edges

Surfaced here: NCATSTranslator/minihackathons#41

Earlier, when we wanted to expose as many results as possible, we wrote our query logic to make all edges undirected. This usually wasn't an issue given a predicate and two different node types; however, this query shows a case in which the edge direction matters because both nodes are of the same type.

Supporting this requires some additional query logic to be written in the TRAPI to Cypher translation. Alternatively, we could leave that logic alone and filter out results that don't match the intended edge direction. The former would be cleaner and more computationally efficient, but could lead to issues where Biolink modeling does not agree with SPOKE modeling. The latter may be the simpler, though less efficient, approach.
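A rough sketch of both options, for illustration only. It assumes matched edges are available as dicts with `subject` and `object` keys, as in TRAPI knowledge graph edges; `edge_pattern`, `matches_direction`, and the sample CURIEs are hypothetical and are not taken from this repository.

```python
def edge_pattern(subject_key, edge_key, object_key, directed):
    """Option 1: emit a directed or undirected Cypher relationship pattern."""
    arrow = "->" if directed else "-"
    return f"({subject_key})-[{edge_key}]{arrow}({object_key})"


def matches_direction(qedge, kedge, node_bindings):
    """Option 2: keep an undirected match only if it runs the intended way.

    qedge: {"subject": <qnode key>, "object": <qnode key>}
    kedge: {"subject": <CURIE>, "object": <CURIE>}
    node_bindings: {<qnode key>: set of CURIEs bound to that qnode}
    """
    return (
        kedge["subject"] in node_bindings[qedge["subject"]]
        and kedge["object"] in node_bindings[qedge["object"]]
    )


# Made-up example: both nodes are genes, so only the direction
# distinguishes the intended edge from its mirror image.
qedge = {"subject": "n0", "object": "n1"}
bindings = {"n0": {"NCBIGene:1"}, "n1": {"NCBIGene:2"}}
forward = {"subject": "NCBIGene:1", "object": "NCBIGene:2"}
reverse = {"subject": "NCBIGene:2", "object": "NCBIGene:1"}

kept = [e for e in (forward, reverse) if matches_direction(qedge, e, bindings)]
```

The post-filter leaves the existing translation untouched, at the cost of fetching edges that are then discarded.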

Node normalization

Currently, we only accept CURIEs that are native to SPOKE. We need to set up our client to request SRI resources so that we can resolve and map CURIEs from other databases.
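One way the mapping step might look, as a sketch under stated assumptions. It assumes the normalizer returns, per input CURIE, an entry with an `equivalent_identifiers` list of `{"identifier": ...}` objects (or nothing for an unknown CURIE). The function name, the set of SPOKE-native prefixes, and the sample identifiers are all hypothetical.

```python
def pick_native_curie(normalized_entry, native_prefixes):
    """Return the first equivalent identifier whose prefix SPOKE accepts.

    normalized_entry: one per-CURIE entry from a normalizer response
        (assumed shape), or None when the CURIE was not recognized.
    native_prefixes: set of CURIE prefixes treated as SPOKE-native
        (hypothetical; the real list lives in the agent's configuration).
    """
    if not normalized_entry:
        return None
    for equivalent in normalized_entry.get("equivalent_identifiers", []):
        curie = equivalent["identifier"]
        prefix = curie.split(":", 1)[0]
        if prefix in native_prefixes:
            return curie
    return None


# Made-up entry; the identifiers are placeholders, not real mappings.
entry = {
    "id": {"identifier": "MONDO:0000000"},
    "equivalent_identifiers": [
        {"identifier": "MONDO:0000000"},
        {"identifier": "DOID:0000000"},
    ],
}
native = pick_native_curie(entry, {"DOID"})
```

Keeping the selection separate from the HTTP request makes it straightforward to test without network access.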

Support qNodes with ids but no categories

As surfaced in NCATSTranslator/minihackathons#11, the imProving Agent does not support qNodes that have IDs but no categories.

This was an intentional design decision that was used for retrieving relevant node-specific identifier regexes to prevent Cypher injection and to skip querying the node normalizer if the identifier is recognized. However, this could be updated to always query the node normalizer regardless and the resulting node category from the node normalizer would allow the retrieval of the relevant regex.
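The proposed flow could be sketched roughly as follows. The pattern table, the function name, and the `lookup_category` callable (standing in for a node normalizer request) are assumptions made for illustration; the agent's actual regexes are not shown in this issue.

```python
import re

# Hypothetical patterns keyed by Biolink category.
IDENTIFIER_PATTERNS = {
    "biolink:Gene": re.compile(r"NCBIGene:\d+"),
    "biolink:Disease": re.compile(r"(DOID|MONDO):\d+"),
}


def validate_qnode_id(curie, categories, lookup_category):
    """Return the category whose pattern the CURIE matches, or raise.

    When the qNode carries no categories, ask `lookup_category` (a stand-in
    for the node normalizer) for them, then validate as before.
    """
    if not categories:
        categories = lookup_category(curie)
    for category in categories:
        pattern = IDENTIFIER_PATTERNS.get(category)
        # fullmatch rejects anything beyond the expected identifier shape,
        # so the value is safe to place into a Cypher query.
        if pattern is not None and pattern.fullmatch(curie):
            return category
    raise ValueError(f"unsupported or malformed identifier: {curie!r}")


# Stub normalizer for the example: always answers "gene".
def fake_lookup(curie):
    return ["biolink:Gene"]


category = validate_qnode_id("NCBIGene:154", None, fake_lookup)
```

The trade-off is one extra normalizer request for every qNode that arrives without a category.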

imProving returns results with a score value of 0

The ARS has noticed that imProving is sending results with a score value of 0.
This is the parent PK where we encounter the issue:
https://ars.test.transltr.io/ars/api/messages/d3a8e46d-56c4-42bd-a76e-d8a073fc5b88?trace=y

disease identifier (minor issue)

If I query: What disease { cond_assoc_w_gene | gene_assoc_w_cond | genetic_association } with gene MUC5B (727897), then all ARAs return an important answer: IPF, Mondo ID 0008345. However, imProving returns a level below IPF, which is ILD2, Mondo ID 0800029. Is this expected? It cannot be de-duplicated against the IPF results from all the other ARAs.

@brettasmi

ARS Integration

The imProving Agent does not currently respond to queries sent through the ARS. I believe this may be an issue related to TRAPI versioning.
https://ars.transltr.io/ars/api/messages/9d330568-6e65-4e23-8465-b66fb321c955

SSL Certificate is expired on dev

I am getting SSL certificate errors when sending queries to https://spokekp.healthdatascience.cloud/api/v1.4/query

scoring issue with MVP1 post KL/AT updates

From Sarah:
Looking in Test, I see scores and subgraphs for MVP2/genes, but I'm not seeing scores for MVP1. Is that expected?

Here is the MVP1/Disease: 28af66c5-0292-4d9a-9273-f92938ffe052

TRAPI 1.0

Bring imProving Agent to TRAPI 1.0 compliance

Integrate COHD

The COHD KP offers easily ingestible data on clinical co-occurrence for concepts in the EHR. The data are encoded with OMOP, which should make for an easy mapping to SPOKE with existing code or the SRI node-normalization service.

Docker and CI/CD

Fully Dockerize the (im)PROVE agent
-- SPOKE is currently running in Docker, but we need to get nginx, uwsgi, and the actual evidARA Python code into images as well
Decide on and implement a CI/CD framework
-- evidARA is currently running in a VPC in AWS, so the most likely choice is CodePipeline, alongside other services
-- this could likely fit in the free tier of GitHub Actions or CircleCI, so those options will also be evaluated

Broader KP interaction

As surfaced in NCATSTranslator/minihackathons#71 (comment), we do not query the valuable and unique ICEES APIs, which provide exposures and other EHR data that could be integrated into our results and ranking algorithm.

Beyond that, there are many KPs that we do not interact with. This issue will track progress on further integration.

Value of the info.x-trapi.test_data_location property for the improving-agent ARA entry in the Translator SmartAPI Registry

As of the morning of August 4th, 2022, the info.x-trapi.test_data_location property for the improving-agent ARA entry in the Translator SmartAPI Registry points to a non-existent GitHub target. Please set the URL to a single JSON file resource, preferably at a https://raw.githubusercontent.com/ endpoint if it is hosted on GitHub. It can in fact live anywhere on the internet, as long as it is a REST-accessible JSON file resource.

Investigate intermediate node results membership

From Mark:

We're seeing results come back like this:

"node_bindings": {
  "on": [
    {
      "id": "NCBIGene:154",
      "query_id": null,
      "attributes": null
    }
  ],
  "sn": [
    {
      "id": "PUBCHEM.COMPOUND:5311065",
      "query_id": null,
      "attributes": null
    }
  ],
  "intermediate_gene_96062": [
    {
      "id": "NCBIGene:348",
      "query_id": null,
      "attributes": null
    }
  ]
}

with the extra "intermediate_gene" node.  Is this intentional?  At the moment, it throws off the ARS merge behavior because we determine whether two results are the same by looking for the set of nodes in the node bindings to match.  So, this one doesn't match with the results returned by other ARAs that are the same, except for lacking the intermediary node.

This is in CI with PK = 21791a56-966b-477b-9f6e-7fb3450387cf
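To make the mismatch concrete, here is a minimal sketch of a merge key built from the set of node IDs in the node bindings, as described above. It is not the ARS implementation. The `result_key` function and its `skip_prefix` parameter are hypothetical, and the second result is a made-up counterpart from another ARA without the intermediary node.

```python
def result_key(result, skip_prefix=None):
    """Build a merge key from the set of node IDs in a result's node bindings.

    skip_prefix: if given, ignore bindings whose key starts with this prefix
    (a hypothetical option, shown only to illustrate the effect).
    """
    return frozenset(
        binding["id"]
        for key, bindings in result["node_bindings"].items()
        if skip_prefix is None or not key.startswith(skip_prefix)
        for binding in bindings
    )


with_intermediate = {
    "node_bindings": {
        "on": [{"id": "NCBIGene:154"}],
        "sn": [{"id": "PUBCHEM.COMPOUND:5311065"}],
        "intermediate_gene_96062": [{"id": "NCBIGene:348"}],
    }
}
without_intermediate = {
    "node_bindings": {
        "on": [{"id": "NCBIGene:154"}],
        "sn": [{"id": "PUBCHEM.COMPOUND:5311065"}],
    }
}

same_by_default = result_key(with_intermediate) == result_key(without_intermediate)
same_when_skipped = (
    result_key(with_intermediate, skip_prefix="intermediate_")
    == result_key(without_intermediate)
)
```

With the extra binding included, the two keys differ and the results are not merged; ignoring bindings that are not part of the original query graph makes them compare equal.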
