improving-agent's Issues

tests

Write tests to cover existing code, and define a testing framework and standards that PRs must meet.

Move to ITRB

Per @cbizon

Please move production to ITRB prod asap
Also we will want ITRB testing and staging soon

Support for directed edges

Surfaced here: NCATSTranslator/minihackathons#41

Earlier, when we wanted to expose as many results as possible, we wrote our query logic to make all edges undirected. This usually wasn't an issue given a predicate and two different node types; however, this query shows a case in which the edge direction matters for two nodes of the same type.

Supporting this requires some additional query logic to be written in the TRAPI to Cypher translation. Alternatively, we could leave that logic alone and filter results that don't match the intended edge direction. The former case would be cleaner and more computationally efficient, but could lead to issues where Biolink modeling does not agree with SPOKE modeling. The latter may be the simpler, though less efficient approach.
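The two options above can be sketched roughly as follows. This is a minimal illustration, not the agent's actual translation code: `cypher_edge_pattern`, `filter_by_direction`, the `REGULATES` relationship type, and the edge dictionary shape are all hypothetical.

```python
from typing import Dict, List


def cypher_edge_pattern(subject_var: str, object_var: str,
                        rel_type: str, directed: bool = True) -> str:
    """Build a Cypher path pattern, optionally honoring edge direction.

    Hypothetical helper: names are validated as plain identifiers so that
    nothing unexpected is interpolated into the query string.
    """
    for name in (subject_var, object_var, rel_type):
        if not name.isidentifier():
            raise ValueError(f"unsafe name in pattern: {name!r}")
    arrow = "->" if directed else "-"
    return f"({subject_var})-[:{rel_type}]{arrow}({object_var})"


def filter_by_direction(edges: List[Dict[str, str]],
                        subject_id: str, object_id: str) -> List[Dict[str, str]]:
    """Alternative approach: query undirected, then drop edges whose
    stored direction does not match the direction the query asked for."""
    return [
        edge for edge in edges
        if edge["subject"] == subject_id and edge["object"] == object_id
    ]


# Option 1: direction is enforced inside the generated Cypher.
print(cypher_edge_pattern("s", "o", "REGULATES"))

# Option 2: direction is enforced after the fact on undirected matches.
undirected_hits = [
    {"subject": "A", "object": "B"},
    {"subject": "B", "object": "A"},
]
print(filter_by_direction(undirected_hits, "A", "B"))
```

The first function pushes the constraint into the database, which matches the "cleaner and more efficient" option; the second leaves the query untouched and pays for the extra rows it later discards.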

Node normalization

Currently, we only accept CURIEs that are native to SPOKE. We need to set up our client to request SRI resources that resolve and map CURIEs from other databases.
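As a rough sketch of the mapping step, the function below picks a SPOKE-native CURIE out of a normalizer-style response entry. The response shape (an `equivalent_identifiers` list of objects with an `identifier` field) is an assumption about the SRI service, and `pick_native_curie`, the prefixes, and the sample identifiers are placeholders invented for illustration. No network call is made.

```python
from typing import Iterable, Optional


def pick_native_curie(normalized_entry: Optional[dict],
                      native_prefixes: Iterable[str]) -> Optional[str]:
    """Return the first equivalent identifier whose prefix is native.

    `normalized_entry` is one value from an (assumed) normalizer response,
    or None when the service did not recognize the input CURIE.
    """
    if not normalized_entry:
        return None
    allowed = set(native_prefixes)
    for equivalent in normalized_entry.get("equivalent_identifiers", []):
        curie = equivalent.get("identifier", "")
        prefix = curie.split(":", 1)[0]
        if prefix in allowed:
            return curie
    return None


# Placeholder data, not real identifiers or a real response.
entry = {
    "equivalent_identifiers": [
        {"identifier": "FOREIGN:123"},
        {"identifier": "NATIVE:abc"},
    ]
}
print(pick_native_curie(entry, {"NATIVE"}))   # NATIVE:abc
print(pick_native_curie(None, {"NATIVE"}))    # None
```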

Support qNodes with ids but no categories

As surfaced in NCATSTranslator/minihackathons#11, the imProving Agent does not support qNodes that have IDs but no categories.

This was an intentional design decision that was used for retrieving relevant node-specific identifier regexes to prevent Cypher injection and to skip querying the node normalizer if the identifier is recognized. However, this could be updated to always query the node normalizer regardless and the resulting node category from the node normalizer would allow the retrieval of the relevant regex.
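The proposed flow could look something like the sketch below: take the category from the qNode when present, otherwise ask the normalizer, then use that category to select the identifier regex. Everything here is illustrative. `IDENTIFIER_PATTERNS`, `resolve_category`, `is_safe_identifier`, the two regexes, and the `ids`/`categories` field names are assumptions, and `normalizer_lookup` stands in for whatever client call returns a category for a CURIE.

```python
import re
from typing import Callable, Optional

# Hypothetical category-to-regex table used to reject malformed identifiers
# before they are interpolated into Cypher.
IDENTIFIER_PATTERNS = {
    "biolink:Gene": re.compile(r"NCBIGene:\d+"),
    "biolink:Disease": re.compile(r"MONDO:\d{7}"),
}


def resolve_category(qnode: dict,
                     normalizer_lookup: Callable[[str], Optional[str]]) -> Optional[str]:
    """Use the qNode's own category if given, else ask the normalizer."""
    categories = qnode.get("categories")
    if categories:
        return categories[0]
    for curie in qnode.get("ids") or []:
        category = normalizer_lookup(curie)
        if category:
            return category
    return None


def is_safe_identifier(curie: str, category: Optional[str]) -> bool:
    """Accept a CURIE only if it fully matches its category's regex."""
    pattern = IDENTIFIER_PATTERNS.get(category)
    return bool(pattern and pattern.fullmatch(curie))


def fake_lookup(curie: str) -> Optional[str]:
    # Stub standing in for a normalizer request.
    return "biolink:Gene" if curie.startswith("NCBIGene:") else None


qnode = {"ids": ["NCBIGene:154"]}  # IDs present, categories absent
category = resolve_category(qnode, fake_lookup)
print(category, is_safe_identifier("NCBIGene:154", category))
print(is_safe_identifier("NCBIGene:154' OR 1=1", category))
```

Using `fullmatch` rather than `search` matters here: a partial match would let trailing text through, which defeats the injection check.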

disease identifier (minor issue)

If I query: What disease { cond_assoc_w_gene | gene_assoc_w_cond | genetic_association } with gene MUC5B (727897), then all ARAs return an important answer: IPF, Mondo ID 0008345. However, ImProving returns a level below IPF, which is ILD2, Mondo ID 0800029. Is this expected? It cannot be de-duped with the IPF results from all the other ARAs.

@brettasmi

scoring issue with MVP1 post KL/AT updates

From Sarah:
it looks up in Test, i see scores and subgraphs for MVP2/genes but I'm not seeing scores for MVP1, is that expected?

here is the MVP1/Disease 28af66c5-0292-4d9a-9273-f92938ffe052

TRAPI 1.0

Bring imProving Agent to TRAPI 1.0 compliance

Integrate COHD

The COHD KP offers easily ingestible data on clinical co-occurrence for concepts in the EHR. The data are encoded with OMOP, which should make for an easy mapping to SPOKE with existing code or the SRI node-normalization service.

Docker and CI/CD

  • fully Dockerize the (im)PROVE agent
    -- SPOKE is currently running in Docker, but we need to get nginx, uwsgi, and the actual evidARA Python code into images as well

  • decide and implement a CI/CD framework
    -- evidARA is currently running in a VPC in AWS, so the most likely choice is alongside other services in CodePipeline
    -- this could likely exist in the free tier of Github's Actions or CircleCI, so those options will also be evaluated

Broader KP interaction

As surfaced here: NCATSTranslator/minihackathons#71 (comment) we do not query data from the valuable and unique ICEES APIs that provide exposures and other EHR data that could be integrated into our results and ranking algorithm.

Beyond that, there are many KPs that we do not interact with. This issue will track progress on further integration.

Value of the info.x-trapi.test_data_location property for the improving-agent ARA entry in the Translator SmartAPI Registry

The current value (as of the morning of August 4th, 2022) of the info.x-trapi.test_data_location property for the improving-agent ARA entry in the Translator SmartAPI Registry points to a non-existent GitHub target. Please set the URL to a single JSON file resource, preferably at a https://raw.githubusercontent.com/ endpoint if it is hosted on GitHub. It can actually be anywhere on the internet, as long as it is a REST-accessible JSON file resource.

Investigate intermediate node results membership

From Mark:

We're seeing results come back like this:

"node_bindings": {
  "on": [
    {
      "id": "NCBIGene:154",
      "query_id": null,
      "attributes": null
    }
  ],
  "sn": [
    {
      "id": "PUBCHEM.COMPOUND:5311065",
      "query_id": null,
      "attributes": null
    }
  ],
  "intermediate_gene_96062": [
    {
      "id": "NCBIGene:348",
      "query_id": null,
      "attributes": null
    }
  ]

with the extra "intermediate_gene" node. Is this intentional? At the moment, it throws off the ARS merge behavior, because we decide whether two results are the same by checking that the sets of nodes in their node bindings match. So this result doesn't match the otherwise identical results returned by other ARAs, which lack the intermediary node.

This is in CI with PK = 21791a56-966b-477b-9f6e-7fb3450387cf
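To make the mismatch concrete, here is a small sketch of a merge key built from the set of bound node IDs, as described above. It is not the ARS implementation: `node_binding_key` and `strip_unrequested_bindings` are hypothetical helpers, and dropping bindings whose key is not in the query graph is only one possible way to restore a match, not a decided fix. The identifiers come from the excerpt above.

```python
from typing import FrozenSet, Set


def node_binding_key(result: dict) -> FrozenSet[str]:
    """Merge key: the set of every node ID bound anywhere in the result."""
    return frozenset(
        binding["id"]
        for bindings in result["node_bindings"].values()
        for binding in bindings
    )


def strip_unrequested_bindings(result: dict, query_node_keys: Set[str]) -> dict:
    """Return a copy keeping only bindings for nodes in the query graph."""
    kept = {
        key: bindings
        for key, bindings in result["node_bindings"].items()
        if key in query_node_keys
    }
    return {**result, "node_bindings": kept}


ours = {
    "node_bindings": {
        "on": [{"id": "NCBIGene:154"}],
        "sn": [{"id": "PUBCHEM.COMPOUND:5311065"}],
        "intermediate_gene_96062": [{"id": "NCBIGene:348"}],
    }
}
other_ara = {
    "node_bindings": {
        "on": [{"id": "NCBIGene:154"}],
        "sn": [{"id": "PUBCHEM.COMPOUND:5311065"}],
    }
}

# The extra intermediate node changes the key, so the results do not merge.
print(node_binding_key(ours) == node_binding_key(other_ara))  # False

# Restricting to the query graph's node keys makes the keys agree again.
trimmed = strip_unrequested_bindings(ours, {"on", "sn"})
print(node_binding_key(trimmed) == node_binding_key(other_ara))  # True
```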
