wikipathways / biothings_explorer_pfocr_prioritization

PFOCR for prioritization/clustering of BioThings Explorer (BTE) TRAPI results

License: MIT License

Jupyter Notebook 100.00%

biothings_explorer_pfocr_prioritization's Introduction

Current Repository for the WikiPathways Web Site

This repository contains the code and development history of the main web site for the WikiPathways project: wikipathways.org. Built upon MediaWiki, the site includes numerous custom extensions, JavaScript, skins and hacks.

How to Contribute

For all potential Contributors
The project Roadmap
Our community Code of Conduct
Our organization SOP

Installation

We do not recommend attempting to install this site code as-is. There are many parts and services required that are not included here. Contact one of the architects for more details.

If you do attempt to install, note instructions in the README.md files in each subdirectory of wpi/extensions/, e.g., GPMLConverter and Pathways.

Contributing Pathway Content

If you are interested in adding or editing pathway diagrams, check out these resources:

Old Repo: http://svn.bigcat.unimaas.nl/wikipathways/

biothings_explorer_pfocr_prioritization's Issues

Common names (instead of CURIEs) in heatmaps

The heatmaps of result CURIEs by pathway figure show NCBI Gene ids or MESH ids. These are harder for the user to remember. Showing gene symbols, chemical names and disease names instead of CURIEs would be more useful.

Suggested by @khanspers.

"Require specified query nodes in pathway results" checkbox not working

This issue was reported by @khanspers on Slack. Below are details of the root cause analysis.

Issue:
For the same results, running the notebook twice, first with Require specified query nodes in pathway results set to true and then with it set to false, returns the same figures. Is this expected? In some cases, maybe there just aren't any figures that include ONLY the BTE results?

Root cause:
The variable required_curies is not reset in the same session of the notebook.

Steps to replicate:

Run the PET notebook with Require specified query nodes in pathway results as True. The required_curies variable now stores the list of the query nodes with ids.
Run the notebook again in the same session (with no reload or re-opening of the notebook) with Require specified query nodes in pathway results as False. The required_curies variable still holds the node ids from the previous run, so the same results are generated.

Proposed resolution:
Set required_curies = set() in the Iterative Enrichment of TRAPI Results Using PFOCR Pathway Figures section of the PET notebook before checking for the user-defined value of Require specified query nodes in pathway results.
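
A minimal sketch of the proposed reset, assuming the checkbox value arrives as a boolean and the query node ids as a list. The function name build_required_curies, its parameters, and the example id are hypothetical and are not taken from the notebook.

```python
def build_required_curies(require_query_nodes, query_node_ids):
    """Return the CURIEs that every reported pathway figure must contain.

    Hypothetical helper: the names and signature are illustrative only.
    """
    # Start from an empty set on every run so that values from an earlier
    # run in the same notebook session cannot leak into this one.
    required_curies = set()
    if require_query_nodes:
        required_curies.update(query_node_ids)
    return required_curies


# Two consecutive runs in the same session, with a placeholder node id.
first_run = build_required_curies(True, ["EXAMPLE:1"])
second_run = build_required_curies(False, ["EXAMPLE:1"])
```

Because the set is rebuilt on each call, the second run ends up with no required CURIEs regardless of what the first run stored.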

Alternative to iterative enrichment-with-exclusion

A common solution to the redundancy issue with enrichment results is to filter by Jaccard similarity index. Algorithm:

Calculate enrichment results normally
Decide on max Jaccard threshold for filtering (e.g., 0.5)
Calculate the Jaccard index* between the top ranked pathway and the next pathway in the ranked list of results
If the index is greater than the threshold (i.e., too similar), discard the next pathway; otherwise, keep it.
Continue to evaluate subsequent pathways against each of the retained pathways until you reach desired number of results (n) or run out of pathways.

* Jaccard index formula: count of entities in intersection / count of entities in union
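
The steps above can be sketched as follows. This is an illustration under stated assumptions, not the notebook's implementation: the function names, the (pathway_id, entities) input shape, and the toy data are hypothetical, and a pathway whose index exactly equals the threshold is kept here.

```python
def jaccard_index(a, b):
    """Count of entities in the intersection divided by count in the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty sets: treat as not similar
    return len(a & b) / len(union)


def filter_by_jaccard(ranked_pathways, threshold=0.5, n=None):
    """Keep pathways that are not too similar to any already retained one.

    ranked_pathways: list of (pathway_id, entities), best ranked first.
    n: optional maximum number of pathways to return.
    """
    retained = []
    for pathway_id, entities in ranked_pathways:
        if n is not None and len(retained) >= n:
            break
        entity_set = set(entities)
        # Compare against every retained pathway, not only the top one.
        too_similar = any(
            jaccard_index(entity_set, kept) > threshold for _, kept in retained
        )
        if not too_similar:
            retained.append((pathway_id, entity_set))
    return [pathway_id for pathway_id, _ in retained]


# Toy ranked list with made-up ids and entities.
ranked = [
    ("fig1", {"A", "B", "C", "D"}),
    ("fig2", {"A", "B", "C", "E"}),  # index with fig1 is 3/5, above 0.5
    ("fig3", {"A", "F", "G"}),       # index with fig1 is 1/6
    ("fig4", {"H", "I"}),            # no overlap with fig1 or fig3
]
kept_ids = filter_by_jaccard(ranked, threshold=0.5)
```

The top-ranked pathway is always retained, since there is nothing to compare it against, so the ranking from the enrichment step decides which member of a redundant group survives.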

Required CURIE missing from top hit

https://arax.ncats.io/api/arax/v1.3/response/8a8007f6-d02d-4578-81e3-df041ef5541b

All of the results should contain the MESH id for Alzheimer's, but the first one definitely does not, at least not by eye or according to its PFOCR web page: https://pfocr.wikipathways.org/figures/PMC5541263__onc2016467f8.html.

First, double-check the Dropbox PFOCR files that are actually used. Maybe it is in there?

Next, check the enrichment code block to see if it is being erroneously inserted somehow.

Not clear what the input TRAPI results URL should look like

It is not clear that a JSON TRAPI result URL is expected by the notebook as input.
Suggested solutions:

In the user input widget, "TRAPI Result URL" field could be renamed to "TRAPI Result URL (JSON)"
At the top of the notebook, state explicitly that a JSON TRAPI results URL is needed as input. This is not clear from the current description.
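
In addition to the wording changes, the notebook could check the downloaded content before using it. The sketch below assumes the expected input is a TRAPI Response object, i.e. JSON with a top-level "message" object that holds a "results" list. The function name looks_like_trapi_response is hypothetical and is not part of the notebook.

```python
import json


def looks_like_trapi_response(text):
    """Return True if text parses as JSON and has a message.results list.

    Hypothetical helper: a shape check only, not a full TRAPI validation.
    """
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return False  # e.g. an HTML results page was supplied instead of JSON
    if not isinstance(payload, dict):
        return False
    message = payload.get("message")
    return isinstance(message, dict) and isinstance(message.get("results"), list)


# Placeholder inputs: an HTML page and a minimal JSON body.
html_ok = looks_like_trapi_response("<html><body>Results</body></html>")
json_ok = looks_like_trapi_response('{"message": {"results": []}}')
```

A failed check is a natural place to show the user a message saying that a JSON TRAPI result URL is required.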

Reported by @khanspers

Query examples to test

Run the notebook on more queries for further testing. Examples:

Original example results from #538:

Imatinib - [Gene] - [Gene] - Asthma: https://arax.ncats.io/api/arax/v1.3/response/49d80ecb-7fd9-4ee6-a642-6d7994903f04 (41 MB, 2862 results)
Imatinib - [Gene] - Asthma: https://arax.ncats.io/api/arax/v1.3/response/7b14f961-9066-41f7-9e3b-d76b2b4a7fac (83 kB, 7 results)

Liver injury query from Question of the Month
Alzheimer's disease query from Question of the Month

[Gene] - (related to) - Alzheimer disease: https://arax.ncats.io/api/arax/v1.3/response/8a8007f6-d02d-4578-81e3-df041ef5541b (35.4 MB, 5966 results)

More recent TRAPI results cause problems in PET notebook