MicroRNAs (miRNAs) play crucial roles in many biological processes involved in diseases, and they function together with protein-coding genes (PCGs). In this study, we present DimiG, a semi-supervised multi-label framework that integrates PCG-PCG interactions, PCG-miRNA interactions, and PCG-disease associations, and incorporates the disease hierarchy into a graph convolutional network (GCN). DimiG is trained on the resulting graph, and the trained model is then used to score associations between diseases and miRNAs.
- sklearn
- GCN
- PyTorch 0.4 or 0.5
- Python 2.7
Here we modified the original GCN implementation (https://github.com/tkipf/pygcn) to support multi-label learning.
python setup.py install
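To make "multi-label learning" concrete: in a multi-label setting each node (a PCG) can carry several disease labels at once, so a per-label sigmoid with binary cross-entropy is commonly used in place of a softmax over classes, and in the semi-supervised case the loss is computed only over nodes with known labels. The following is a minimal pure-Python sketch of that idea, not the code in this repository; the function names `sigmoid` and `masked_multilabel_bce` are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def masked_multilabel_bce(logits, targets, labeled_mask):
    """Mean binary cross-entropy over labeled nodes only.

    logits, targets: one row per node, one column per label (disease).
    labeled_mask: one boolean per node; False marks an unlabeled node,
    which contributes nothing to the loss (semi-supervised setting).
    """
    total, count = 0.0, 0
    for row_logits, row_targets, labeled in zip(logits, targets, labeled_mask):
        if not labeled:
            continue
        for z, y in zip(row_logits, row_targets):
            # Clamp the probability to avoid log(0).
            p = min(max(sigmoid(z), 1e-12), 1.0 - 1e-12)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count if count else 0.0
```

In a PyTorch model the same effect is usually obtained by applying a binary cross-entropy loss to the rows of the output selected by a training-index tensor.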
We have already uploaded some of the data used in this study to the repository under the directory data/. The other, larger files can be obtained as follows:
- PCG-PCG interaction file "9606.protein.links.v10.txt.gz" can be downloaded from the STRING v10 database.
- Disease-PCG association file "human_disease_integrated_full.tsv" can be downloaded from the DISEASES database. We have also uploaded human_disease_integrated_full.zip to this repository; please decompress it into the directory data/.
- PCG-miRNA interaction file "9606.v1.combined.tsv.gz" can be downloaded from the RAIN v1.0 database.
- Gene expression file "GTEx_Analysis_2016-01-15_v7_RNASeQCv1.1.8_gene_median_tpm.gct.gz" can be downloaded from the GTEx website.
- Gene annotation file "gencode.v19.genes.v7.patched_contigs.gtf.gz" can be downloaded from the GTEx website.
- The above five files must be saved in the directory data/.
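Before running anything, it can help to confirm that all five files are in place. The snippet below is a small convenience sketch and is not part of the repository; the names `REQUIRED_FILES` and `missing_data_files` are hypothetical, and the file names are copied from the list above (with the DISEASES file in its decompressed .tsv form).

```python
import os

# File names as listed above; all are expected directly under data/.
REQUIRED_FILES = [
    "9606.protein.links.v10.txt.gz",
    "human_disease_integrated_full.tsv",
    "9606.v1.combined.tsv.gz",
    "GTEx_Analysis_2016-01-15_v7_RNASeQCv1.1.8_gene_median_tpm.gct.gz",
    "gencode.v19.genes.v7.patched_contigs.gtf.gz",
]


def missing_data_files(data_dir="data"):
    """Return the required file names that are not present in data_dir."""
    return [
        name
        for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(data_dir, name))
    ]
```

Calling `missing_data_files()` from the repository root returns an empty list when every file is present, and otherwise the names that still need to be downloaded or decompressed.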