Code Monkey home page Code Monkey logo

prismnet's Introduction

PrismNet

This is a PyTorch implementation of our paper:

Predicting dynamic cellular protein-RNA interactions using deep learning and in vivo RNA structure

Lei Sun*, Kui Xu*, Wenze Huang*, Yucheng T. Yang*, Pan Li, Lei Tang, Tuanlin Xiong, Qiangfeng Cliff Zhang

*: indicates equal contribution.

Cell Research Version: (https://www.nature.com/articles/s41422-021-00476-y)

bioRxiv preprint: (https://www.biorxiv.org/content/10.1101/2020.05.05.078774v1)

prismnet

Table of Contents

Getting started

Requirements

  • Python 3.6
  • PyTorch 1.1.0, with NVIDIA CUDA Support
  • pip

Installation

Clone repository:

git clone https://github.com/kuixu/PrismNet.git

Install packages:

cd PrismNet
pip install -r requirements.txt
pip install -e .

Datasets

Prepare the datasets

Scripts and pipeline are in preparing, currently, we provide 172 samples data in *.tsv format for training and testing PrismNet.

# Download data
cd PrismNet/data
wget https://zhanglabnet.oss-cn-beijing.aliyuncs.com/prismnet/data/clip_data.tgz
tar zxvf clip_data.tgz

# Generate training and validation set for binary classification
cd PrismNet
tools/gdata_bin.sh

Usage

Network Architecture

prismnet

Training

To train one single protein model from scratch, run

exp/EXP_NAME/train.sh pu PrismNet TIA1_Hela clip_data 

where you replace TIA1_Hela with the name of the data file you want to use, you replace EXP_NAME with a specific name of this experiment. Hyper-parameters could be tuned in exp/prismnet/train.sh. For available training options, please take a look at tools/train.py.

To monitor the training process, add option -tfboard in exp/prismnet/train.sh, and view page at http://localhost:6006 using tensorboard:

tensorboard --logdir exp/EXP_NAME/out/tfb

To train all the protein models, run

exp/EXP_NAME/train_all.sh

Evaluation

For evaluation of the models, we provide the script eval.sh. You can run it using

exp/prismnet/eval.sh TIA1_Hela clip_data 

Inference

For inference data (the same format as the *.tsv file used in Datasets) using the trained models, we provide the script infer.sh. You can run it using

exp/prismnet/infer.sh TIA1_Hela /path/to/inference_file.tsv

Compute High Attention Regions

For computing high attention regions using the trained models, we provide the script har.sh. You can run it using

exp/prismnet/har.sh TIA1_Hela /path/to/inference_file.tsv

Compute Saliency

For computing saliency using the trained models, we provide the script saliency.sh. You can run it using

exp/prismnet/saliency.sh TIA1_Hela /path/to/inference_file.tsv

Plot Saliency Image

For plotting saliency image using the trained models, we provide the script saliencyimg.sh. You can run it using

exp/prismnet/saliencyimg.sh TIA1_Hela /path/to/inference_file.tsv 

Motif Construction

For the construction and analysis of integrative motifs, Users can use the scripts in motif_construct/

perl saliency_motif.pl infile.txt sal outfile
Rscript motif_sig.R outfile_motif_summary.txt outfile_motif_sig.txt

Integrative motif

The integrative motif could be downloaded at here.

Half Life Analysis (Example)

Download half life data

cd PrismNet/data
wget http://prismnet.zhanglab.net/data/halflife_data.tgz
tar zxvf halflife_data.tgz

Requirements

pip install xgboost==1.3.0rc1 matplotlib scipy scikit-learn termplotlib

Run Example

exp/logistic_reg/run.sh

Dataset and Results Visualization

We also provide a website http://prismnet.zhanglab.net/ to visualize the icSHAPE date and the results.

Copyright and License

This project is free to use for non-commercial purposes - see the LICENSE file for details.

Reference

@article {Sun2021cr,
	title = {Predicting dynamic cellular protein-RNA interactions using deep learning and in vivo RNA structure},
	author = {Sun, Lei and Xu, Kui and Huang, Wenze and Yang, Yucheng T. and Li, Pan and Tang, Lei and Xiong, Tuanlin and Zhang, Qiangfeng Cliff},
	year = {2021},
	doi = {https://doi.org/10.1038/s41422-021-00476-y},
	journal = {Cell Research}
}
@article {Sun2021cell,
	title = {In vivo structural characterization of the whole SARS-CoV-2 RNA genome identifies host cell target proteins vulnerable to re-purposed drugs},
	author = {Sun, Lei and Li, Pan and Ju, Xiaohui and Rao, Jian and Huang, Wenze and Zhang, Shaojun and Xiong, Tuanlin and Xu, Kui and Zhou, Xiaolin and Ren, Lili and Ding, Qiang and Wang, Jianwei and Zhang, Qiangfeng Cliff},
	year = {2021},
	doi = {https://doi.org/10.1016/j.cell.2021.02.008},
	journal = {Cell}
}

prismnet's People

Contributors

kuixu avatar huangwenze avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.