MSRCall

MSRCall: A Multi-scale Deep Neural Network to Basecall Oxford Nanopore Sequences

Yang-Ming Yeh and Yi-Chang Lu

Preparation

Data folder

You can put your own test reads (in fast5 format) in the dataset folder and update the path in the script run_0_preprocee_testsets.sh accordingly.
In our paper, we used the dataset from Ryan Wick et al., which can be downloaded from https://doi.org/10.26180/5c5a5fa08bbee

Required packages

Our code is tested on CUDA 10.0 and cuDNN 7.6.

Please install ctcdecode from: https://github.com/parlance/ctcdecode.git
Then install the required Python packages:

pip install -r requirement.txt

Optional software: minimap2

Run

Preprocess test data

bash ./run_0_preprocee_testsets.sh

The preprocessed .npy files are written to the preprocessed_test directory, the same path that the basecalling command reads from.
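Before basecalling, it can help to confirm that preprocessing actually produced output. The following is a minimal sketch, not part of MSRCall: the helper name `count_npy_files` is hypothetical, and it assumes the preprocessed output directory holds one subdirectory per sample with the .npy files somewhere beneath it.

```python
from pathlib import Path


def count_npy_files(preprocessed_root):
    """Return {sample directory name: number of .npy files found beneath it}."""
    root = Path(preprocessed_root)
    counts = {}
    for sample_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # rglob also counts files in nested subfolders, because the exact
        # layout inside each sample directory is an assumption here.
        counts[sample_dir.name] = sum(1 for _ in sample_dir.rglob("*.npy"))
    return counts
```

Calling `count_npy_files(...)` on the preprocessed output directory and looking for samples with a count of zero is a quick way to spot a wrong input path in the preprocessing script.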

Run basecalling

python call.py -model exp_backup/MSRCall/MSRCall.chkpt -records_dir preprocessed_test/Acinetobacter_pittii_16_377_0801/ -output MSRCall_out

You can change the test data by replacing Acinetobacter_pittii_16_377_0801 with the name of your own preprocessed directory.
Basecalled results are stored in the MSRCall_out folder.
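To basecall several test sets in one go, the command above can be generated once per sample directory. This is a sketch under assumptions: `build_call_commands` is a hypothetical helper, the per-sample output folder is our own choice rather than something call.py requires, and only the `-model`, `-records_dir` and `-output` flags shown above are used.

```python
import sys
from pathlib import Path


def build_call_commands(records_root, model_path, output_root):
    """Build one call.py argument list per sample directory under records_root."""
    commands = []
    for sample_dir in sorted(p for p in Path(records_root).iterdir() if p.is_dir()):
        commands.append([
            sys.executable, "call.py",
            "-model", str(model_path),
            # Trailing slash mirrors the example command in this README.
            "-records_dir", str(sample_dir) + "/",
            # Assumption: each sample gets its own output folder.
            "-output", str(Path(output_root) / sample_dir.name),
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, for example with `build_call_commands("preprocessed_test", "exp_backup/MSRCall/MSRCall.chkpt", "MSRCall_out")`.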

Train

Reproduction

You can reproduce the training process by running:

python MSRCall_train.py -save_model ${modelName} -as ${train_set_dir}/signals/ -al ${train_set_dir}/labels/ -es ${val_set_dir}/signals/ -el ${val_set_dir}/labels/
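The training command expects a signals/ and a labels/ folder under both the training and the validation set directories. The sketch below checks for those folders before assembling the same argument list. `build_train_command` is a hypothetical helper, not part of the repository, and it uses only the flags shown in the command above.

```python
import sys
from pathlib import Path


def build_train_command(model_name, train_set_dir, val_set_dir):
    """Return the MSRCall_train.py argument list; raise if a folder is missing."""
    args = [sys.executable, "MSRCall_train.py", "-save_model", model_name]
    for flag, base, sub in [
        ("-as", train_set_dir, "signals"),
        ("-al", train_set_dir, "labels"),
        ("-es", val_set_dir, "signals"),
        ("-el", val_set_dir, "labels"),
    ]:
        folder = Path(base) / sub
        if not folder.is_dir():
            raise FileNotFoundError(f"expected directory not found: {folder}")
        # Trailing slash mirrors the shell command in this README.
        args += [flag, str(folder) + "/"]
    return args
```

The returned list can be run with `subprocess.run(args, check=True)` from the repository root. Failing early on a missing folder avoids starting a long training job with a mistyped path.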

To generate the training and validation sets, we follow the same procedures used in SACall. Please refer to the scripts for the training set and the scripts for the validation set in SACall.

References:

Wick, R. R., Judd, L. M., & Holt, K. E. (2019). Performance of neural network basecalling tools for Oxford Nanopore sequencing. Genome biology, 20(1), 1-10.
Huang, N., Nie, F., Ni, P., Luo, F., & Wang, J. (2022). SACall: A Neural Network Basecaller for Oxford Nanopore Sequencing Data Based on Self-Attention Mechanism. IEEE/ACM transactions on computational biology and bioinformatics, 19(1), 614–623.
