
jax-cnv's Introduction

JAX-CNV: clinical-grade copy number variation detector

Overview

Here we present JAX-CNV, a newly developed NGS-based algorithm for detecting copy number variations (CNVs), and its performance on whole-genome sequencing (WGS) data. We focus on deletions and duplications larger than 50 kb.

Dependencies

To run the tool, the following dependencies are required:

  • Linux/Unix 64-bit
  • gcc version 4.9 or higher
  • zlib
  • autoconf version 2.69 or higher

JAX-CNV also depends on third-party tools, which are already included in JAX-CNV/lib/.

Download and Installation

git clone --recursive https://github.com/TheJacksonLaboratory/genome_similarity.git
cd genome_similarity
make

Usage

Kmer FASTA file preparation

We use jellyfish to count 25-mers and GrabJellyfishKmer to dump a kmer FASTA file. See the jellyfish documentation for more options, such as --threads/-t (number of threads) and --Files/-F (number of files open simultaneously). -s sets the initial hash size; adjust it for your machine.

bin/jellyfish count -m 25 -s <INT> -o <FASTA>.jf [-t <INT> -F <INT>] <FASTA>
bin/JAX-CNV GrabJellyfishKmer --ascii -i <FASTA>.jf -f <FASTA> -o <FASTA>.kmer
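The two preparation steps above can also be driven from a script. The following is a minimal sketch, assuming you want to assemble the exact command lines shown above as argument lists; the function name `kmer_prep_commands` and the file name `ref.fasta` are hypothetical, and only flags that appear in this README are used.

```python
import shlex


def kmer_prep_commands(fasta, hash_size, threads=None, open_files=None):
    """Build the jellyfish and GrabJellyfishKmer command lines as argument lists."""
    jf = f"{fasta}.jf"
    # Step 1: count 25-mers; -s is the initial hash size (for example "3G").
    count = ["bin/jellyfish", "count", "-m", "25", "-s", str(hash_size), "-o", jf]
    if threads is not None:
        count += ["-t", str(threads)]
    if open_files is not None:
        count += ["-F", str(open_files)]
    count.append(fasta)
    # Step 2: dump the kmer FASTA file next to the reference.
    grab = ["bin/JAX-CNV", "GrabJellyfishKmer", "--ascii",
            "-i", jf, "-f", fasta, "-o", f"{fasta}.kmer"]
    return [count, grab]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for cmd in kmer_prep_commands("ref.fasta", "3G", threads=8):
        print(shlex.join(cmd))
```

To execute the steps rather than print them, each list can be passed to `subprocess.run(cmd, check=True)` so that a failure in the first step stops the second.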

Detect CNVs

A sorted BAM file, a FASTA file, and a kmer file are required. The results are printed to stdout unless -o is used to specify an output file.

bin/JAX-CNV GetCnvSignal -f <FASTA> -k <FASTA>.kmer -b <BAM> [-o <OUTPUT>]
Rscript --vanilla JaxCNVMerge.R -i <OUTPUT>.bed

<OUTPUT>.bed.merge.bed is the final result. JaxCNVMerge.R can also be applied to BED files from other tools.
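Since the overview targets deletions and duplications larger than 50 kb, it can be handy to filter a merged BED file by event length. The sketch below assumes the file is tab-separated with the standard BED layout (chromosome, start, end in the first three columns); the README does not document the exact columns, so treat that layout, the function name `filter_bed_by_length`, and the sample records as assumptions.

```python
def filter_bed_by_length(lines, min_length=50_000):
    """Yield BED records (as field lists) whose end - start exceeds min_length."""
    for line in lines:
        # Skip blank lines and common BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        start, end = int(fields[1]), int(fields[2])
        if end - start > min_length:
            yield fields


if __name__ == "__main__":
    # Hypothetical records; the fourth column is illustrative only.
    sample = [
        "chr1\t1000\t2000\tDEL\n",
        "chr2\t0\t60000\tDUP\n",
    ]
    for record in filter_bed_by_length(sample):
        print("\t".join(record))
```

The same function works on an open file handle, for example `filter_bed_by_length(open(path))`, because it only iterates over lines.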

The complete command line options are:

USAGE: GetCnvSignal -f <FASTA> -k <kmer_table> -b <BAM>

        -h --help                       Print this help list.

Input & Output:
        -b --bam <BAM>                  Input BAM; required.
        -k --kmer <kmer_table>          Kmer table.
        -f --fasta <FASTA>              FASTA for kmer lookup.
        -o --output <FILE>              Output file.

Operations:
        -c --coverage <INT>             The expected coverage.
        -r --region chr:begin-end       A target region.
        -q --aln_qual                   A mapping quality filter for alignments. [40]
        --bin <INT>                     Report a result for each # bp. [50]
        --log <FILE>                    Log output.
        --unique_kmer <FLOAT>           Require percentage of unique kmer to report a CNV. [0.6]
        --kmer_score <FLOAT>            Score for log2(kmer count) = 2 positions. [0.1]
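The -r option takes a region written as chr:begin-end, and --bin sets how many base pairs each reported result covers. The sketch below shows one way to validate such a region string and estimate the number of bins before launching a run. It is not JAX-CNV's own parser: the function names are hypothetical, splitting on the last colon is a choice made here so that contig names containing colons (such as HLA-DQA1*01:01:02) survive, and the half-open interval used for bin counting is an assumption.

```python
def parse_region(region):
    """Split 'chr:begin-end' into (chrom, begin, end).

    Splits on the LAST colon so contig names that contain colons survive.
    """
    chrom, colon, span = region.rpartition(":")
    begin_text, dash, end_text = span.partition("-")
    if not colon or not chrom or not dash:
        raise ValueError(f"expected chr:begin-end, got {region!r}")
    begin, end = int(begin_text), int(end_text)
    if begin < 0 or end < begin:
        raise ValueError(f"invalid coordinates in {region!r}")
    return chrom, begin, end


def count_bins(begin, end, bin_size=50):
    """Number of bins of bin_size bp needed to cover [begin, end)."""
    return -(-(end - begin) // bin_size)  # ceiling division


if __name__ == "__main__":
    chrom, begin, end = parse_region("chr1:0-248956421")
    print(chrom, begin, end, count_bins(begin, end))
```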

For Docker users

A Dockerfile is provided. Note that sudo may be required to run docker, depending on your machine settings.

cd JAX-CNV
docker build .

JAX-CNV and jellyfish will be built under /tools in the Docker image. Alternatively, pull the Docker image from wanpinglee/jax-cnv.

docker pull wanpinglee/jax-cnv:latest

jax-cnv's People

Contributors

shuchangliu, wanpinglee


jax-cnv's Issues

no error but result file is empty

Hi @wanpinglee

I downloaded some BAM files that were mapped to the GRCh38 standard set.
So, I first processed this standard FASTA with Jellyfish:

$ jellyfish bc -m 25 -s 3G -t 8 -o Homo_sapiens_assembly38.bc Homo_sapiens_assembly38.fasta
$ jellyfish count -m 25 -s 1G -t 8 -o Homo_sapiens_assembly38.jf --bc Homo_sapiens_assembly38.bc Homo_sapiens_assembly38.fasta
// output file size
$ ls -lh
total 9.1G
-rw-------. 1 fup compute-jin810 7.9G Aug 13 23:51 Homo_sapiens_assembly38.bc
-rw-------. 1 fup compute-jin810 1.3G Aug 14 00:23 Homo_sapiens_assembly38.jf

Then, I ran JAX-CNV GrabJellyfishKmer: (using Docker wanpinglee/jax-cnv:latest)

$ /tools/JAX-CNV/bin-linux/JAX-CNV GrabJellyfishKmer --ascii \
-i Homo_sapiens_assembly38.jf \
-f Homo_sapiens_assembly38.fasta \
-o Homo_sapiens_assembly38.kmer

// output file size:
-rw-r--r--. 1 fup domain users   3.0G Aug 14 02:45 Homo_sapiens_assembly38.kmer
-rw-------. 1 fup compute-jin810  727 Aug 14 01:53 Homo_sapiens_assembly38.kmer.fai

In this step, my output .fai file was cut off without any error message.
So, I indexed Homo_sapiens_assembly38.kmer with samtools to get a valid .fai file.

$ samtools faidx Homo_sapiens_assembly38.kmer --fai-idx Homo_sapiens_assembly38.kmer.fai

Finally, I ran JAX-CNV GetCnvSignal: (using Docker wanpinglee/jax-cnv:latest)

$ /tools/JAX-CNV/bin-linux/JAX-CNV GetCnvSignal \
-f Homo_sapiens_assembly38.fasta \
-k Homo_sapiens_assembly38.kmer \
-b BAMs/UDN369194-861-05ac424d-0a44-46fe-b6f6-0fcfe0788d2a.bam \
-o jax_cnv_GetCnvSignal_on_UDN369194.result \
--log JAX_CNV_GetCnvSignal_on_UDN369194.log 
Gender: Cannot determine. 
Message: The estimated coverage is -2147483648 
Message: Processing chr1:0-248956421 
Message: Loading chromosome chr1 is done. 
Message: Loading kmer of chromosome chr1 is done.
...
Message: Loading chromosome HLA-DRB1*16:02:01 is done.
Message: Loading kmer of chromosome HLA-DRB1*16:02:01 is done.
Message: HMM completes.
1 HLA-DQA1*01:01:02 49 5548 5500
1 HLA-DQA1*01:02:01:01 49 6348 6300
...
1 HLA-DRB1*16:02:01 49 9898 9850
Filter checking for HLA-DQA1*01:01:02 49 5500 
Message: Loading kmer of chromosome HLA-DQA1*01:01:02 is done. 
0 0.0118182 
1 0 
2 0 
3 0.00636364 
4 0 
5 0 
6 0 
7 0 
8 0 
9 0
 0 10 10 0 
Filter 
... 
Filter checking for chr1 10049 125174600 
Message: Loading kmer of chromosome chr1 is done. 
0 0.0306559 
1 0.029093 
2 0.0187111 
3 0.0145986 
4 0.0127288 
5 0.0127871 
6 0.0203479 
7 0.0137128 
8 0.0194234 
9 0.0534288 
0 10 10 0 
Filter
...


// OUTPUT:
-rw-------. 1 fup compute-jin810 2.7G Aug 15 01:15 JAX_CNV_GetCnvSignal_on_UDN369194.log 
-rw-------. 1 fup compute-jin810 0 Aug 15 01:12 jax_cnv_GetCnvSignal_on_UDN369194.result

I tried 5 different BAMs, and none of them produced any result (the result file is empty).

I also noticed that the log shows:

Gender: Cannot determine. 
Message: The estimated coverage is -2147483648 

Why does our BAM file have a huge negative coverage?
Do you have any suggestions for fixing this problem? Thanks!
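A note on the log above: the reported coverage, -2147483648, is exactly the smallest 32-bit signed integer (-2^31). A value like that usually suggests an integer overflow or an invalid numeric conversion rather than a measured coverage, although nothing in this thread confirms what happens inside JAX-CNV. The sketch below scans log lines for the coverage message shown above and flags implausible values before a long run continues; the function name `check_estimated_coverage` and the warning texts are hypothetical. Passing the expected coverage explicitly with -c/--coverage might be worth trying, but that is an untested suggestion.

```python
import re

INT32_MIN = -2**31  # -2147483648, the smallest 32-bit signed integer

# Matches the coverage line format seen in the log above.
COVERAGE_RE = re.compile(r"Message: The estimated coverage is (-?\d+)")


def check_estimated_coverage(log_lines):
    """Return (value, problem) for the first coverage line found.

    problem is None when the value looks plausible, otherwise a short note.
    """
    for line in log_lines:
        match = COVERAGE_RE.search(line)
        if match is None:
            continue
        value = int(match.group(1))
        if value == INT32_MIN:
            return value, "equals INT32_MIN; likely an overflow or invalid conversion"
        if value <= 0:
            return value, "coverage is not positive"
        return value, None
    return None, "no coverage line found"


if __name__ == "__main__":
    log = [
        "Gender: Cannot determine. ",
        "Message: The estimated coverage is -2147483648 ",
    ]
    print(check_estimated_coverage(log))
```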
