mehdiborji / nanoranger

simplified cellranger for long-read data

License: MIT License

Python 95.21% Shell 4.79%

cellranger minimap2 mixcr nanoranger sc-rna-seq single-cell-rna-seq spatial-transcriptomics tcr-repertoire tcr-seq vdj-recombination

nanoranger's Issues

Error in 5p10XTCR example script

When running the 5p10XTCR example in a Docker container, the pipeline runs until the following point and then fails:

...<lines above cut>...
TRA chains: 156 (55.12%)
TRB chains: 127 (44.88%)
TCR_out/TCR_testrun_bcreads.fasta
TCR_out/TCR_testrun_ref/
Feb 19 22:24:35 ..... started STAR run
Feb 19 22:24:35 ... starting to generate Genome files

genomeGenerate.cpp:150:genomeGenerate: exiting because of *OUTPUT FILE* error: could not create output file TCR_out/TCR_testrun_ref//genomeParameters.txt
Solution: check that the path exists and you have write permission for this file

It looks like something in genomeGenerate.cpp may be inserting an additional / character into the path of the output file.
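
On POSIX systems a doubled / inside a path is treated as a single separator, so the // in the message may be cosmetic. The causes STAR itself names (the directory does not exist, or it is not writable, for example a Docker bind mount owned by another user) seem worth ruling out first. Below is a minimal Python sketch of such a check, not part of nanoranger; `ensure_writable_dir` is a hypothetical helper name and the directory layout only mirrors the paths in the log.

```python
import os
import posixpath
import tempfile


def ensure_writable_dir(path):
    """Create `path` if it is missing and report whether files can be written into it."""
    os.makedirs(path, exist_ok=True)
    return os.access(path, os.W_OK | os.X_OK)


# A doubled separator normalizes to the same POSIX location.
doubled = "TCR_out/TCR_testrun_ref//genomeParameters.txt"
print(posixpath.normpath(doubled))

# Demonstrate inside a temporary directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    ref_dir = os.path.join(tmp, "TCR_out", "TCR_testrun_ref") + "/"
    if ensure_writable_dir(ref_dir):
        # Writing through the doubled separator works once the directory exists.
        with open(ref_dir + "/genomeParameters.txt", "w") as handle:
            handle.write("placeholder\n")
        print(os.listdir(ref_dir))
```

If the check returns False inside the container, the mount options or the user the container runs as are the more likely culprits than the extra slash.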

Fusion Calling Error

Hello,
Thank you for this tool. I have a 5' 10x library sequenced with Nanopore. I previously used JAFFAL to recover known fusions from single-cell data, which works quite well, and I wanted to try your fusion detection pipeline with a FASTA file to see how it performs. However, I encounter this error message on my own data:

alignment to genome and generation of BC-UMI-Transcript tagged BAM 


cores = 20
ref = /home/user/nanoranger/FUSION_SEQUENCE.fa
infile= FUSION_TEST/fusion_deconcat.fastq.gz
outdir = FUSION_TEST
sample = fusion
[M::mm_idx_gen::0.001*1.50] collected minimizers
[M::mm_idx_gen::0.001*5.99] sorted minimizers
[M::main::0.001*5.96] loaded/built the index for 1 target sequence(s)
[M::mm_mapopt_update::0.001*5.82] mid_occ = 15
[M::mm_idx_stat] kmer size: 15; skip: 5; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.002*5.70] distinct minimizers: 626 (98.72% are singletons); average occurrences: 1.032; average spacing: 2.913; total length: 1882
[M::worker_pipeline::0.734*16.79] mapped 103327 sequences
[M::main] Version: 2.26-r1175
[M::main] CMD: minimap2 -aY --eqx -x splice -t 20 --secondary=no --sam-hit-only /home/user/nanoranger/FUSION_SEQUENCE.fa FUSION_TEST/fusion_deconcat.fastq.gz
[M::main] Real time: 0.738 sec; CPU: 12.330 sec; Peak RSS: 0.053 GB
[bam_sort_core] merging from 0 files and 20 in-memory blocks...
number of genome aligned reads =  4693
10000 barcode candidates processed
20000 barcode candidates processed
30000 barcode candidates processed
40000 barcode candidates processed
50000 barcode candidates processed
60000 barcode candidates processed
70000 barcode candidates processed
80000 barcode candidates processed
number of short UMI reads =  250
20000 Read-BC-UMI-Transcript tuples saved
40000 Read-BC-UMI-Transcript tuples saved
60000 Read-BC-UMI-Transcript tuples saved
rm: cannot remove 'FUSION_TEST/fusion_matching_*': No such file or directory

Surprisingly, I encounter the same error with the test data:

 alignment to genome and generation of BC-UMI-Transcript tagged BAM 


cores = 8
ref = /home/user/nanoranger/data/RUNX1_RUNX1T1_ABL1_BCR.fa
infile= K562_Kasumi1/fusion_deconcat.fastq.gz
outdir = K562_Kasumi1
sample = fusion
[M::mm_idx_gen::0.001*1.89] collected minimizers
[M::mm_idx_gen::0.001*2.31] sorted minimizers
[M::main::0.001*2.30] loaded/built the index for 7 target sequence(s)
[M::mm_mapopt_update::0.002*2.22] mid_occ = 10
[M::mm_idx_stat] kmer size: 15; skip: 5; is_hpc: 0; #seq: 7
[M::mm_idx_stat::0.002*2.17] distinct minimizers: 2164 (96.63% are singletons); average occurrences: 1.035; average spacing: 2.973; total length: 6656
[M::worker_pipeline::0.050*6.04] mapped 3152 sequences
[M::main] Version: 2.26-r1175
[M::main] CMD: minimap2 -aY --eqx -x splice -t 8 --secondary=no --sam-hit-only /home/user/nanoranger/data/RUNX1_RUNX1T1_ABL1_BCR.fa K562_Kasumi1/fusion_deconcat.fastq.gz
[M::main] Real time: 0.050 sec; CPU: 0.303 sec; Peak RSS: 0.010 GB
[bam_sort_core] merging from 0 files and 8 in-memory blocks...
number of genome aligned reads =  2883
number of short UMI reads =  4
rm: cannot remove 'K562_Kasumi1/fusion_matching_*': No such file or directory
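
In both logs the final line is what `rm` prints when the glob `fusion_matching_*` matches no files; the logs alone do not show whether the pipeline stops because of it. Below is a minimal Python sketch of a cleanup step that tolerates an empty match. It assumes the intermediate files follow the `<sample>_matching_*` pattern seen in the message, and `remove_matching_chunks` is a hypothetical helper, not a nanoranger function.

```python
import glob
import os
import tempfile


def remove_matching_chunks(outdir, sample):
    """Delete outdir/<sample>_matching_* files and return how many were removed."""
    # glob.escape keeps special characters in the directory or sample name literal.
    pattern = os.path.join(glob.escape(outdir), glob.escape(sample) + "_matching_*")
    removed = 0
    for path in glob.glob(pattern):
        os.remove(path)
        removed += 1
    return removed


with tempfile.TemporaryDirectory() as outdir:
    # No chunk files exist yet: the call returns 0 instead of raising an error.
    print(remove_matching_chunks(outdir, "fusion"))
```

Note that the pattern requires an underscore after `matching`, so a file named `fusion_matching.sam` is left in place.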

Here is my working environment:

Minimap2 v2.26-r1175
STAR v2.7.9a
Samtools v1.6

The files present in the output directory for my data so far are:
fusion_barcode_scores.csv fusion_barcode_scores.pdf fusion_bcumi_dedup.csv fusion_BCUMI.fasta.gz fusion_deconcat.fastq.gz fusion_genome_tagged.bam fusion_genome_tagged.bam.bai fusion_knee.pdf fusion_matching.sam fusion_trns_ct.csv

I was hoping for an output file listing the reads, their barcodes, and whether the fusion is present, but I'm not sure any of these files contains that. Do you have a wiki describing the output files and their contents? I guess I need to use fusion_gene.py in the downstream folder under scripts, but I am unsure which arguments it needs.
Also, regarding the scripts you provide, which one performs the extraction of the 10x barcodes? I saw that there are two bash scripts, barcode_align.sh and barcode_ref.sh, so I imagine those are the two that are called, right?

Thank you for your help,
Evan

When will the part of the tool for 10x Genomics Chromium 3' be updated?

Hi, thanks for developing such a good tool.
If convenient, I am looking forward to the part for the 10x Genomics Chromium 3' library, and I am very curious about when it will be uploaded.

Thanks!

TCR matching error

When I run 5p10XTCR with the example data TCR3.fastq.gz on my macOS system, I get the following error:
"Traceback (most recent call last):
File "/Users/Home/nanoranger/pipeline.py", line 236, in
utils.process_matching_5p10XTCR(sample,outdir)
File "/Users/Home/nanoranger/utils.py", line 733, in process_matching_5p10XTCR
scores=sort_cnt(all_AS[all_AS[:,1]==0][:,0])
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed"
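
This `IndexError` is what NumPy raises when a 1-dimensional array is indexed with two indices, which happens, for example, if `all_AS` was built from an empty list. Whether that is the cause here cannot be told from the traceback alone. Below is a minimal sketch of the failure and one possible guard. It assumes `all_AS` is meant to hold two columns per row; the column meanings, the sample values, and the helper name `first_column_where_second_is_zero` are hypothetical.

```python
import numpy as np


def first_column_where_second_is_zero(rows):
    """Return column 0 of the rows whose column 1 equals 0.

    reshape(-1, 2) keeps the array 2-D even when `rows` is empty.
    """
    arr = np.asarray(rows).reshape(-1, 2)
    return arr[arr[:, 1] == 0][:, 0]


# An array built from an empty list has shape (0,), i.e. it is 1-dimensional.
empty = np.array([])
try:
    empty[empty[:, 1] == 0][:, 0]
except IndexError as err:
    print("unguarded:", err)

# The guarded version returns an empty result instead of failing.
print(first_column_where_second_is_zero([]))
print(first_column_where_second_is_zero([[60, 0], [55, 16], [70, 0]]))
```

If the array is empty because no reads matched, the guard only hides the symptom; the earlier matching step would still need checking.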
