Comments (10)
Hi there,
Thanks for bringing this up. I am not sure what went wrong since the log doesn't indicate any error. I suggest you run minimap2 outside of ragtag and maybe that will give some indication of what is going on. Just run the following command:
minimap2 -ax sr -t 2 /xxx/sp/03_RagTag/C1_k105_scaffolds.fasta /xxx/sp/01_cleaned/C1_1_trim.fq.gz > /xxx/sp/03_RagTag/C1_correct/c_reads_against_query.sam 2> /xxx/sp/03_RagTag/C1_correct/c_reads_against_query.sam.log
Let me know if it fails again or if it runs to completion.
Thanks
from ragtag.
Hi,
thanks!
You were right: I tried another version of minimap2 installed on our cluster and it works (v2.13, versus v2.15 in my first attempt).
nb: not saying here that v2.15 wouldn't work with ragtag.
I have then tried to run 'ragtag correct' without the .gff file and it worked again.
Then I tried to run it with the 2 read files I have (forward and reverse reads, as opposed to only the forward reads in my previous attempts) using the -F option pointing to a txt file containing the names of my 2 fastq files, but it stopped halfway:
Tue Aug 11 16:00:28 2020 --- RagTag v1.0.0
Tue Aug 11 16:00:28 2020 --- CMD: /xxx/anaconda3/bin/ragtag_correct.py 1-Genome_assembly.fa C1_k105_scaffolds.fasta -t 2 -T sr -F list_files.txt -u -o C1_correct_no_gff_and_list
Tue Aug 11 16:00:28 2020 --- Mapping the query genome to the reference genome
Tue Aug 11 16:00:28 2020 --- Running: minimap2 -x asm5 -t 2 /xxx/sp/03_RagTag/1-Genome_assembly.fa /xxx/sp/03_RagTag/C1_k105_scaffolds.fasta > /xxx/sp/03_RagTag/C1_correct_no_gff_and_list/c_query_against_ref.paf 2> /xxx/sp/03_RagTag/C1_correct_no_gff_and_list/c_query_against_ref.paf.log
Tue Aug 11 16:00:42 2020 --- Finished running : minimap2 -x asm5 -t 2 /xxx/sp/03_RagTag/1-Genome_assembly.fa /xxx/sp/03_RagTag/C1_k105_scaffolds.fasta > /xxx/sp/03_RagTag/C1_correct_no_gff_and_list/c_query_against_ref.paf 2> /xxx/sp/03_RagTag/C1_correct_no_gff_and_list/c_query_against_ref.paf.log
Tue Aug 11 16:00:42 2020 --- Reading whole genome alignments
Tue Aug 11 16:00:43 2020 --- Filtering and merging alignments
Tue Aug 11 16:00:43 2020 --- Validating putative query breakpoints via read alignment.
Tue Aug 11 16:00:43 2020 --- Aligning reads to query sequences.
Tue Aug 11 16:00:43 2020 --- Running: minimap2 -ax sr -t 2 /xxx/sp/03_RagTag/C1_k105_scaffolds.fasta /xxx/sp/01_cleaned/C1_1_trim.fq.gz /xxx/sp/01_cleaned/C1_2_trim.fq.gz > /xxx/sp/03_RagTag/C1_correct_no_gff_and_list/c_reads_against_query.sam 2> /xxx/sp/03_RagTag/C1_correct_no_gff_and_list/c_reads_against_query.sam.log
Tue Aug 11 16:00:48 2020 --- Finished running : minimap2 -ax sr -t 2 /xxx/sp/03_RagTag/C1_k105_scaffolds.fasta /xxx/sp/01_cleaned/C1_1_trim.fq.gz /xxx/sp/01_cleaned/C1_2_trim.fq.gz > /xxx/sp/03_RagTag/C1_correct_no_gff_and_list/c_reads_against_query.sam 2> /xxx/sp/03_RagTag/C1_correct_no_gff_and_list/c_reads_against_query.sam.log
Tue Aug 11 16:00:48 2020 --- Compressing, sorting, and indexing read alignments
Tue Aug 11 16:00:48 2020 --- Indexing read alignments
Tue Aug 11 16:00:48 2020 --- Validating putative query breakpoints
Tue Aug 11 16:00:48 2020 --- Calculating global read coverage
Traceback (most recent call last):
  File "/xxx/anaconda3/bin/ragtag_correct.py", line 645, in <module>
    main()
  File "/xxx/anaconda3/bin/ragtag_correct.py", line 610, in main
    ctg_breaks = validate_breaks(ctg_breaks, output_path, num_threads, overwrite_files, val_min_break_end_dist, max_cov, min_cov, window_size=val_window_size, clean_dist=min_break_dist, debug=debug_mode)
  File "/xxx/anaconda3/bin/ragtag_correct.py", line 168, in validate_breaks
    glob_med = get_median_read_coverage(output_path, num_threads, overwrite_files)
  File "/xxx/anaconda3/bin/ragtag_correct.py", line 124, in get_median_read_coverage
    raise ValueError()
ValueError
Indeed, in this case the output file 'c_reads_against_query.s.bam.stats' does not contain any line starting with 'COV', hence the error message, I believe.
Perhaps I am not using the -F option correctly?
best,
Romain
from ragtag.
looking at the output file 'c_reads_against_query.sam.log' (almost empty), I can see that no read has been mapped.
from ragtag.
Hi there,
Thanks for these details. Unfortunately, this doesn't appear to be a problem with RagTag, but rather with minimap2. As with the first example, the best way to debug is to run the aligner and see why it is not producing alignments. So you could rerun the following:
minimap2 -ax sr -t 2 /xxx/sp/03_RagTag/C1_k105_scaffolds.fasta /xxx/sp/01_cleaned/C1_1_trim.fq.gz /xxx/sp/01_cleaned/C1_2_trim.fq.gz > /xxx/sp/03_RagTag/C1_correct_no_gff_and_list/c_reads_against_query.sam 2> /xxx/sp/03_RagTag/C1_correct_no_gff_and_list/c_reads_against_query.sam.log
As far as RagTag is concerned, that is a valid minimap2 command.
Let me put it another way: RagTag reports all of the alignment commands used (like the one above). If they fail for some reason, the best way to debug is to run the same command outside of ragtag and reproduce the error. At the end of the day, RagTag will not work if minimap2 isn't working. For example, if minimap2 runs out of memory, then one must focus on resolving that issue with minimap2.
That said, I think RagTag can do a much better job of reporting errors. The value error raised here needs more information. And perhaps it can check if alignment files are empty in order to provide a more useful error message.
Anyways let me know if you can reproduce the error by running minimap2 outside of ragtag.
EDIT
I think RagTag does a good job of reporting when aligner jobs have just failed. But if they fail silently (like producing empty alignment files), RagTag isn't good at reporting those errors.
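A minimal sketch of the kind of check described above, assuming the alignments are written as plain-text SAM. It counts the mapped records (those without the 0x4 "unmapped" flag) and raises a descriptive error when there are none. The function names and the error message are hypothetical and are not part of RagTag.

```python
def count_mapped_records(sam_path):
    """Count SAM records whose FLAG does not have the 0x4 (unmapped) bit set."""
    mapped = 0
    with open(sam_path) as handle:
        for line in handle:
            # Skip header lines (they start with '@') and blank lines.
            if line.startswith("@") or not line.strip():
                continue
            flag = int(line.split("\t")[1])
            if not flag & 0x4:
                mapped += 1
    return mapped


def require_alignments(sam_path):
    """Raise an informative error if the SAM file holds no mapped reads."""
    mapped = count_mapped_records(sam_path)
    if mapped == 0:
        raise ValueError(
            "No mapped reads found in %s. Check the aligner log (%s.log) "
            "for errors." % (sam_path, sam_path)
        )
    return mapped
```

Running a check like this right after the aligner finishes would point the user at the aligner log instead of failing later with a bare ValueError.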
from ragtag.
Thanks for the reply.
However the minimap2 command line works perfectly -> reads from both files are mapped to the genome.
I tried running ragtag again after modifying the txt file containing the 2 file names (giving relative paths instead of absolute paths), but again it didn't work.
I'll try to go deeper into this issue but it seems 'somehow' to be a ragtag issue rather than a minimap2 issue.
from ragtag.
From what I understand, the problem seems to come from the way ragtag calls minimap2 with 2 read files, probably in the class Minimap2SAMAligner (Aligner.py file) but I have not detected any mistake.
To recap my observations:
_ when running ragtag with the -F option pointing to a text file containing the paths of the forward and reverse reads, the minimap2 command reported in the log is correct (it works when I run it myself), but strangely it doesn't work properly within ragtag and no reads are mapped.
_ from the file 'c_reads_against_query.sam.log':
[M::mm_idx_gen::3.056*1.36] collected minimizers
[M::mm_idx_gen::3.741*1.47] sorted minimizers
[M::main::3.745*1.47] loaded/built the index for 27604 target sequence(s)
[M::mm_mapopt_update::3.745*1.47] mid_occ = 1000
[M::mm_idx_stat] kmer size: 21; skip: 11; is_hpc: 0; #seq: 27604
[M::mm_idx_stat::3.938*1.45] distinct minimizers: 16518461 (95.55% are singletons); average occurrences: 1.091; average spacing: 6.034
ERROR: failed to open file '/xxx/sp/03_RagTag/C1_1_trim.fq.gz /xxx/sp/03_RagTag/C2_1_trim.fq.gz'
[M::main] Version: 2.13-r850
[M::main] CMD: minimap2 -ax sr -t 2 /xxx/sp/03_RagTag/C1_k105_scaffolds.fasta /xxx/sp/03_RagTag/C1_1_trim.fq.gz /xxx/sp/03_RagTag/C2_1_trim.fq.gz
[M::main] Real time: 3.959 sec; CPU: 5.723 sec; Peak RSS: 0.774 GB
It looks as if ragtag is giving minimap2 a single filename consisting of the concatenation of the 2 filenames. I'm not sure I'm interpreting this correctly, though.
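For what it's worth, this kind of silent failure can be spotted by scanning the aligner log. The sketch below assumes, based only on the log shown above, that minimap2 writes failures as lines beginning with "ERROR"; the function name is hypothetical.

```python
def find_log_errors(log_path):
    """Return the lines of an aligner log file that begin with 'ERROR'."""
    errors = []
    with open(log_path) as handle:
        for line in handle:
            if line.startswith("ERROR"):
                errors.append(line.rstrip("\n"))
    return errors
```

Called on 'c_reads_against_query.sam.log', this would return the "failed to open file" line, even though the minimap2 job itself was reported as finished.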
from ragtag.
Hi there,
Great - this log does indicate a RagTag bug. I think your interpretation is correct. I will look into it and fix this bug in the next patch.
This also reinforces the need for better error reporting. That may be a little more nuanced but I will look into it.
As for the first issue, I think I will still assume that there is no bug there.
In the meantime, if you are eager for results, you can run minimap2 outside of ragtag (like you have done) and just name the alignments with the expected ragtag name and put them in the output directory. RagTag won't try to overwrite preexisting alignment files, so it should work fine. Let me know if you need more details.
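A rough sketch of that workaround, assuming minimap2 is on the PATH and that the expected file name is 'c_reads_against_query.sam' inside the output directory, as shown in the log earlier in this thread. The function names are hypothetical helpers, not part of RagTag.

```python
import os
import subprocess


def build_read_alignment_job(query_fasta, read_files, output_dir, threads=2):
    """Return (argv, sam_path, log_path) for aligning short reads to the query.

    Each read file is passed as its own argv element. The output name
    c_reads_against_query.sam is taken from the RagTag log in this thread.
    """
    argv = ["minimap2", "-ax", "sr", "-t", str(threads), query_fasta]
    argv += list(read_files)
    sam_path = os.path.join(output_dir, "c_reads_against_query.sam")
    return argv, sam_path, sam_path + ".log"


def run_read_alignment(query_fasta, read_files, output_dir, threads=2):
    """Run minimap2, writing the SAM and its log into output_dir."""
    argv, sam_path, log_path = build_read_alignment_job(
        query_fasta, read_files, output_dir, threads
    )
    os.makedirs(output_dir, exist_ok=True)
    with open(sam_path, "w") as sam, open(log_path, "w") as log:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(argv, stdout=sam, stderr=log, check=True)
    return sam_path
```

After the SAM file is in place, rerunning 'ragtag correct' with the same -o directory should pick it up instead of realigning.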
from ragtag.
I think the problem is here:
Lines 584 to 585 in 5df41f1
I try to join the two file names with a space when really they should be separate elements in a list.
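To illustrate the difference (this is a simplified sketch, not the actual RagTag code, and the function names are hypothetical): when a command is passed to a subprocess as a list, each list element becomes exactly one argument, so a space-joined string is treated as a single file name.

```python
def build_cmd_joined(ref, read_files):
    """Buggy pattern: all read files collapse into one argv element."""
    return ["minimap2", "-ax", "sr", ref, " ".join(read_files)]


def build_cmd_separate(ref, read_files):
    """Fixed pattern: each read file is its own argv element."""
    return ["minimap2", "-ax", "sr", ref] + list(read_files)


reads = ["C1_1_trim.fq.gz", "C1_2_trim.fq.gz"]
joined = build_cmd_joined("scaffolds.fasta", reads)
separate = build_cmd_separate("scaffolds.fasta", reads)

# The last argument of the buggy command is one string containing a space,
# which the aligner would try to open as a single file.
print(joined[-1])
# The fixed command ends with two distinct file arguments.
print(separate[-2:])
# Both commands look identical once printed as a single string, which is
# presumably why the logged command appeared valid.
print(" ".join(joined) == " ".join(separate))
```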
from ragtag.
I agree, the problem seems to come from this class object.
Thanks for the tip, it works when running minimap2 before ragtag.
from ragtag.
Fixed in v1.0.1
from ragtag.