Comments (10)
You might use kmergenie to estimate the "optimal" k for the Illumina reads, then try that instead of the default 49.
from haslr.
@NTNguyen13 thanks for reporting the small final assembly, as well as the error with k=19.
I'm going to follow the steps you took and see if I can reproduce what you get. In general, if the average length of the long reads is low, I would expect a more fragmented and smaller final assembly (this is because "shorter" long reads might not be able to connect distant unique short-read contigs). However, I'm going to check whether this is the case here.
With regard to k=19, my suspicion is that the contigs generated by Minia are very short and therefore not useful for HASLR.
I'll try both cases and get back to you.
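Both suspicions above (short long reads, very short Minia contigs) can be checked by summarising sequence lengths in the relevant FASTA files. The following is a minimal Python sketch, not part of HASLR: the function names are hypothetical, and it assumes plain, uncompressed FASTA input.

```python
from statistics import mean


def sequence_lengths(lines):
    """Yield the length of each record in FASTA-formatted lines."""
    length = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Length L such that records of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for value in ordered:
        running += value
        if running >= half:
            return value
    return 0


def summarize(lines):
    """Return record count, total bases, mean length and N50 for FASTA lines."""
    lengths = list(sequence_lengths(lines))
    return {
        "records": len(lengths),
        "total_bases": sum(lengths),
        "mean": mean(lengths) if lengths else 0,
        "n50": n50(lengths),
    }


# Tiny in-memory example: two records of 6 and 3 bases.
example = [">a", "ACGT", "AC", ">b", "ACG"]
print(summarize(example))
```

To use it on a real file, pass an open file handle to `summarize`, for example the short-read contigs FASTA or the subsampled long-read FASTA.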
from haslr.
Hi,
I'm highly interested in the answer to this issue as I have exactly the same problem with very small final assemblies.
I would appreciate your help,
Thanks,
Maxime
from haslr.
Thanks! I will try it out. However, I still don't understand why the long read subsampling produces the same file despite using 2 different long read files or different subsampling thresholds.
from haslr.
After changing k=19 according to kmergenie, I got another error:
ERROR: "haslr_assemble" returned non-zero exit status
from haslr.
Are you sure kmergenie gave you a k=19? I was expecting something more like k~101 depending on the read length. Not sure about the new error.
from haslr.
NA12878_R1_15X_merge_kmer.dat.pdf
@jelber2 hi, this is the histogram of kmer size
@haghshenas thank you for your support. I suspect that the problem comes from using cat on the CCS fastq.gz files. I tried 3 scenarios:
- #1 Using 1 original read file: lr25x.fasta size is ~30GB
- #2 Cat 2 original read files: lr25x.fasta size is ~5.2GB
- #3 Cat all original read files: lr25x.fasta size is ~5.2GB. The file content is exactly the same as in scenario #2.
I also tried using the lr25x.fasta from scenario #1 with short read k=19; it resulted in
[28-Oct-2020 09:56:26] aligning long reads to short read assembly using minimap2... failed
ERROR: "minimap2" returned non-zero exit status
Edit: I tried with k=49; it still gives a non-zero exit status for minimap2
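To test the suspicion about cat in the three scenarios above, the read and base totals of each original fastq.gz file can be compared with those of the concatenated file. Below is a minimal Python sketch with hypothetical function names. It assumes standard 4-line FASTQ records (no wrapped sequence lines) and relies on Python's gzip module reading concatenated gzip members as one stream. It only checks the input files; it does not diagnose HASLR itself.

```python
import gzip
import os
import tempfile


def fastq_totals(path):
    """Return (read_count, base_count) for a FASTQ file, gzip-compressed or plain."""
    opener = gzip.open if str(path).endswith(".gz") else open
    reads = 0
    bases = 0
    with opener(path, "rt") as handle:
        for index, line in enumerate(handle):
            if index % 4 == 1:  # second line of each 4-line record is the sequence
                reads += 1
                bases += len(line.strip())
    return reads, bases


def demo_concatenated_gzip():
    """Build a file from two gzip members, as cat a.gz b.gz would, and count it."""
    record_a = b"@r1\nACGT\n+\nIIII\n"
    record_b = b"@r2\nACGTAC\n+\nIIIIII\n"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "merged.fastq.gz")
        with open(path, "wb") as out:
            out.write(gzip.compress(record_a) + gzip.compress(record_b))
        return fastq_totals(path)


print(demo_concatenated_gzip())
```

If the totals of the merged file equal the sum of the totals of its parts, the concatenation itself is intact and the cause lies further downstream.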
from haslr.
hi @haghshenas, I re-downloaded the long read files. This time I used sra-toolkit to download the SRA files, then converted them to fastq to make sure all files are well-preserved.
However, the same long read subsampling problem still persists:
- Set --cov-l 10, file size ~26.1 GB
- Set --cov-l 25, file size ~26.1 GB
- Set --cov-l 30, file size ~26.1 GB
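As a sanity check on the sizes listed above, the number of bases a given coverage target implies can be compared with the size of the file actually produced. The Python sketch below only does that arithmetic. The genome size of 3.1 Gb is an assumed approximate value for human, not taken from this thread, and treating FASTA bytes as bases is a rough proxy that ignores header lines and newlines. The function names are hypothetical.

```python
GENOME_SIZE = 3_100_000_000  # assumption: approximate human genome size in bases


def target_bases(genome_size, coverage):
    """Bases needed to reach the requested coverage of the genome."""
    return int(genome_size * coverage)


def implied_coverage(total_bases, genome_size):
    """Coverage implied by a given number of bases."""
    return total_bases / genome_size


# Targets for the three thresholds tried above.
for coverage in (10, 25, 30):
    print(coverage, target_bases(GENOME_SIZE, coverage))

# Rough coverage implied by a ~26.1 GB FASTA, treating bytes as bases.
print(round(implied_coverage(26_100_000_000, GENOME_SIZE), 2))
```

Different thresholds would normally be expected to yield different target base counts, so identical output sizes across all three are worth comparing against these numbers and against the genome size actually passed to the tool.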
from haslr.
To check the integrity of the long read fastq file, I aligned it to HG38 using minimap2; the coverage is around 30X, as expected.
from haslr.
Did this issue ever get resolved? I am also getting very short genome assemblies compared to the reference genome.
from haslr.