Comments (7)
Hi,
thanks for pointing this out.
The -s parameter defines the minimum number of reads that must support an event before it is called. Thus -s 2 should give the same results as -s 5.
I usually use -s 10 to get a robust set of SV calls. Apart from that, everything looks good. Would you mind sharing the BAM file so that I can take a look and also check whether something is going on with NGMLR?
The -n parameter might also interest you. It gives you the names of the reads that support the deletion. I often use this to check strange SVs in IGV.
Thanks,
Fritz
from sniffles.
Hi,
Sure, here they are:
- the BAM file: https://ws.molgen.mpg.de/ws/493381/ngmlr.chained2.sorted.bam
- the location of the simulated deletions (the insertions in the file are regions that have been moved to a different region so that they are, for this purpose, deletions, too): https://ws.molgen.mpg.de/ws/005664/deletions_and_insertions_from.bed
- the Sniffles calls: https://ws.molgen.mpg.de/ws/263535/sniffles.ngmlr.s5.vcf
Thanks for the hint about the -n parameter! It turns out that (at least for one big detected deletion I have looked at) there are reads that map to both sides of the SV. That would be fine if the two mapping locations weren't so far apart. But maybe you can find out more with the files :)
Cheers, David
Hi David,
Yes, I was rather expecting an alignment artifact. NGMLR improves things a lot, but we still sometimes see noisy events... That's why I keep -s at 10 (the default) most of the time...
Thanks for the files.
Fritz
Hi Fritz,
I would like to hear your opinion on the following scenario: Let's say there is an intra-chromosomal duplication (e.g. a LINE) that copied the sequence of chr1:44,000,000-44,007,000 to another place in that same chromosome (e.g. at chr1:100,000). And let's say there is a read covering that insertion location such that the two read tails map left and right of the insertion location but the middle part maps to the source region of the duplication. Would Sniffles call a deletion between chr1:100,000-44,000,000?
I'm asking because I think this might be behind some of the very large deletion calls described above. I'm seeing this not only for simulated data: From the NA12878 PacBio data, Sniffles (with -s 10) reports 175 deletions >1Mb with the largest having a size of 78 Mb.
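For anyone who wants to reproduce such a count, here is a minimal sketch that lists deletion records above a size threshold in a VCF. It assumes the INFO column carries `SVTYPE` plus `SVLEN` and/or `END` entries, as in the Sniffles output linked above; the helper names `parse_info` and `large_deletions` are hypothetical and not part of Sniffles.

```python
def parse_info(info):
    """Turn a VCF INFO string into a dict; flag entries map to True."""
    fields = {}
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


def large_deletions(vcf_lines, min_size=1_000_000):
    """Yield (chrom, start, end, size) for DEL records of at least min_size bp.

    Assumption: SVTYPE, SVLEN and END are present in INFO as described
    in the lead-in; records lacking both SVLEN and END are skipped.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and empty lines
        cols = line.rstrip("\n").split("\t")
        chrom, pos = cols[0], int(cols[1])
        info = parse_info(cols[7])
        if info.get("SVTYPE") != "DEL":
            continue
        if "SVLEN" in info:
            size = abs(int(info["SVLEN"]))  # SVLEN may be negative for deletions
        elif "END" in info:
            size = int(info["END"]) - pos
        else:
            continue
        end = int(info["END"]) if "END" in info else pos + size
        if size >= min_size:
            yield chrom, pos, end, size
```

Passing an open file handle, for example `sum(1 for _ in large_deletions(open(path)))` with `path` pointing to the VCF, would give the number of deletions of at least 1 Mb.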
Cheers
David
Hi David,
Thanks for reaching out. Yes, that is indeed a problem. I looked into this for a long time and did not find a satisfying solution. Very large DEL or INV calls can indicate exactly this.
The problem is that, apart from the length, there is no difference in the signal for me. Thus, I did not want to set an arbitrary threshold.
I hope that helps
Fritz
Thanks for the quick reply, Fritz.
It is indeed hard to distinguish the two from the signal unless you look at all read segments at once. But even then, there might be cases that are very hard to tell apart.
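To make "looking at all read segments at once" concrete, here is a rough sketch of the idea, not something Sniffles does as far as I know. It assumes each read has already been reduced to a list of forward-strand alignment segments; the `Segment` type, the `classify_split_read` function and the `max_flank_gap` value are all hypothetical.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """One alignment segment of a read (hypothetical representation)."""
    read_start: int
    read_end: int
    chrom: str
    ref_start: int
    ref_end: int


def classify_split_read(segments: List[Segment], max_flank_gap: int = 50) -> str:
    """Compare the outermost segments of a read instead of adjacent pairs.

    If the first and last segments are adjacent on the reference while
    extra read sequence lies between them, the middle part looks like
    inserted (copied) sequence rather than evidence for a deletion.
    Assumes forward-strand alignments; the threshold is arbitrary.
    """
    segs = sorted(segments, key=lambda s: s.read_start)
    if len(segs) < 3:
        return "pairwise"  # nothing to gain over a pairwise comparison
    first, last = segs[0], segs[-1]
    if first.chrom != last.chrom:
        return "pairwise"
    ref_gap = last.ref_start - first.ref_end
    read_gap = last.read_start - first.read_end
    if abs(ref_gap) <= max_flank_gap and read_gap > max_flank_gap:
        return "insertion_of_copied_sequence"
    return "pairwise"
```

For the scenario above (tails mapping left and right of chr1:100,000, middle part mapping to chr1:44,000,000-44,007,000), this returns "insertion_of_copied_sequence". Reads that cover only one breakpoint of the insertion have just two segments, so they remain ambiguous.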
Best
David
No problem.
I hope that helps.
Thanks
Fritz