Hi Fritz, I applied Sniffles (1.0.6) on one simulated dataset of min

extremely long SVs detected (>20Mb) (sniffles, closed)

Comments (7)

fritzsedlazeck commented on August 12, 2024

Hi,
thanks for pointing this out.
The -s parameter defines the minimum number of reads that must support an event before it is called. Thus, -s 2 should give the same results as -s 5.

I usually use -s 10 to get a robust set of SV calls. Apart from that, everything looks good. Would you mind sharing the BAM file so that I can take a look and also check whether something is going on with NGMLR?

What might also interest you is the -n parameter. It gives you the names of the reads that support the deletion. I often use this to check strange SVs in IGV.

Thanks,
Fritz
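
For checking such calls outside IGV, a minimal Python sketch of pulling the supporting read names out of a single VCF record might look like the following. It assumes the names end up as a comma-separated INFO entry called `RNAMES` when -n is used; that key name, the helper names, and the example record are assumptions for illustration and are not taken from this thread.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'SVTYPE=DEL;END=5000;PRECISE' into a dict."""
    info = {}
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        # Flags without '=' (e.g. PRECISE) are stored as True.
        info[key] = value if sep else True
    return info


def supporting_reads(vcf_line, key="RNAMES"):
    """Return the read names listed under `key` in one VCF record.

    `key` is an assumed INFO entry name; an empty list is returned if it is absent.
    """
    columns = vcf_line.rstrip("\n").split("\t")
    names = parse_info(columns[7]).get(key, "")
    if not isinstance(names, str):
        return []
    return [name for name in names.split(",") if name]


# Hypothetical record, made up for this example.
record = "chr1\t100000\t1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=44000000;RNAMES=read_a,read_b"
print(supporting_reads(record))
```

The returned names can then be used to pull the matching alignments from the BAM file and inspect where each read segment maps.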

eldariont commented on August 12, 2024

Hi,
sure, here are:

the BAM file: https://ws.molgen.mpg.de/ws/493381/ngmlr.chained2.sorted.bam
the location of the simulated deletions (the insertions in the file are regions that have been moved to a different region so that they are, for this purpose, deletions, too): https://ws.molgen.mpg.de/ws/005664/deletions_and_insertions_from.bed
the Sniffles calls: https://ws.molgen.mpg.de/ws/263535/sniffles.ngmlr.s5.vcf

Thanks for the hint with the -n parameter! It turns out that (at least for one big detected deletion I have looked at) there are reads that map to both sides of the SV. That would be fine if the two mapping locations weren't so far apart. But maybe you can find out more with the files :)

Cheers, David

fritzsedlazeck commented on August 12, 2024

Hi David,
yes, I was expecting more of an alignment artifact. NGMLR improves things a lot, but we still sometimes see noisy events... That's why I keep -s at 10 (default) most of the time...

Thanks for the files.
Fritz

eldariont commented on August 12, 2024

Hi Fritz,

I would like to hear your opinion on the following scenario: Let's say there is an intra-chromosomal duplication (e.g. a LINE) that copied the sequence of chr1:44,000,000-44,007,000 to another place in that same chromosome (e.g. at chr1:100,000). And let's say there is a read covering that insertion location such that the two read tails map left and right of the insertion location but the middle part maps to the source region of the duplication. Would Sniffles call a deletion between chr1:100,000-44,000,000?

I'm asking because I think this might be behind some of the very large deletion calls described above. I'm seeing this not only in simulated data: on the NA12878 PacBio data, Sniffles (with -s 10) reports 175 deletions >1Mb, the largest being 78 Mb in size.

Cheers
David
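
To count such calls, a small Python sketch along the following lines could list the very large deletions in a VCF file. It relies only on the standard `SVTYPE` and `END` INFO keys; the function name, the 1 Mb default threshold, and the example records are assumptions made for illustration.

```python
def large_deletions(vcf_lines, min_size=1_000_000):
    """Return (chrom, start, end, size) for every DEL record spanning at least min_size bp."""
    hits = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        cols = line.rstrip("\n").split("\t")
        chrom, pos, info_field = cols[0], int(cols[1]), cols[7]
        # Keep only key=value INFO entries; flags are irrelevant here.
        info = dict(e.split("=", 1) for e in info_field.split(";") if "=" in e)
        if info.get("SVTYPE") != "DEL" or "END" not in info:
            continue
        end = int(info["END"])
        size = end - pos
        if size >= min_size:
            hits.append((chrom, pos, end, size))
    return hits


# Hypothetical records, made up for this example.
example_vcf = [
    "##fileformat=VCFv4.2",
    "chr1\t100000\t1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=44000000",
    "chr1\t200000\t2\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=200500;PRECISE",
]
print(large_deletions(example_vcf))
# prints [('chr1', 100000, 44000000, 43900000)]
```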

fritzsedlazeck commented on August 12, 2024

Hi David,
thanks for reaching out. Yes, that is indeed a problem. I looked into this for a long time and did not find a satisfying solution. DEL or INV calls that are very large can indicate exactly this.

The problem is that, for me, there is no difference in the signal apart from the length. Thus, I did not want to set an arbitrary threshold.

I hope that helps
Fritz

eldariont commented on August 12, 2024

Thanks for the quick reply, Fritz.
It is indeed hard to distinguish from the signal unless you look at all read segments at once. But even then, there might be cases that are very hard to tell apart.

Best
David
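
To make the "all read segments at once" idea concrete, here is a rough Python sketch of the duplication scenario discussed above. It is not how Sniffles works; the `Segment` class, the function names, and the 1 kb tolerance are hypothetical. It flags a read whose two outer segments are adjacent on the reference while the middle segment maps somewhere else, which is the pattern a pairwise comparison of neighbouring segments would see as a huge jump.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    read_start: int  # segment coordinates on the read
    read_end: int
    chrom: str       # segment coordinates on the reference
    ref_start: int
    ref_end: int


def pairwise_jumps(segments):
    """Reference distance between consecutive read segments on the same chromosome."""
    ordered = sorted(segments, key=lambda s: s.read_start)
    return [b.ref_start - a.ref_end
            for a, b in zip(ordered, ordered[1:]) if a.chrom == b.chrom]


def looks_like_duplicated_insertion(segments, max_flank_gap=1000):
    """True if the outer segments are adjacent on the reference and the middle one maps elsewhere."""
    if len(segments) != 3:
        return False
    left, middle, right = sorted(segments, key=lambda s: s.read_start)
    if left.chrom != right.chrom:
        return False
    if abs(right.ref_start - left.ref_end) > max_flank_gap:
        return False  # the flanks are not adjacent, so this is a different pattern
    if middle.chrom != left.chrom:
        return True
    distance = min(abs(middle.ref_start - left.ref_end),
                   abs(middle.ref_end - right.ref_start))
    return distance > max_flank_gap


# The scenario from the thread: chr1:44,000,000-44,007,000 copied to chr1:100,000.
example_read = [
    Segment(0, 5000, "chr1", 95_000, 100_000),
    Segment(5000, 12_000, "chr1", 44_000_000, 44_007_000),
    Segment(12_000, 17_000, "chr1", 100_000, 105_000),
]
print(pairwise_jumps(example_read))
print(looks_like_duplicated_insertion(example_read))
```

Seen pairwise, the first jump spans about 43.9 Mb and resembles a deletion; only the three segments together show that the flanks meet at the same reference position. As noted above, real reads may still produce ambiguous cases that a simple check like this cannot resolve.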

fritzsedlazeck commented on August 12, 2024

No problem.
I hope that helps.
Thanks
Fritz
