
Comments (3)

fritzsedlazeck commented on September 15, 2024

Dear Sangjin Lee,
I think that is a clever approach.
At 90x coverage you will often find artifacts that simply accumulate. They will likely be marked as IMPRECISE and often have a genotype of 0/0, indicating that very few reads, or only a low fraction of reads, support them. The AF= tag gives you the allele frequency of each event.
Without knowing your sample, I assume you will find a lot of false insertion and translocation calls.
Translocations can be evidence that some contigs/scaffolds could be joined further.
Insertions sometimes accumulate due to base-calling errors or polymerase stuttering during sequencing. PacBio is working on it, and it should improve in the next version of the basecaller.

0/1 SVs are expected, as these are the heterozygous SVs. As you know, most assemblers report only one sequence as a representation.

So what I would do is focus on the 1/1 calls. These should mark problems coming either from NGMLR + Sniffles or from the assembly. You might want to pinpoint these regions and also look at the short-read evidence for them.
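As a rough illustration of that triage, here is a minimal Python sketch that keeps only the 1/1 calls that are not flagged IMPRECISE and reports their AF value. It assumes a single-sample VCF in which GT sits in the sample column and IMPRECISE, SVTYPE and AF appear in the INFO column; the function names are made up for this example and are not part of Sniffles.

```python
def parse_info(info_field):
    """Split a VCF INFO string into a dict; flag-only keys map to True."""
    info = {}
    for item in info_field.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def confident_homozygous_calls(vcf_lines):
    """Yield (chrom, pos, svtype, af) for 1/1 calls not flagged IMPRECISE.

    Assumes one sample column; header lines (starting with '#') are skipped.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns, so no genotype to inspect
        info = parse_info(fields[7])
        sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
        if sample.get("GT") != "1/1" or "IMPRECISE" in info:
            continue
        yield fields[0], int(fields[1]), info.get("SVTYPE", "NA"), info.get("AF", "NA")
```

The resulting coordinates could then be loaded into a genome browser next to the short-read alignments. Usage would look like `with open("calls.vcf") as fh: hits = list(confident_homozygous_calls(fh))`, where `calls.vcf` is a placeholder file name.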

I hope this helps. Otherwise feel free to ping me and we can chat.
Cheers
Fritz


sjin09 commented on September 15, 2024

Hey Fritz,

Thank you for the swift and really helpful response. I would love to discuss this further with you, but I would like to do so in private, as the data is unpublished. If there are any additional problems, can I send you an email at your Baylor College of Medicine address?

I will concentrate on the homozygous genotypes for now and visualize the long-read and short-read BAM files in IGV. I guess I could also increase the minimum number of supporting reads required for a call to reduce the number of false-positive variants.
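One way to try a stricter read-support threshold without re-running the caller is to post-filter the existing VCF. The sketch below is only an illustration: it assumes the read support is stored as an integer INFO tag, and the default tag name `RE` is an assumption that should be checked against the `##INFO` lines in the VCF header. The function name is hypothetical.

```python
def passes_min_support(vcf_line, min_support, support_tag="RE"):
    """Return True if the record's read-support INFO tag is >= min_support.

    support_tag is an assumed tag name; records lacking the tag are rejected.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    for item in fields[7].split(";"):
        key, sep, value = item.partition("=")
        if key == support_tag and sep:
            return int(value) >= min_support
    return False
```

Applied to every non-header line, this makes it easy to compare how many calls survive at different thresholds before settling on one.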

Best,
Sangjin Lee


fritzsedlazeck commented on September 15, 2024

Yes, feel free to ping me. The fastest way is my Gmail: [email protected], but the BCM address also works.

I think it would be great to put our heads together and come up with a small pipeline that can assess this automatically. Just an idea, but I am happy to discuss further options.
Cheers
Fritz

