Comments (6)

shilpagarg commented on August 14, 2024

Thanks for pointing this out. Yes, I have seen the invalid pointer error in non-human assemblies. I am working on this and will provide an update soon.

from dipasm.

shilpagarg commented on August 14, 2024

Please try https://pstools.s3.us-east-2.amazonaws.com/pstools_1.

lukemn commented on August 14, 2024

Works, thanks!

I get 242 Mb of hap1, 32 Mb of hap2, and 62 Mb in broken_nodes. hap1 is much bigger than expected; there may be some bacterial contamination in there. I have genetic map-based pseudochromosomes from other assemblies, so I'll go through these files to see what looks sensible.
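
For anyone wanting to run the same kind of size check on their own output, here is a minimal sketch. It assumes the outputs are plain, uncompressed FASTA files; the function names and the optional `max_expected` threshold are hypothetical and are not part of pstools.

```python
import sys
from typing import Dict, List, Optional, TextIO, Tuple


def fasta_lengths(handle: TextIO) -> Dict[str, int]:
    """Return {sequence name: length in bases} for a FASTA stream."""
    lengths: Dict[str, int] = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            fields = line[1:].split()
            name = fields[0] if fields else f"unnamed_{len(lengths)}"
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def summarize(path: str, max_expected: Optional[int] = None) -> Tuple[int, List[str]]:
    """Return (total bases, names of sequences longer than max_expected)."""
    with open(path) as fh:
        lengths = fasta_lengths(fh)
    total = sum(lengths.values())
    oversized = sorted(
        n for n, size in lengths.items()
        if max_expected is not None and size > max_expected
    )
    return total, oversized


if __name__ == "__main__":
    # Usage: python fasta_sizes.py file1.fa file2.fa ...
    for fasta_path in sys.argv[1:]:
        total_bases, _ = summarize(fasta_path)
        print(f"{fasta_path}\t{total_bases / 1e6:.1f} Mb")
```

Passing a `max_expected` value (for example, the length of the largest known chromosome) lists any sequence that is longer than it, which is a quick way to spot suspicious joins.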

Also, I guess you plan to get to this eventually, but can you say something about what pstools is doing relative to the previous Docker pipeline?

Is there a good reason not to use the primary hifiasm contigs (or other assemblies) instead of the raw unitigs?

shilpagarg commented on August 14, 2024

Good to know. The pstools method is purely graph-based, without any haplotype collapses, and enables routine production of phased sequences. I will be happy to help further if you could send me an email. As I mentioned, I have only tested it on human data, but it will be interesting to see how it does on other genomes.

Working on unitigs is better than working on contigs because it avoids any random cross-chromosome or long-range chromosome connections. Instead, Hi-C information can be used to disentangle such cases in the graph.

shilpagarg commented on August 14, 2024

Yes, I agree that it depends on the characteristics of the genome. Specifically, Hi-C is helpful for genomes with complex centromeres, for example, human. For small genomes with no centromeres, I understand HiFi would be good enough. Another aspect is cost-effectiveness. IMO there is no generalized method that is best for every genome.

zhoudreames commented on August 14, 2024

Yes, I agree that it depends on the characteristics of the genome. Specifically, Hi-C is helpful for genomes with complex centromeres, for example, human. For small genomes with no centromeres, I understand HiFi would be good enough. Another aspect is cost-effectiveness. IMO there is no generalized method that is best for every genome.

I ran my project again using pstools_1, but I got an erroneous result: the length of scaffold_0l_hap1 is ~1.5 Gb, longer than the biggest chromosome (~300 Mb). Why is this?
