Comments (5)

jeizenga commented on September 26, 2024

Although vg sim can run with long read input, it's really designed for short reads. If you use it to generate long reads, you won't get very realistic errors or a realistic read length distribution. In our own testing and development, we've used pbsim to simulate long reads. You would probably want to generate the reads from FASTAs of sample haplotypes, rather than directly from the GBZ file.

from vg.

tanger-code commented on September 26, 2024

Thank you!
Can I also use vg sim with the .gbz file to generate short reads, using vg sim -x graph.xg -g graph.gbz -m SAMPLE -n 1000 -l 150 -a > SAMPLE.gam ?
I now have a .gbz file for the pangenome graph of all chromosomes, and I want to generate short reads for chr21 only. Do I need to extract a chr21-only .gbz file first? I couldn't find a related command.

tanger-code commented on September 26, 2024

> Although vg sim can run with long read input, it's really designed for short reads. If you use it to generate long reads, you won't get very realistic errors or a realistic read length distribution. In our own testing and development, we've used pbsim to simulate long reads. You would probably want to generate the reads from FASTAs of sample haplotypes, rather than directly from the GBZ file.

I'm simulating long reads with pbsim3, and the output is a .maf file. If I want to run a simulation experiment, such as calling SVs from the simulated reads, can I use the MAF file as the truth set, or should I use a public truth set?

Do you have any suggestions?

jeizenga commented on September 26, 2024

Looking through our script, it seems that we used the maf2sam subcommand of bioconvert.
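
For a quick look at where each simulated read came from, the MAF records can also be read directly, without converting to SAM first. The following is a minimal sketch, not the script mentioned above. It assumes the simulator writes one pairwise block per read, with the reference "s" line first (on the "+" strand) and the read "s" line second; `ReadOrigin` and `parse_maf_origins` are hypothetical names, and the sample text is made up for illustration.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class ReadOrigin(NamedTuple):
    read_name: str
    ref_name: str
    ref_start: int  # 0-based start on the reference, as in the MAF "s" line
    ref_end: int    # exclusive end: start + aligned size
    strand: str     # strand field taken from the read's "s" line


def _to_origin(ref: List[str], read: List[str]) -> ReadOrigin:
    # MAF "s" line fields: s, name, start, size, strand, srcSize, text.
    # Assumption: the reference line is on "+", so start needs no flipping.
    start = int(ref[2])
    return ReadOrigin(read[1], ref[1], start, start + int(ref[3]), read[4])


def parse_maf_origins(handle: TextIO) -> Iterator[ReadOrigin]:
    """Yield one ReadOrigin per pairwise MAF block (reference first, read second)."""
    rows: List[List[str]] = []
    for line in handle:
        fields = line.split()
        if fields and fields[0] == "s":
            rows.append(fields)
            continue
        # Any other line (blank, "a", comment) closes the current block.
        if len(rows) >= 2:
            yield _to_origin(rows[0], rows[1])
        rows = []
    if len(rows) >= 2:  # last block when the file has no trailing blank line
        yield _to_origin(rows[0], rows[1])


# Made-up two-block example, not real simulator output.
sample = (
    "a\n"
    "s ref 100 8 + 5000 ACGTACGT\n"
    "s S1_1 0 7 - 7 ACGT-CGT\n"
    "\n"
    "a\n"
    "s ref 40 4 + 5000 ACGT\n"
    "s S1_2 0 4 + 4 ACGT\n"
)
origins = list(parse_maf_origins(io.StringIO(sample)))
for origin in origins:
    print(origin)
```

This only recovers the interval each read was drawn from on the sequence it was simulated from; it says nothing about variants by itself.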

adamnovak commented on September 26, 2024

@tanger-code If you want to simulate from just one named path in the graph, you can use the -P option to vg sim.

But that simulates from just that path; it won't include variants in the graph that leave the embedded path.

I don't think we have a way to simulate from the connected component of the graph that contains a path, other than using vg chunk --components -p name-of-path to pull out that subgraph and then simulating from it.
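
The two options above can be sketched as command lines assembled in Python. This is an illustration under assumptions, not a tested recipe: the helper names are hypothetical, the -x, -n, -l and -a flags are copied from the command earlier in this thread, passing the graph to vg chunk with -x is an assumption to check against `vg chunk --help`, and how vg chunk names its output is not covered here. The commands are only printed, not run.

```python
import shlex
from typing import List


def sim_from_path_cmd(xg: str, path_name: str,
                      n_reads: int = 1000, read_len: int = 150) -> List[str]:
    """Build a vg sim command that simulates only from one named path (-P)."""
    return ["vg", "sim", "-x", xg, "-P", path_name,
            "-n", str(n_reads), "-l", str(read_len), "-a"]


def chunk_component_cmd(graph: str, path_name: str) -> List[str]:
    """Build a vg chunk command for the component containing a path.

    Assumption: the input graph is passed with -x; verify with the tool's help.
    """
    return ["vg", "chunk", "-x", graph, "--components", "-p", path_name]


# Option 1: reads from the chr21 path only (no variants off that path).
path_cmd = sim_from_path_cmd("graph.xg", "chr21")
print(shlex.join(path_cmd), "> chr21.gam")

# Option 2: first pull out the connected component containing chr21,
# then simulate from the resulting subgraph in a second step.
chunk_cmd = chunk_component_cmd("graph.xg", "chr21")
print(shlex.join(chunk_cmd))
```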
