
simug's People

Contributors

yjx1217


simug's Issues

Does simuG generate paired end data?

Hi again,

This is not an issue but a query. Does simuG generate paired-end data? Can I specify coverage? And can I restrict the simulation to a targeted region? I couldn't find answers to these questions in the manual.

Regards
Mrinal

slow CNV creation

Hi there,

I am giving your tool a try as it looks very simple to run and it seems to do exactly what I want for simulating CNV changes in germline nanopore reads.

I am running with this command:

simuG.pl -r hg38_no_alt.fa -cnv_count 50 -cnv_min_size 500 -cnv_max_size 300000000

But this has been running for 8 days. Do you have any tips to make this run faster?

Changing power-law distribution alpha and constant

Hi,

I am trying to change the power-law distribution alpha and constant values to introduce random indels, but I keep getting the following error for values other than the default values of alpha = 2.0 and constant = 0.5:

Introducing random INDELs based on the following parameters:

indel_count = 300
ins_del_ratio = 1
indel_size_powerlaw_alpha = 3.0
indel_size_powerlaw_constant = 1.0
Argument "9.42357754532298e" isn't numeric in numeric ge (>=) at simuG/simuG.pl line 1242.

I was hoping you could look into it!

One more note: the manual gives the power-law-fitted indel size distribution as p = C * (size) ** (alpha), but in the simuG script (line 1616) the distribution is actually set to p = C * (size) ** (-alpha). Maybe this is a typo in the manual?

Best,
Meera
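To make the sign question above concrete, here is a minimal Python sketch of sampling indel sizes with p = C * (size) ** (-alpha). It is not simuG's code: the function names, the `max_size` cap of 50, and the normalization step are all assumptions made for illustration. It also shows why values other than the defaults can matter: with alpha = 3.0 the raw probabilities for larger sizes become very small, and the truncated value in the error message resembles such a number printed in scientific notation and cut off before its exponent. That reading is only a guess from the message.

```python
import random


def indel_size_probabilities(alpha, constant, max_size=50):
    """Return {size: p} for p = constant * size ** (-alpha), normalized to sum to 1.

    max_size and the normalization are assumptions of this sketch,
    not values taken from simuG.
    """
    raw = {size: constant * size ** (-alpha) for size in range(1, max_size + 1)}
    total = sum(raw.values())
    # After normalization the constant cancels out; it only matters
    # when the raw (unnormalized) values are used directly.
    return {size: p / total for size, p in raw.items()}


def sample_indel_sizes(count, alpha, constant, max_size=50, seed=None):
    """Draw `count` indel sizes according to the power-law weights."""
    rng = random.Random(seed)
    probs = indel_size_probabilities(alpha, constant, max_size)
    sizes = list(probs)
    weights = [probs[s] for s in sizes]
    return rng.choices(sizes, weights=weights, k=count)


if __name__ == "__main__":
    # Raw value for a large size with the parameters from the report:
    # small enough that many languages print it with an exponent.
    print(1.0 * 47 ** (-3.0))
    print(sample_indel_sizes(10, alpha=3.0, constant=1.0, seed=1))
```

With the negative exponent, small indels dominate and large ones are rare, which matches the behaviour described for the script rather than the formula quoted from the manual.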

Uncertainty in description

Hello!
You have done a great job!
Please take a look at the description of the -indel_size_powerlaw_constant option. It looks like the description was copied from the previous option and never updated.

Best regards
Asan

Use of uninitialized value in substr at simuG.pl line 1164.

I ran the following command with simuG and received the following message during execution; no output was created.

perl simuG.pl -refseq -snp_vcf -prefix

[Thu Jul 21 10:12:15 2022]
Starting simuG ..

[Thu Jul 21 10:12:15 2022]
Check specified options ..
Running simuG for SNP/INDEL simulation >>
Ignore all options for CNV/inversion/translocation simulation.

This simulation use the random seed: 1864782145

The option snp_vcf has been specified: snp_vcf =
Ignore incompatible option: snp_count
Ignore incompatible option: snp_model
Ignore incompatible option: titv_ratio

[Thu Jul 21 10:12:16 2022]
Parsing the input vcf file:
Use of uninitialized value in substr at /scratch/rdeshmuk/simuG/simuG.pl line 1164.
[Thu Jul 21 12:10:13 2022]
Introducing defined SNP/INDELs based on the input vcf file(s):

snp_vcf =

Could you tell me what the issue is and how I can solve this?

Multicore execution

Do you plan to add parallel (multi-core) processing to simuG?
When I used it, it ran on a single core only.

Use of uninitialized value in substr at [...] line 1158

Hi! I am trying out this package, and so far it has worked smoothly for generating random SNPs. However, when I try to generate SNPs from a VCF file, I get an error:

simug -refseq ../NC_002945v4.fasta -snp_vcf ABCD_modified.vcf -prefix ABCD

[simug is the alias I use to run the script from my path]

[Tue Jun 16 15:34:24 2020]
Starting simuG ..

[Tue Jun 16 15:34:24 2020]
Check specified options ..
Running simuG for SNP/INDEL simulation >>
Ignore all options for CNV/inversion/translocation simulation.

This simulation use the random seed: 1128527723

The option snp_vcf has been specified: snp_vcf = ABCD_modified.vcf
Ignore incompatible option: snp_count
Ignore incompatible option: snp_model
Ignore incompatible option: titv_ratio

[Tue Jun 16 15:34:25 2020]
Parsing the input vcf file: ABCD_modified.vcf

!!! Warning! Multiple alternative variants found at the same site:
!!! NC_002945.4:3884276 GGGCCGGGGGCGCCGGCGA=>G,GGGCCGGGGGCGCCGGCGG QUAL=0!
!!! Ignore all variants at this site.

[Tue Jun 16 15:34:25 2020]
Introducing defined SNP/INDELs based on the input vcf file(s):

snp_vcf = ABCD_modified.vcf
Use of uninitialized value in substr at /home/victor/simuG/simuG.pl line 1158.

I have also tried pasting the SNPs in the SNP.vcf file in the testing directory, but still get the same error. Even when I delete the duplicate variant, I still get the same error:

snp_vcf = ABCD_modified.vcf
Use of uninitialized value in substr at /home/victor/simuG/simuG.pl line 1158.

What could it be? I am using Ubuntu 18.04.

Thanks!
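The thread does not say what triggers the "uninitialized value in substr" message. One thing that can be ruled out before running the simulation is a mismatch between the VCF and the reference. The sketch below is a hypothetical pre-flight check in Python, not part of simuG. It assumes the sequence name is the first whitespace-delimited token of each FASTA header, and it flags VCF records whose CHROM is missing from the reference, whose POS falls outside the sequence, or that list several ALT alleles (which the log above shows simuG ignoring).

```python
def fasta_lengths(path):
    """Map each sequence name (first token of the header) to its length."""
    lengths = {}
    name = None
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:].split()[0]
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def check_vcf_against_fasta(vcf_path, fasta_path):
    """Return a list of (line_number, message) for suspicious VCF records."""
    lengths = fasta_lengths(fasta_path)
    problems = []
    with open(vcf_path) as fh:
        for lineno, line in enumerate(fh, 1):
            if line.startswith("#") or not line.strip():
                continue  # skip header and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 5:
                problems.append((lineno, "fewer than 5 tab-separated columns"))
                continue
            chrom, pos, ref, alt = fields[0], fields[1], fields[3], fields[4]
            if chrom not in lengths:
                problems.append((lineno, f"CHROM {chrom!r} not found in FASTA"))
            elif not pos.isdigit() or not 1 <= int(pos) <= lengths[chrom]:
                problems.append((lineno, f"POS {pos} outside {chrom}"))
            elif int(pos) + len(ref) - 1 > lengths[chrom]:
                problems.append((lineno, "REF allele runs past the sequence end"))
            if "," in alt:
                problems.append((lineno, "multiple ALT alleles at one site"))
    return problems
```

Calling `check_vcf_against_fasta("ABCD_modified.vcf", "../NC_002945v4.fasta")` with the file names from the command above would list any records worth inspecting. An empty list does not prove the input is acceptable to simuG.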

Simulating heterozygous SNPs?

Does this tool only simulate homozygous SNPs or is there an easy way to specify some proportion as homozygous and some proportion heterozygous?

bwa not generating sam file when provided with simulator's output

Hi,

I tried to simulate chr7.fa with 1000 random SNPs using the command:

perl simuG.pl -refseq chr7.fa -snp_count 1000 -prefix simulated

Then I tried to generate a SAM file using bwa, but got the following error:

[image]

I would appreciate any pointers.

Please note that bwa worked fine with the original chr7.fa.
Regards
Mrinal

Issue with -gene_gff file.gff -coding_partition_for_snp_simulation "coding"

Hi There,

simuG looks like a fantastic tool -- I'm very keen to use it to randomise several genomes. I'm trialling a few bacterial genomes and have generated GFFs from the EMBL/Genbank annotations, but get an error:

Use of uninitialized value $mRNA_id in hash element at simuG.pl line 954, <$fh> line 1.

Is it possible that simuG requires a more specific GFF format than what we are using? If so, is this documented anywhere? I was hoping it only required the CDS+strand regions to be known (which I can easily generate from EMBL files), rather than full mRNA/spliced models.

Best wishes,

Paul.

Uninitialized value when using gene gff option

Hi! This is a great program! I am getting an error when using the gene-gff option. I am hoping to create SNPs in the hg38 genome. However, with the Ensembl 100 file "Homo_sapiens.GRCh38.100.chr.gff3.gz", it keeps returning the following error.

Use of uninitialized value $primary_mRNA_id in hash element at scripts/simuG/simuG.pl line 991.

I made sure that the chromosome naming format matches the one in the Ensembl GFF, but it still results in this error. Do you have any suggestions for resolving it?

Thank you!
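Both GFF reports above mention an mRNA identifier ($mRNA_id, $primary_mRNA_id) being undefined, which suggests the script looks up an mRNA record for each coding feature. The thread does not confirm what GFF structure simuG requires, so the following Python sketch is only a diagnostic under that assumption. It counts feature types in a GFF3 file and reports how many CDS features lack a Parent that is an mRNA feature. The function names are hypothetical.

```python
from collections import Counter


def parse_attributes(column9):
    """Parse the GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for item in column9.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def summarize_gff3(path):
    """Return (feature type counts, number of CDS without an mRNA parent)."""
    feature_counts = Counter()
    id_to_type = {}
    cds_parents = []
    with open(path) as fh:
        for line in fh:
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments, and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 9:
                continue  # not a well-formed feature line
            ftype = fields[2]
            attrs = parse_attributes(fields[8])
            feature_counts[ftype] += 1
            if "ID" in attrs:
                id_to_type[attrs["ID"]] = ftype
            if ftype == "CDS":
                # Parent may hold several comma-separated IDs.
                parents = attrs.get("Parent", "")
                cds_parents.append([p for p in parents.split(",") if p])
    orphans = sum(
        1
        for parents in cds_parents
        if not any(id_to_type.get(p) == "mRNA" for p in parents)
    )
    return feature_counts, orphans
```

If the second return value is non-zero, or the counts show no mRNA features at all (as in a CDS-only GFF derived from EMBL/GenBank records), that would be consistent with the errors reported here, though it is not proof of the cause.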
