diriano / ploidyngs
License: GNU General Public License v3.0
Explore ploidy levels from NGS data alone
When I test ploidyNGS using the file you provided, I get the following error:
Traceback (most recent call last):
File "./ploidyNGS.py", line 90, in <module>
for l in pysam.idxstats(args.bam).split('\n'):
AttributeError: 'list' object has no attribute 'split'
I do get these two files, but the PDF contains no histogram:
diploidTest_depth100.tab
diploidTest_depth100.tab.PloidyNGS.pdf
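The traceback shows that `pysam.idxstats` returned a list here, while the script calls `.split('\n')` as if it were a string. A minimal workaround sketch, assuming the output is either a newline-separated string or a list of lines, could normalize both forms before parsing. The helpers `idxstats_lines` and `mapped_reads` are hypothetical and not part of ploidyNGS.

```python
def idxstats_lines(raw):
    """Return idxstats output as a list of non-empty lines.

    Accepts either one newline-separated string or a list of lines,
    so the caller does not depend on which form the library returns.
    """
    if isinstance(raw, bytes):
        raw = raw.decode()
    if isinstance(raw, str):
        raw = raw.split("\n")
    return [line.rstrip("\n") for line in raw if line.strip()]


def mapped_reads(raw):
    """Sum the mapped-read column (third tab-separated field) over all references."""
    total = 0
    for line in idxstats_lines(raw):
        fields = line.split("\t")
        total += int(fields[2])
    return total
```

With this in place, the loop at line 90 could iterate over `idxstats_lines(pysam.idxstats(args.bam))` instead of splitting directly.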
Hello,
The .tab output file is generated but remains empty, and no errors or warnings are printed.
I am trying to figure out the reason; both the installation and the test went well.
Thank you for your time.
Dear developers of ploidyNGS,
I have been trying to run the script simulatePloidyData.py to generate a chromosome with heteromorphic loci at the given heterozygosity.
After getting past some trivial errors related to Python versions (2 vs 3) and some straightforward uses of deprecated functions, I ran into more complex errors. First, the script cannot handle ploidy levels below 2 (e.g. --ploidy 1). Second, I get a ValueError when I run the script with ploidy 2 (see below).
Please advise on how to fix these errors or possible work-arounds.
Thank you
Best regards,
Abhijeet Shah
ploidy error:
python ~/Downloads/ploidyNGS/simulation/simulatePloidyData.py --genome MyChromosome.fasta --heterozygosity 0.01 --ploidy 1
Traceback (most recent call last):
File "/Downloads/ploidyNGS/simulation/simulatePloidyData.py", line 98, in <module>
randomDosageTable = random.randint(0,lenDosageTable-1)
File "/miniconda3/envs/ploidy/lib/python3.10/random.py", line 370, in randint
return self.randrange(a, b+1)
File "/miniconda3/envs/ploidy/lib/python3.10/random.py", line 353, in randrange
raise ValueError("empty range for randrange() (%d, %d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (0, 0, 0)
ValueError:
python ~/Downloads/ploidyNGS/simulation/simulatePloidyData.py --genome MyChromosome.fasta --heterozygosity 0.01 --ploidy 2
Traceback (most recent call last):
File "/Downloads/ploidyNGS/simulation/simulatePloidyData.py", line 102, in <module>
GenomeAlph.remove(base)
ValueError: list.remove(x): x not in list
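Both tracebacks come from standard-library calls failing on their inputs: `random.randint(0, n - 1)` raises when `n` is 0, and `list.remove(x)` raises when `x` is not in the list. A minimal sketch of defensive versions follows. How the script builds its dosage table and alphabet is not shown above, so `pick_index` and `alternatives_for` are hypothetical helpers, and the idea that lowercase or ambiguous bases (such as `N`) trigger the second error is only a guess.

```python
import random


def pick_index(table_len, rng=random):
    """Pick a random index into a table, failing with a clear message if it is empty."""
    if table_len < 1:
        raise ValueError("dosage table is empty; nothing to sample from")
    return rng.randint(0, table_len - 1)


def alternatives_for(base, alphabet=("A", "C", "G", "T")):
    """Return the alphabet without `base`.

    Unlike list.remove, this does not raise when `base` is lowercase
    or is a symbol outside the alphabet (for example 'N').
    """
    base = base.upper()
    return [b for b in alphabet if b != base]
```

A guard like `pick_index` does not make ploidy 1 work; it only replaces the opaque `randrange` error with a message that points at the empty table.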
Your code does not run under Python 3. Python 2 has already reached end of life.
When I run the haploid test data it gives a ploidy of 6 when using -g option but the histogram shows a clear haploid ploidy, can you check that?
The simulation page mentions a useful-sounding script called explorePloidyNGS.py, but I can't seem to find it in the repository.
My program hung for a long time at the XXXsorted.bam.bai file, with no error and no increase in the size of that file.
Hi,
I have been running ploidyNGS for a whole day and would like to know how long it takes to finish a job. Can it be multi-threaded? ploidyNGS just keeps running and does not produce any output or messages.
Best,
Quan
Hello,
I receive a different result for the test data and was hoping for your input. When running the test data and guessing ploidy, it guesses 6. This has been recreated on two independent machines, following your instructions. Both cases return ploidy of 6.
When running ./ploidyNGS.py --guess_ploidy --out myTest/DataTestPloidy1_guessPloidy --bam test_data/HaploidGenome/Ploidy1.bowtie2.sorted.bam
I get the output:
This is ploidyNGS version v3.1.2
Current date and time: Mon Oct 31 11:44:30 2022
BAM index present... OK!
Number of mapped reads from BAM: 206062
Observed average coverage: 51.44
Number of heteromorphic positions in NC_001133.9 : 5936
Total number of heteromorphic positions: 5936
Coverage used for guessing ploidy: 50
After comparing your data with our simulated dataset
and computing the Kolmogorov-Smirnov distance,
the closest ploidy to yours is 6
Do you know what is happening? Thank you for your time.
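The output above says the guess is made by computing a Kolmogorov-Smirnov distance against simulated datasets and picking the closest one. For readers unfamiliar with that statistic, here is a minimal sketch of a two-sample KS distance and a nearest-reference choice. It is not the ploidyNGS implementation; which values the tool actually compares, and the function names `ks_distance` and `closest_ploidy`, are assumptions made for illustration.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    This is the largest absolute gap between the two empirical
    cumulative distribution functions, checked at every observed value.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    largest = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest = max(largest, abs(cdf_a - cdf_b))
    return largest


def closest_ploidy(observed, references):
    """Return the key of the reference sample with the smallest KS distance.

    `references` maps a ploidy level to a list of simulated values.
    """
    return min(references, key=lambda p: ks_distance(observed, references[p]))
```

Because the smallest distance always wins, a guess is returned even when no reference fits well, so comparing the distances for all ploidy levels may be more informative than the single reported value.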
When I'm trying out either the test or one of my own files, I'm getting the following error:
Traceback (most recent call last):
File "./ploidyNGS.py", line 98, in <module>
...
TypeError: expected string or Unicode object, file found
I pinpointed the mistake to line 59 of ploidyNGS.py.
There the BAM file is opened, and at line 85 the file handle is passed to pysam. pysam expects the file name as a string instead. To fix this, I simply changed line 59 from
bamOBJ = open(args.bam,"r")
to bamOBJ = args.bam
The solution seems trivial, but the cause is not reflected in the traceback.
Dear ploidyNGS creator,
Thank you for your work and for maintaining this GitHub repository. Your test dataset ran fine on my installation.
I launched this command:
[userlocal@NTLT101 ploidyNGS]$ cd ~/ploidyNGS
[userlocal@NTLT101 ploidyNGS]$ source .venv/bin/activate
(.venv) [userlocal@NTLT101 ploidyNGS]$ ./ploidyNGS.py -o /PATH_OUTPUT/diploidTest -b /PATH_OUTPUT/all_sort.bam -d 50
###############################################################
###############################################################
No index available for pileup. Creating an index...
Number of mapped reads from BAM: 14590766
Killed
I suppose that my computer (not very powerful) ran out of memory. How can I prevent such problems?
1- It would be interesting to have a rough idea of the memory consumption and/or computation time for a given computer architecture.
2- Another option would be to have some warnings before launching the computation and/or an option to process the dataset in chunks.
I am thinking of writing a bash script based on samtools 1.4.x.
The planned steps are:
1- Split the BAM by contig.
2- If these contigs are too large (compared to your test dataset), split them into smaller BAM files.
3- Launch ./ploidyNGS.py in parallel from bash. I do not know whether this is possible given your special environment "(.venv)". Do you know if it is?
Cheers,
JB
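The planned steps above can be sketched as a small command builder. This is only an illustration under stated assumptions: the contig list is read from `samtools idxstats` text, `samtools view` is given a region (which requires the input BAM to be indexed), and the virtual environment's interpreter is called directly through the assumed path `.venv/bin/python` so that no activation is needed in each job. The function names are hypothetical; the `-o`, `-b` and `-d` options are the ones used in the command above.

```python
def contigs_from_idxstats(idxstats_text, min_mapped=1):
    """List contig names with at least `min_mapped` mapped reads.

    Expects the tab-separated text printed by `samtools idxstats`
    (name, length, mapped reads, unmapped reads).
    """
    names = []
    for line in idxstats_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 4 or fields[0] == "*":
            continue  # skip malformed lines and the unplaced-reads row
        if int(fields[2]) >= min_mapped:
            names.append(fields[0])
    return names


def commands_for(bam, contig, out_dir, depth=50, python=".venv/bin/python"):
    """Build the three commands for one contig: extract, index, run ploidyNGS."""
    sub_bam = f"{out_dir}/{contig}.bam"
    return [
        ["samtools", "view", "-b", "-o", sub_bam, bam, contig],
        ["samtools", "index", sub_bam],
        [python, "ploidyNGS.py", "-o", f"{out_dir}/{contig}",
         "-b", sub_bam, "-d", str(depth)],
    ]
```

Each inner list can be passed to `subprocess.run`, and the per-contig groups can be distributed over a worker pool. Whether per-contig results are comparable to a whole-genome run is a separate question this sketch does not address.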
Dear Developer,
Sorry to raise an issue unnecessarily.
Below is what I got when I ran the tool:
(.venv) mml@MML:~/softwares/ploidyNGS$ ./ploidyNGS.py --guess_ploidy -o guess_test -b dedup_SS_BWA_reads.bam
###############################################################
###############################################################
BAM index present... OK!
Number of mapped reads from BAM: 3483177
Observed average coverage: 54.00
Coverage used for guessing ploidy: 50
After comparing your data with our simulated dataset
and computing the Kolmogorov-Smirnov distance,
the closest ploidy to yours is 3
And the image generated is attached below.
In the scientific literature there is confusion about whether this organism is haploid or diploid.
Some information about the organism:
Fungus, 12.5 Mb genome, Illumina MiSeq paired-end reads.
Please look at the image and let me know what you think.
Thanks
Bhagya C T
Hey there,
I want to run your software but can't, because it says: "This script requires Python version 2.7.8 or higher within major version 2", whereas Biopython (which your software needs: "from Bio.SeqRecord import SeqRecord") requires Python 3.6 or later: "Biopython requires Python 3.6 or later. Python 2.7 detected."
Is there any solution?
Hi,
I would like to know whether this software can be used for allopolyploid species.
Any help is much appreciated.
Thanks.
Best regards,
Dear all
I recently tested ploidyNGS and ran into an error message.
I suggest fixing ploidyNGS.py (line 229) as shown below.
original : cmdPloidyGraphRscript="ploidyNGS_generateHistogram.R "
fixed : cmdPloidyGraphRscript="Rscript ploidyNGS_generateHistogram.R "
Best,
Dear all,
I installed ploidyNGS and tested ./ploidyNGS.py -o diploidTest -b test_data/simulatedDiploidGenome/Ploidy2.bowtie2.sorted.bam
After running it, I got the attached output:
diploidTest_MaxDepth100_MinCov0.tab.PloidyNGS.pdf
which is somewhat different from https://github.com/diriano/ploidyNGS/tree/master/images/diploidTest_depth100.tab.PloidyNGS.png
Are there any problems with the installation process?
Greetings,
I'm trying to use ploidyNGS to make predictions about putatively haploid/diploid datasets. I followed the documentation and received the following error:
./ploidyNGS.py --guess_ploidy -o DF_genome -d 100000 -b DF_I_DNA_genome_SORTED.bam
###############################################################
###############################################################
BAM index present... OK!
Number of mapped reads from BAM: 31817941
Traceback (most recent call last):
File "./ploidyNGS.py", line 130, in <module>
averageCoverage=countTotalReads/countTotalPositions
ZeroDivisionError: integer division or modulo by zero
I'm guessing by the error output that the denominator in averageCoverage is zero, but that shouldn't be the case. My bam file contains a large eukaryotic genome with lots of scaffolds, but I wanted to test using the -d flag as a first pass before splitting bam into scaffolds. Any help is appreciated!
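The traceback confirms that `countTotalPositions` was zero when the average was computed, meaning no positions had been counted at that point. Why that happens for this BAM is not clear from the output. A minimal sketch, assuming only the two variable names from the traceback, shows a guard that replaces the crash with an explicit message; the function `average_coverage` is hypothetical and not part of ploidyNGS.

```python
def average_coverage(count_total_reads, count_total_positions):
    """Return reads per position, refusing to divide when no positions were counted."""
    if count_total_positions == 0:
        raise ValueError(
            "no positions were counted, so average coverage is undefined"
        )
    # float() keeps true division even under Python 2 integer semantics
    return float(count_total_reads) / count_total_positions
```

A guard like this does not fix the underlying cause, but it makes the failure easier to report.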