biotoolbox's People

Contributors

tjparnell

biotoolbox's Issues

Issue with .gff3 feature recognition in script get_intersecting_features.pl

Hello,

I'm trying to run get_intersecting_features.pl with a .gff3 file for mm9 
downloaded from UCSC as the --db, gene as the --feature, and an input file of 
peak calls. This is my call:
perl ~/src/biotoolbox/scripts/get_intersecting_features.pl --db /Users/Jeff/Desktop/Biotoolbox_031712/Genelists/upgenelist_mm9_refGene.gff3 --in /Volumes/External\ HardDisk_1/Bruneau\ Lab\ Files/Projects/B2B\ Project/WI_analysis/k27me3/k27me3_S2R3_allButD4R1wce_110728_023243/1e-12/ENRICHED_REGIONS_k27me3_S2R3_allButD4R1wce_110728_023243 --out ~/Desktop/test.txt --start 0 --stop 0 --extend 1000 --ref mid --feature gene

Here is what the .gff3 looks like:
####
chr1    refGene gene    3204563 3661579 .   -   .   Name=Xkr4;ID=Xkr4;status=Provisional
chr1    refGene mRNA    3204563 3661579 .   -   .   Name=NM_001011874;Parent=Xkr4;ID=NM_001011874;Alias=Xkr4;status=Provisional
chr1    refGene exon    3204563 3207049 .   -   .   Name=NM_001011874.exon2;Parent=NM_001011874;ID=NM_001011874.exon2
chr1    refGene exon    3411783 3411982 .   -   .   Name=NM_001011874.exon1;Parent=NM_001011874;ID=NM_001011874.exon1
chr1    refGene exon    3660633 3661579 .   -   .   Name=NM_001011874.exon0;Parent=NM_001011874;ID=NM_001011874.exon0
chr1    refGene three_prime_UTR 3204563 3206102 .   -   .   Name=NM_001011874.utr2;Parent=NM_001011874;ID=NM_001011874.utr2
chr1    refGene five_prime_UTR  3661430 3661579 .   -   .   Name=NM_001011874.utr0;Parent=NM_001011874;ID=NM_001011874.utr0

What is the expected output? What do you see instead?
When run, the script gives me this output:
###
A script to pull out overlapping features

 loaded file '/Volumes/External HardDisk_1/Bruneau Lab Files/Projects/B2B Project/WI_analysis/k27me3/k27me3_S2R3_allButD4R1wce_110728_023243/1e-12/ENRICHED_REGIONS_k27me3_S2R3_allButD4R1wce_110728_023243' with 8435 features
 Loading file into memory database...
 Requested feature 'gene' does not appear to be valid!
###

If I run the script without specifying a feature, it appears to list hash 
references rather than the real data set names:
 program to collect data for a list of features

 Generating a new feature list from database '/Users/Jeff/Desktop/Biotoolbox_031712/Genelists/upgenelist_mm9_refGene.gff3'...
 Loading file into memory database...
   Searching for gene
   Found 502 features in the database.
   Kept 502 features.
 Loading file into memory database...

 These are the available data sets in the database:
  1 HASH(0x100d57c60)
  2 HASH(0x1099419c0)


What version of the product are you using? On what operating system?
###
Biotoolbox script get_intersecting_features.pl, version 1.4.3

Please provide any additional information below.
I have used the .gff3 file successfully for running map_relative_data.pl and it 
worked fine so I don't think it is the database file.

Sorry for the trouble.  Any insight you would have would be very useful.  These 
are great scripts!

Jeff
Graduate Student
Bruneau Lab/Gladstone Institutes

Original issue reported on code.google.com by [email protected] on 25 Mar 2012 at 2:01

Truncated output, leftover files in bam2wig.pl

I am running bam2wig.pl with the following options
--extend --rpm --nope --shift --shiftval -21 --extval 50 --cpu 8 --nogz --bw

It seems like the negative shift value causes problems around the beginning of the chromosomes.
The whole output looks like this:

 Writing fixedStep format in 10 bp bins
 Alignments will be extended by 50 bp
 Forking into 8 children for parallel conversion
Modification of non-creatable array value attempted, subscript -2 at /usr/local/bin/bam2wig.pl line 2436.
Modification of non-creatable array value attempted, subscript -2 at /usr/local/bin/bam2wig.pl line 2436.
Modification of non-creatable array value attempted, subscript -1 at /usr/local/bin/bam2wig.pl line 2436.
Modification of non-creatable array value attempted, subscript -1 at /usr/local/bin/bam2wig.pl line 2436.
Modification of non-creatable array value attempted, subscript -2 at /usr/local/bin/bam2wig.pl line 2436.
Modification of non-creatable array value attempted, subscript -2 at /usr/local/bin/bam2wig.pl line 2436.
Modification of non-creatable array value attempted, subscript -1 at /usr/local/bin/bam2wig.pl line 2436.
Modification of non-creatable array value attempted, subscript -2 at /usr/local/bin/bam2wig.pl line 2436.
Modification of non-creatable array value attempted, subscript -2 at /usr/local/bin/bam2wig.pl line 2436.
Modification of non-creatable array value attempted, subscript -2 at /usr/local/bin/bam2wig.pl line 2436.
Modification of non-creatable array value attempted, subscript -2 at /usr/local/bin/bam2wig.pl line 2436.
Modification of non-creatable array value attempted, subscript -1 at /usr/local/bin/bam2wig.pl line 2436.
 Normalizing depth based on 237,495,068 total counted alignments
 Merging temporary files
 Writing directly to bigWig converter
 generating chromosome file....
 no filename provided or associated with object! at /usr/local/share/perl/5.26.1/Bio/ToolBox/utility.pm line 248.
Can't call method "close" on an undefined value at /usr/local/bin/bam2wig.pl line 2016.

And there are more problems:

  • The resulting bigWig file contains signal only for chr1 and chr10, although the original BAM contains alignments for all chromosomes. (Temporary files for the other chromosomes existed during execution, but their values did not end up in the bigWig output.)
  • There are leftover files in the working directory: 12 files *.f.temp.wig and one chr_sizes* file

Bio::ToolBox version 1.54
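
The "subscript -2" and "subscript -1" messages are what Perl prints when a negative array index reaches back past the start of an array, which is consistent with the negative shift value pushing reads near the chromosome start below position 0. A minimal sketch of guarding against that, written in Python rather than the script's Perl; `bin_index`, `count_bins`, and the 0-based coordinates are assumptions for illustration, not bam2wig.pl internals:

```python
def bin_index(position, shift, bin_size):
    """Return the 0-based bin for a shifted position, or None if it falls before 0.

    Hypothetical helper; the real bam2wig.pl logic may differ.
    """
    shifted = position + shift
    if shifted < 0:
        # Skip the read instead of indexing with a negative number.
        return None
    return shifted // bin_size


def count_bins(positions, shift, bin_size, chrom_length):
    """Count shifted positions per bin, ignoring anything outside the chromosome."""
    n_bins = (chrom_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in positions:
        idx = bin_index(pos, shift, bin_size)
        if idx is not None and idx < n_bins:
            counts[idx] += 1
    return counts
```

With a shift of -21, positions 5 and 10 are dropped here rather than wrapping around to the end of the array.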

Error handling

I had a bunch of bam2wig.pl running and some of them got I/O errors (probably a local problem).

Nevertheless bam2wig.pl finished in each case and produced files with missing chromosomes!
I would rather it failed and produced no output than produced truncated output, which is hard to spot.

Errors could be due to my config. They look like this:

cannot write to file 'bam2wigTEMP_1evN/myfile.chr22.2118834.f.temp.bin' Input/output error
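
One common way to get the "fail rather than truncate" behaviour requested above is to write to a temporary file and rename it into place only after every write has succeeded. A minimal Python sketch of that pattern; `write_atomically` is a hypothetical name and this is not how bam2wig.pl is implemented:

```python
import os
import tempfile


def write_atomically(final_path, chunks):
    """Write chunks to a temp file and rename it into place only on success."""
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".partial")
    try:
        with os.fdopen(fd, "w") as handle:
            for chunk in chunks:
                handle.write(chunk)
            handle.flush()
            os.fsync(handle.fileno())
        # Reached only if every write succeeded.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Any error, including an I/O error, leaves no output file behind.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If any chunk fails to be produced or written, the exception propagates and the final file is never created.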

Memory Usage with bam2wig.pl

Hi Tim,

What steps will reproduce the problem?

I'm running bam2wig.pl to try and convert .bam files to .wig/.bw files in one 
step.  I've had success using the quick --coverage option, but I run into 
problems when I create bins rather than step at single bp resolution.  It seems 
that enabling the --bin option balloons my memory usage such that it uses all 
or nearly all memory available on my lab's server (~24GB) and the run typically 
fails.  I've noticed that the process_coverage subroutines appear to have a 
"dump" function that doesn't seem to be present in the process_alignment 
subroutine used when the --bin option is activated.  Could this have something 
to do with it?  Here is an example of a call that failed:

perl ~/source/biotoolbox/scripts/bam2wig.pl --in some.bam --position start 
--bin 50 --bw

 This program will convert bam alignments to enumerated wig data
 recording start positions
 Forking into 2 children for parallel conversion
Out of memory!


What version of the product are you using? On what operating system?

I'm working on a linux system but have had similar problems using MacOSX.
------
jeff@argus:~$ perl ~/source/biotoolbox/scripts/bam2wig.pl --version

 This program will convert bam alignments to enumerated wig data
 Biotoolbox script bam2wig.pl, version 1.12.1

Thanks for your help!

Jeff


Original issue reported on code.google.com by [email protected] on 21 Jul 2013 at 10:39
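
The report guesses that the binned mode lacks the periodic "dump" that the coverage mode has. Whether that is the actual cause is not confirmed here, but the general idea can be sketched: with coordinate-sorted input, each bin can be emitted as soon as the input moves past it, so only one bin is held in memory. A Python sketch under that assumption; `stream_bins` is a hypothetical name:

```python
def stream_bins(sorted_positions, bin_size):
    """Yield (bin_start, count) pairs from coordinate-sorted 0-based positions.

    Only the current bin is kept in memory; empty bins are not emitted.
    """
    current_bin = None
    count = 0
    for pos in sorted_positions:
        this_bin = pos // bin_size
        if this_bin != current_bin:
            if current_bin is not None:
                # The input has moved on, so this bin is complete.
                yield current_bin * bin_size, count
            current_bin = this_bin
            count = 0
        count += 1
    if current_bin is not None:
        yield current_bin * bin_size, count
```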

bam2wig error

What steps will reproduce the problem?
1. bam2wig conversion produces a wig file that can't be converted to bigWig

The command I use to generate wig file:
bam2wig.pl --pe --pos mid --strand --cpu 2 --in accepted_hits_paired.bam

Then I use wigToBigWig to convert wig to bigWig:
gzip -dc accepted_hits_paired_f.wig.gz | wigToBigWig stdin /chromInfo.txt 
accepted_hits_paired_f.bw

I received this error:
Overlap on chr1 between items starting at 19809404 and 19809404.
Please remove overlaps and try again

When I use "--bw", I also received conversion error message:

 converting accepted_hits_paired_f.wig to bigWig....
 Conversion failed. You should try manually and watch for errors
 Leaving temporary chromosome file 'chr_sizes_Lhxdl'
 BigWig conversion failed! see standard error for details


What version of the product are you using? On what operating system?
I tried both biotoolbox v1.12 and bam2wig.pl r631 in trunk, dated Jun 
18, 2013

Original issue reported on code.google.com by [email protected] on 5 Jul 2013 at 5:57
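
The wigToBigWig message reports two items starting at the same position. To locate such records before conversion, a small check like the following Python sketch could be run over the decompressed wig text. It assumes variableStep output (the wig format is not shown in the report) and does not handle fixedStep files; `find_duplicate_positions` is a hypothetical helper:

```python
def find_duplicate_positions(lines):
    """Return (chrom, position) pairs repeated on consecutive variableStep data lines."""
    duplicates = []
    chrom = None
    previous = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "#", "browser")):
            continue
        if line.startswith("variableStep"):
            # Declaration line, e.g. "variableStep chrom=chr1"
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = fields.get("chrom")
            previous = None
            continue
        position = int(line.split()[0])
        if position == previous:
            duplicates.append((chrom, position))
        previous = position
    return duplicates
```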

bam2wig.pl: Division by zero during rpm normalisation

Hi,

bam2wig.pl crashes during the normalisation step. Apparently $totals[0] at line 1805 is an empty string.
I have 9 bam files from the same genome, processed in exactly the same way. Three of them convert without problems, but the remaining six fail. I reran with -V and checked that all alignments are counted correctly; although some sequences have zero alignments, the total runs into the millions.

Version 1.68
Perl 5.32.1

bam2wig.pl -V --extend --rpm --nope --shift --shiftval $((-50/2 + 4)) --extval 50 --bin $binSize --in f1.mm10.sorted.bam --nogz --bw --out foo.bw
 This program will convert bam alignments to wig data
  Including secondary 0x100 reads
  Including duplicate 0x400 reads
  Including supplementary 0x800 reads
 Recording single-end shifted, extended alignment span
 Writing fixedStep format in 10 bp bins
 Writing temp files to /mnt/scratch/piotr/bam2wig_UpahTh/bam2wigTEMP_gloB
 Processing files ../../Mapped/f1.mm10.sorted.bam...
 Using the Bio::DB::Sam Bam adapter and align wrapper 
 Alignments will be extended by 50 bp
 Forking into 8 children for parallel conversion
  Converted 3,436,700 alignments on chr8 in 29 seconds
  Converted 3,643,578 alignments on chr7 in 31 seconds
  Converted 3,850,668 alignments on chr6 in 32 seconds
  Converted 3,997,847 alignments on chr5 in 34 seconds
  Converted 4,029,221 alignments on chr3 in 34 seconds
  Converted 4,069,201 alignments on chr4 in 35 seconds
  Converted 4,934,754 alignments on chr2 in 42 seconds
  Converted 4,941,632 alignments on chr1 in 42 seconds
  Converted 2,356,301 alignments on chr14 in 20 seconds
  Converted 3,393,314 alignments on chr9 in 29 seconds
  Converted 3,250,935 alignments on chr10 in 27 seconds
  Converted 3,037,476 alignments on chr12 in 26 seconds
  Converted 3,076,095 alignments on chr13 in 26 seconds
  Converted 95,714 alignments on chrY in 2 seconds
  Converted 3,564,766 alignments on chr11 in 30 seconds
  Converted 2,615 alignments on chr1_GL456210_random in 1 seconds
  Converted 2,516,301 alignments on chr16 in 21 seconds
  Converted 2,435 alignments on chr1_GL456212_random in 0 seconds
  Converted 3,538 alignments on chr1_GL456211_random in 0 seconds
  Converted 0 alignments on chr1_GL456213_random in 0 seconds
  Converted 2,019 alignments on chr1_GL456221_random in 0 seconds
  Converted 2,707,038 alignments on chr15 in 23 seconds
  Converted 43 alignments on chr4_JH584292_random in 0 seconds
  Converted 0 alignments on chr4_GL456350_random in 0 seconds
  Converted 1,396 alignments on chr4_GL456216_random in 0 seconds
  Converted 35 alignments on chr4_JH584295_random in 0 seconds
  Converted 0 alignments on chr4_JH584293_random in 0 seconds
  Converted 16 alignments on chr4_JH584294_random in 0 seconds
  Converted 2 alignments on chr5_JH584297_random in 0 seconds
  Converted 2 alignments on chr5_JH584296_random in 0 seconds
  Converted 282 alignments on chr5_GL456354_random in 0 seconds
  Converted 0 alignments on chr7_GL456219_random in 0 seconds
  Converted 0 alignments on chr5_JH584298_random in 0 seconds
  Converted 2,660 alignments on chr5_JH584299_random in 0 seconds
  Converted 0 alignments on chrY_JH584300_random in 0 seconds
  Converted 2 alignments on chrY_JH584301_random in 0 seconds
  Converted 3,801 alignments on chrX_GL456233_random in 0 seconds
  Converted 0 alignments on chrY_JH584302_random in 0 seconds
  Converted 0 alignments on chrY_JH584303_random in 0 seconds
  Converted 2,209 alignments on chrUn_GL456239 in 0 seconds
  Converted 506 alignments on chrUn_GL456359 in 0 seconds
  Converted 742 alignments on chrUn_GL456360 in 0 seconds
  Converted 1,154 alignments on chrUn_GL456366 in 0 seconds
  Converted 1,608,531 alignments on chr19 in 14 seconds
  Converted 509 alignments on chrUn_GL456367 in 0 seconds
  Converted 96 alignments on chrUn_GL456370 in 0 seconds
  Converted 462 alignments on chrUn_GL456372 in 0 seconds
  Converted 1,042 alignments on chrUn_GL456368 in 0 seconds
  Converted 266 alignments on chrUn_GL456382 in 0 seconds
  Converted 278 alignments on chrUn_GL456381 in 0 seconds
  Converted 870 alignments on chrUn_GL456378 in 0 seconds
  Converted 631 alignments on chrUn_GL456379 in 0 seconds
  Converted 431 alignments on chrUn_GL456383 in 0 seconds
  Converted 766 alignments on chrUn_GL456389 in 0 seconds
  Converted 737 alignments on chrUn_GL456385 in 0 seconds
  Converted 1,262 alignments on chrUn_GL456387 in 0 seconds
  Converted 436 alignments on chrUn_GL456390 in 0 seconds
  Converted 548 alignments on chrUn_GL456394 in 0 seconds
  Converted 1,232 alignments on chrUn_GL456392 in 0 seconds
  Converted 2,117 alignments on chrUn_GL456393 in 0 seconds
  Converted 2,463,763 alignments on chr17 in 21 seconds
  Converted 427 alignments on chrUn_GL456396 in 0 seconds
  Converted 57,141 alignments on chrUn_JH584304 in 0 seconds
  Converted 1,952,986 alignments on chrX in 17 seconds
  Converted 2,286,956 alignments on chr18 in 20 seconds
  Converted 3,886,331 alignments on chrM in 31 seconds
Illegal division by zero at /usr/local/bin/bam2wig.pl line 1805.
 Finished converting alignments in 1.567 minutes
 Normalizing depth based on  total counted alignments
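
The last log line shows an empty total, which then reaches the division. A defensive check before computing the reads-per-million factor would turn this into a clear error message. A Python sketch, not the script's Perl; `rpm_scale_factor` is a hypothetical name:

```python
def rpm_scale_factor(total_alignments):
    """Return the reads-per-million scaling factor, refusing an empty or zero total."""
    if total_alignments in (None, ""):
        raise ValueError("total alignment count is missing")
    total = int(total_alignments)
    if total <= 0:
        raise ValueError(f"total alignment count must be positive, got {total}")
    return 1_000_000 / total
```

This only reports the problem earlier; it does not explain why the total was empty in the first place.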

Issue on installing Biotoolbox

Hello,

I am trying to install Bio::ToolBox.
It looks like it is now downloaded; however, when I run: perl ucsc_table2gff3.pl -f refgene -d hg19
it gives an error:

Can't locate Bio/ToolBox.pm in @INC (you may need to install the Bio::ToolBox module) (@INC contains: /home/marissa/miniconda3/lib/perl5/5.32/site_perl /home/marissa/miniconda3/lib/perl5/site_perl /home/marissa/miniconda3/lib/perl5/5.32/vendor_perl /home/marissa/miniconda3/lib/perl5/vendor_perl /home/marissa/miniconda3/lib/perl5/5.32/core_perl /home/marissa/miniconda3/lib/perl5/core_perl .) at ucsc_table2gff3.pl line 9.
BEGIN failed--compilation aborted at ucsc_table2gff3.pl line 9.

Can you please tell me what is wrong ?
Thank you!

[feature request] allow reading SAM data from STDIN in bam2wig.pl

Could you please add an option of reading SAM or BAM file from standard input?
This would help running bam2wig.pl at the end of shell pipe. Bedtools and 
Samtools are able to work in pipes. Why not bam2wig.pl?

Being able to run something like "somecommand | samtools view -Sb - | 
bam2wig.pl --coverage --in -" would be great.

Original issue reported on code.google.com by [email protected] on 16 Dec 2014 at 6:53
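
The "--in -" convention requested above is usually implemented by treating "-" as standard input. A Python sketch of that convention for SAM text only (reading BAM from a pipe would need a BAM library); `open_input` and `count_per_reference` are hypothetical names and say nothing about how bam2wig.pl would implement it:

```python
import sys
from collections import Counter


def open_input(path):
    """Return standard input when path is '-', otherwise open the named file."""
    return sys.stdin if path == "-" else open(path)


def count_per_reference(sam_lines):
    """Count mapped alignments per reference name from SAM text lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x4 or rname == "*":
            continue  # unmapped read
        counts[rname] += 1
    return counts
```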

[feature request] bam2wig.pl: Allow temporary files to be written to a temporary directory

bam2wig.pl produces temporary files (*temp.bin) in current directory. If the genome assembly has lots of contigs, there are lots of temporary files as well.

Issues with current behaviour:

  • Before bam2wig.pl finishes the directory is filled with temp files.
  • If bam2wig.pl is killed, these files will remain
  • User might have a much faster local scratch filesystem
  • tmp/scratch space normally would be cleaned of old files automatically

Proposed solutions:

  • If defined, use $TMPDIR to store temporary files. If $TMPDIR is not set use current directory.
  • Or allow user to explicitly specify the temporary directory. This might be better, because /tmp can be too small to hold wig files.
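
The two proposals can be combined: an explicit option takes precedence, then $TMPDIR, then the current directory. A Python sketch of that lookup order; the function names are hypothetical, and only the "bam2wigTEMP_" prefix is taken from the log output quoted in other reports:

```python
import os
import tempfile


def choose_temp_parent(explicit_dir=None, environ=None):
    """Pick where temp files go: explicit option, then $TMPDIR, then the current directory."""
    environ = os.environ if environ is None else environ
    return explicit_dir or environ.get("TMPDIR") or os.getcwd()


def make_work_dir(explicit_dir=None):
    """Create a private working directory that is removed when the 'with' block exits."""
    parent = choose_temp_parent(explicit_dir)
    return tempfile.TemporaryDirectory(prefix="bam2wigTEMP_", dir=parent)
```

Used as `with make_work_dir("/path/to/scratch") as work_dir: ...`, the directory is removed on normal exit and on exceptions, but not if the process is killed with SIGKILL.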

bar2wig.pl usage corresponds to bam2gff script

What is the command line and arguments you are using?

>$ ./bar2wig.pl


What is the error generated, if there is one?

> This program will convert bar files to a wig file
>Usage:
>    /!\ bam2gff_bed.pl [--options...] <filename>
>      Options:
>      --in <filename>
>      --bed | --gff | --bigbed | --bb
>      --pe
>      --type <text>
>      --source <text>
>      --randstr
>      --out <filename> 
>      --bbapp </path/to/bedToBigBed>
>      --gz
>      --version
>      --help


What is the expected output? What do you see instead?


What version of the script are you using? Use the --version option.

>$ ./bar2wig.pl --version
>
> This program will convert bar files to a wig file
> Biotoolbox script bar2wig.pl, version 1.14


Can you provide a small sample file that will reproduce the error? What
steps are required to reproduce the error?


Please provide any additional information below.

Original issue reported on code.google.com by guillaume.tiberi on 4 Feb 2014 at 1:07

Bug in change_chr_prefix.pl

When converting a sam/bam file with change_chr_prefix.pl only the header 
changes. This is because the change_name() result is not written to the file.

Line 378 in change_chr_prefix.pl should be:
$newline = join("\t", @data);


Best,

Pascal

Original issue reported on code.google.com by [email protected] on 20 Nov 2011 at 4:20
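
The bug described above is a general pattern: the fields are modified, but the line that gets written is still the original one. A Python sketch of the corrected flow for a SAM alignment line, rejoining the fields after renaming; `rename_sam_line` and `add_chr` are hypothetical names, and the choice to rename both RNAME and RNEXT is an assumption rather than the script's documented behaviour:

```python
def rename_sam_line(line, change_name):
    """Apply change_name to RNAME and RNEXT of a SAM alignment line and rebuild the line."""
    if line.startswith("@"):
        return line  # headers are handled elsewhere
    fields = line.rstrip("\n").split("\t")
    fields[2] = change_name(fields[2])
    if fields[6] not in ("=", "*"):
        fields[6] = change_name(fields[6])
    # The step the report says was missing: rejoin the modified fields.
    return "\t".join(fields) + "\n"


def add_chr(name):
    """Example rule: add a 'chr' prefix unless the name is '*' or already prefixed."""
    return name if name == "*" or name.startswith("chr") else "chr" + name
```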

missing "use Statistics::Descriptive" and shift value being calculated despite --shiftval option

When running bam2wig.pl I am getting the following error:

Can't locate object method "new" via package "Statistics::Descriptive::Sparse" (perhaps you forgot to load "Statistics::Descriptive::Sparse"?) at /usr/local/bin/bam2wig.pl line 887.

The output of the program suggests the shift value is being calculated even though --shiftval is explicitly given on the command line.

Command line:
bam2wig.pl --position start --nope --shift --shiftval 4 --in in.bam --cpu 4 --nogz --out out.wig

This program will convert bam alignments to wig data
Calculating 3' shift value...
sampling the top coverage regions on the largest 4 chromosomes...
Forking into children for parallel scanning
Scanning chr7, chr5, chr3, chr4
0 regions found with a correlative shift in 0.2 minutes
Can't locate object method "new" via package "Statistics::Descriptive::Sparse" (perhaps you forgot to load "Statistics::Descriptive::Sparse"?) at /usr/local/bin/bam2wig.pl line 737.

(This file contains reads only from ONE strand, 0 regions is therefore expected)

However this line succeeded:
perl -we 'require Statistics::Descriptive; my $stat = Statistics::Descriptive::Sparse->new;'

Perl v5.26.0
Debian testing
