brass's Introduction

BRASS

Breakpoints via assembly

BRASS analyses one or more related BAM files of paired-end sequencing to determine potential rearrangement breakpoints.

There are several stages, the main components being:

  1. Collect read-pairs where both ends map but are NOT marked as properly-paired.
  2. Perform grouping based on mapped locations
  3. Filter
  4. Run assembly
  5. Annotate with GRASS
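
As a rough illustration of step 1, the selection can be expressed in terms of SAM FLAG bits. This is a sketch rather than BRASS's own code: is_candidate_flag is a hypothetical helper, written in POSIX shell, that accepts a FLAG value only when the read is paired, both ends are mapped and the pair is not marked as proper.

```shell
# Hypothetical helper (not part of BRASS): succeed when a SAM FLAG value
# describes a read whose pair has both ends mapped but is not properly paired.
# SAM flag bits: 0x1 paired, 0x2 proper pair, 0x4 read unmapped, 0x8 mate unmapped.
is_candidate_flag() {
  flag=$1
  [ $(( $flag & 1 )) -ne 0 ] || return 1   # must be a paired read
  [ $(( $flag & 2 )) -eq 0 ] || return 1   # must not be a proper pair
  [ $(( $flag & 4 )) -eq 0 ] || return 1   # the read itself must be mapped
  [ $(( $flag & 8 )) -eq 0 ] || return 1   # the mate must be mapped
  return 0
}
```

For example, is_candidate_flag 97 succeeds (paired, mate on the reverse strand, first in pair), while is_candidate_flag 99 fails because the proper-pair bit is set.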

Quick installation

./setup.sh path_to_install_to

Skipping all external dependencies

To install only the core of BRASS (C and Perl wrappers) and use the versions of external tools already on your path, run:

./setup.sh path_to_install_to 1

Skipping exonerate install

A central install of exonerate 2.2.0 via a package manager is adequate. To skip only the exonerate install, run:

./setup.sh path_to_install_to 2

Pre-requisites

  • The C++ code (within this package) requires the pstreams/pstream.h header (and associated development libraries). This is not handled by the setup.sh script.

Perl packages:

Each of these has its own dependencies.

R packages

A large number of R packages are required to run BRASS. To facilitate the install process there is a script Rsupport/libInstall.R that can be run to build these for you. See this file for the list of packages.

Alternatively you can run:

cd Rsupport
./setupR.sh path_to_install_to

Appending 1 to the command will request a complete local build of R (3.1.3).

Other tools that need to be in path

  • FASTA
    • If FASTA is not installed, BRASS will fail because ssearch36 is absent.
    • ssearch36 is the only program required.
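
A quick pre-flight check can catch a missing tool before a long run starts. The following is a minimal sketch, not something shipped with BRASS; require_tools is a hypothetical helper that only tests whether the named programs are visible on PATH.

```shell
# Hypothetical helper (not part of BRASS): report tools missing from PATH.
# Prints each missing name to stderr and returns non-zero if any is absent.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing from PATH: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It could be called as require_tools ssearch36 || exit 1 at the top of a wrapper script.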

Tools installed by setup.sh

Please use setup.sh to install these dependencies. Setting the environment variable CGP_PERLLIBS allows you to append to PERL5LIB during install. Without this, all dependencies are installed into the target area. setup.sh will not use PERL5LIB directly.

Please be aware that this expects basic C compilation libraries and tools to be available.

Running BRASS

This package includes a reference implementation which handles all of the linking together of steps.

Please see the -h and -m options of brass.pl for full usage information.

It can be run in a couple of ways:

  1. Fire and forget
    • Execute on a single host with multiple cores (or 1 if that's all you have)
    • Some efficiency overhead as some steps aren't parallel
  2. Farm style
    • Requires 2 extra parameters in the initial command
    • See -help for further details

Input

Initial mapping

BRASS has primarily been written to work with BWA mapped data. You are likely to get the most useful output from BWA-mem.

Library quality

Please be aware that paired-end libraries where properly-paired reads are heavily overlapped are unlikely to produce good results.

Additional mapping information

BRASS requires accurate information regarding the insert size distribution and expects to find a *.bam.bas file co-located with each *.bam file. These can be generated by the bam_stats program included in the PCAP-core project. If you use bwa_mem.pl (same repository) to map your data, this file is generated automatically for you.
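
Because a missing *.bam.bas file can surface later as an unrelated-looking failure (see the issue on this page about brassI_np_in.pl), it may be worth checking for the files up front. The sketch below is an illustration only; check_bas_files is a hypothetical helper that tests for a non-empty <bam>.bas next to each BAM path it is given.

```shell
# Hypothetical helper (not part of BRASS): check that every BAM passed as an
# argument has a non-empty, co-located <bam>.bas file.
check_bas_files() {
  status=0
  for bam in "$@"; do
    if [ ! -s "$bam.bas" ]; then
      echo "File not found or empty: $bam.bas" >&2
      status=1
    fi
  done
  return "$status"
}
```

A call such as check_bas_files tumor.bam normal.bam || exit 1 placed before brass.pl would stop early with an explicit message.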

Docker, Singularity and Dockstore

There are pre-built images containing this codebase on quay.io.

  • brass
    • Just this repo and any dependencies.
  • dockstore-cgpwgs
    • Contains additional tools for WGS analysis
    • This was primarily designed for use with dockstore.org but can be used as a normal container

The docker images are known to work correctly after import into a singularity image.

LICENCE

Copyright (c) 2014-2019 Genome Research Ltd.

Author: CASM/Cancer IT <[email protected]>

This file is part of BRASS.

BRASS is free software: you can redistribute it and/or modify it under
the terms of the GNU Affero General Public License as published by the Free
Software Foundation; either version 3 of the License, or (at your option) any
later version.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more
details.

You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.

1. The usage of a range of years within a copyright statement contained within
this distribution should be interpreted as being equivalent to a list of years
including the first and last year specified and all consecutive years between
them. For example, a copyright statement that reads ‘Copyright (c) 2005, 2007-
2009, 2011-2012’ should be interpreted as being identical to a statement that
reads ‘Copyright (c) 2005, 2007, 2008, 2009, 2011, 2012’ and a copyright
statement that reads ‘Copyright (c) 2005-2012’ should be interpreted as being
identical to a statement that reads ‘Copyright (c) 2005, 2006, 2007, 2008,
2009, 2010, 2011, 2012’.

brass's People

Contributors

andymenzies, edawson, jmarshall, keiranmraine, sb43, yl3

brass's Issues

brass (c++): building failed, rearrgroup.cpp issues

When trying to build the last released version, ./setup.sh fails on "brass (c++)" step. When I try to run the commands from setup.sh brass block manually, I constantly get errors:

make -C cansam
requires boost/version.hpp in the same folder, so I have copied the boost folder in there
make -C c++
reports a variety of mistakes in rearrgroup.cpp and numerous warnings in alignment.h, such as:

c++ -Wall -Wextra -g -I../cansam -O2   -c -o rearrgroup.o rearrgroup.cpp
rearrgroup.cpp:172:7: error: invalid operands to binary expression ('std::ostream' (aka 'basic_ostream<char>') and 'const char *')
  out << aln.rname_c_str() << '\t' << aln.strand_char() << '\t'
  ~~~ ^  ~~~~~~~~~~~~~~~~~
../cansam/cansam/sam/header.h:340:15: note: candidate function not viable: no known conversion from 'const char *' to 'const sam::header' for 2nd argument
std::ostream& operator<< (std::ostream& stream, const header& header);
              ^
../cansam/cansam/sam/header.h:352:15: note: candidate function not viable: no known conversion from 'const char *' to 'const header::tagfield' for 2nd argument
std::ostream& operator<< (std::ostream& stream, const header::tagfield& field);
              ^
../cansam/cansam/sam/header.h:355:15: note: candidate function not viable: no known conversion from 'const char *' to 'header::const_iterator' for 2nd argument
std::ostream& operator<< (std::ostream& stream, header::const_iterator it);
              ^
../cansam/cansam/sam/header.h:611:15: note: candidate function not viable: no known conversion from 'const char *' to 'const sam::collection' for 2nd argument
std::ostream& operator<< (std::ostream& stream, const collection& headers);
              ^
../cansam/cansam/sam/alignment.h:751:15: note: candidate function not viable: no known conversion from 'const char *' to 'const sam::alignment' for 2nd argument
std::ostream& operator<< (std::ostream& stream, const alignment& aln);
              ^
../cansam/cansam/sam/alignment.h:777:15: note: candidate function not viable: no known conversion from 'const char *' to 'const alignment::tagfield' for 2nd argument
std::ostream& operator<< (std::ostream& stream, const alignment::tagfield& aux);
              ^
../cansam/cansam/sam/alignment.h:785:15: note: candidate function not viable: no known conversion from 'const char *' to 'alignment::const_iterator' for 2nd argument
std::ostream& operator<< (std::ostream& stream, alignment::const_iterator it);

Please tell me how I can fix that.

depth parameter

I have tried to run brass.pl but get an error like:
"Option 'depth' has not been defined".
I am not sure what I should put there. Should it be a number, like "-d ###", or a file, like "-d file"? If it is a file, which file: the bas file? I have bas files generated in the same directory. Here is the command line:
brass.pl -o outfolder -t tumor.bam -n normal.bam -r genome.fa;
The bas files are named tumor.bam.bas and normal.bam.bas. Any suggestions would be appreciated.
Yonghong Wang

error in normal contamination step

Warning: pcf is not run for sample 1 on chromosome arm because all observations are missing. NA is returned.
Error in data.frame(rep(sampleid[i], nSeg), seg.chrom, seg.arm, pos.start, :
arguments imply differing number of rows: 1, 0
Calls: pcf -> data.frame
In addition: Warning message:
In numericChrom(chrom) : NAs introduced by coercion
Execution halted

This could be due to non-numeric chromosome names in GRCh38.

bedpe ann error

Good Morning:
I have been trying to use brass.pl recently but got an error at the step that adds annotation to the bedpe file. Here is the error I got:

$ more Sanger_CGP_Brass_Implement_grass.0.err
+ bash -c 'set -o pipefail; (cat /scratch/wangyong/dis/0A4HY5/tmpBrass/assemble/bedpe.* | sort -k1,1 -k 2,2n > /scratch/wangyong/dis/0A4HY5/0A4HY5_Tumor_vs_0A4HY5_Normal.assembled.bedpe)'
+ /usr/bin/perl /opt/wtsi-cgp/bin/grass.pl -genome_cache /data/CCRBioinfo/wangyh/brass_file/chr_vagrent.human.GRCh37.homo_sapiens_91_37.cache -ref /data/CCRBioinfo/wangyh/chr_genome.fa -species human -assembly GRCh37 -platform ILLUMINA -protocol WXS -tumour 0A4HY5_Tumor -normal 0A4HY5_Normal -file /scratch/wangyong/dis/0A4HY5/0A4HY5_Tumor_vs_0A4HY5_Normal.assembled.bedpe -add_header brassVersion=6.1.2
+ /usr/bin/perl /opt/wtsi-cgp/bin/grass.pl -genome_cache /data/CCRBioinfo/wangyh/brass_file/chr_vagrent.human.GRCh37.homo_sapiens_91_37.cache -ref /data/CCRBioinfo/wangyh/chr_genome.fa -species human -assembly GRCh37 -platform ILLUMINA -protocol WXS -tumour 0A4HY5_Tumor -normal 0A4HY5_Normal -file /scratch/wangyong/dis/0A4HY5/0A4HY5_Tumor_vs_0A4HY5_Normal.groups.clean.bedpe -add_header brassVersion=6.1.2
+ /usr/bin/perl /opt/wtsi-cgp/bin/combineResults.pl /scratch/wangyong/dis/0A4HY5/0A4HY5_Tumor_vs_0A4HY5_Normal_ann.groups.clean /scratch/wangyong/dis/0A4HY5/0A4HY5_Tumor_vs_0A4HY5_Normal_ann.assembled /scratch/wangyong/dis/0A4HY5/0A4HY5_Tumor_vs_0A4HY5_Normal.annot 2 0.75 DEFAULT
Can't open '/scratch/wangyong/dis/0A4HY5/0A4HY5_Tumor_vs_0A4HY5_Normal_ann.assembled.bedpe' for reading: 'No such file or directory' at /opt/wtsi-cgp/bin/combineResults.pl line 178
Command exited with non-zero status 255
0.39user 0.10system 0:00.53elapsed 93%CPU (0avgtext+0avgdata 27544maxresident)k
530inputs+296outputs (1major+17427minor)pagefaults 0swaps  

The command line I used is:

brass.pl -o 0A4HY5 -t 0A4HY5_Tumor.realigned.md.bam -n 0A4HY5_Normal.realigned.md.bam -d /data/CCRBioinfo/wangyh/brass_file/depth_final.bed -g /data/CCRBioinfo/wangyh/chr_genome.fa -s human -as GRCh37 -pr WXS -gc /data/CCRBioinfo/wangyh/brass_file/chr_vagrent.human.GRCh37.homo_sapiens_91_37.cache -vi /data/CCRBioinfo/wangyh/brass_file/viral.genomic.fa -mi /data/CCRBioinfo/wangyh/brass_file/all_bacteria.fa -b /data/CCRBioinfo/wangyh/brass_file/re_gcBins.bed.gz -cb /data/CCRBioinfo/wangyh/brass_file/cytoband.txt -ct /data/CCRBioinfo/wangyh/brass_file/chr_centTelo.tsv 

All the files listed under 0A4HY5 folder are:

0A4HY5_Tumor.insert_size_distr                                0A4HY5_Tumor_vs_0A4HY5_Normal.ngscn.abs_cn.bg
0A4HY5_Tumor_vs_0A4HY5_Normal.assembled.bedpe                 0A4HY5_Tumor_vs_0A4HY5_Normal.ngscn.abs_cn.bg.rg_cns
0A4HY5_Tumor_vs_0A4HY5_Normal.cn_filtered                     0A4HY5_Tumor_vs_0A4HY5_Normal.ngscn.segments.abs_cn.bg
0A4HY5_Tumor_vs_0A4HY5_Normal.groups                          0A4HY5_Tumor_vs_0A4HY5_Normal.r2
0A4HY5_Tumor_vs_0A4HY5_Normal.groups.clean.bedpe              0A4HY5_Tumor_vs_0A4HY5_Normal.r3
0A4HY5_Tumor_vs_0A4HY5_Normal.groups.filtered.bedpe           0A4HY5_Tumor_vs_0A4HY5_Normal.r4
0A4HY5_Tumor_vs_0A4HY5_Normal.groups.filtered.bedpenohead     0A4HY5_Tumor_vs_0A4HY5_Normal.r5
0A4HY5_Tumor_vs_0A4HY5_Normal.groups.filtered.bedpe.preclean  0A4HY5_Tumor_vs_0A4HY5_Normal.r5.scores
0A4HY5_Tumor_vs_0A4HY5_Normal.inversions.pdf                  0A4HY5_Tumor_vs_0A4HY5_Normal.r6
0A4HY5_Tumor_vs_0A4HY5_Normal.is_fb_artefact.txt              tmpBrass 

It seems to me that the program failed to add annotation information to the bedpe file, so the "ann.assembled.bedpe" file could not be generated. Would you please let me know why this happens and how to address the problem? Any suggestions would be much appreciated.
Best
Yonghong

Add Contributors file

Add details of historical contributors to original codebase

John Marshall - C++ code

Request: Docker installation file for distribution

This is a reckless and improper request, but is it possible to create a Docker file to properly install this software? Considering the dependencies (and the time spent between myself and my system admin installing BRASS), it may be useful for replicable science. Your users would be very appreciative :)

match_rg_patterns_to_library.pl could not pass the compilation

I have installed BRASS version 5.0.0 on my Linux computer.
When I run the match_rg_patterns_to_library.pl script, it fails to compile.
It seems the module "Genome.pm" has a bug.

It throws the following errors:
values on reference is experimental at BRASS-dev/perl/bin/../lib/Sanger/CGP/CopyNumber/Segment/Genome.pm line 473.
Compilation failed in require at ./match_rg_patterns_to_library.pl line 50.
BEGIN failed--compilation aborted at ./match_rg_patterns_to_library.pl line 50.

brass.pl cleanup step - ascat files

The ascat input files are currently assumed to be in the base of the output location. This is not necessarily the case and they should be pulled from the locations specified in the initial command line.

Error in get_abs_bkpts_from_clipped_reads.pl

I've run BRASS successfully on several samples, but recently encountered an error at the filter.abs_bkp step.

Contents of Sanger_CGP_Brass_Implement_filter.abs_bkp.err:

No high_end reads for record 255731
No high_end reads for record 255751
Resolving overlapping rg end ranges...
Could not properly separate the reads between the low ends of rearrangements 50230 and 50231!
Use of uninitialized value $clip_pos in numeric ge (>=) at /mnt/data/hfn/Tools/CGP/BRASS-dev/perl/bin/get_abs_bkpts_from_clipped_reads.pl line 469.
1148718.56user 66873.16system 348:59:13elapsed 96%CPU (0avgtext+0avgdata 16487284maxresident)k
1615627960inputs+11344outputs (17829major+37930363131minor)pagefaults 0swaps

Any suggestions would be appreciated
Thanks

Implement_group error

Hi!

I keep getting these two kinds of error:
one from Sanger_CGP_Brass_Implement_filter.rg_cns.sh and the other from Sanger_CGP_Brass_Implement_group.0.sh (see below).
Could anyone please help me here?

Thanks!

"/usr/bin/time /var/spool/ref/H_TQ-SEQ-073/res/tmpBrass/logs/Sanger_CGP_Brass_Implement_filter.rg_cns.sh 1> /var/spool/ref/H_TQ-SEQ-073/res/tmpBrass/logs/Sanger_CGP_Brass_Implement_filter.rg_cns.out 2> /var/spool/ref/H_TQ-SEQ-073/res/tmpBrass/logs/Sanger_CGP_Brass_Implement_filter.rg_cns.err" unexpectedly returned exit value 1 at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 270.
at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 268

"/usr/bin/time /var/spool/ref/H_TQ-SEQ-120/res/tmpBrass/logs/Sanger_CGP_Brass_Implement_group.0.sh 1> /var/spool/ref/H_TQ-SEQ-120/res/tmpBrass/logs/Sanger_CGP_Brass_Implement_group.0.out 2> /var/spool/ref/H_TQ-SEQ-120/res/tmpBrass/logs/Sanger_CGP_Brass_Implement_group.0.err" unexpectedly returned exit value 1 at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 270.
at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 268

"/usr/bin/time /var/spool/ref/H_TQ-SEQ-522/res/tmpBrass/logs/Sanger_CGP_Brass_Implement_group.0.sh 1> /var/spool/ref/H_TQ-SEQ-522/res/tmpBrass/logs/Sanger_CGP_Brass_Implement_group.0.out 2> /var/spool/ref/H_TQ-SEQ-522/res/tmpBrass/logs/Sanger_CGP_Brass_Implement_group.0.err" unexpectedly returned exit value 1 at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 270.
at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 268

Error installing BRASS-c++ on a clean Ubuntu Xenial machine

root@4255178edeb1:/home/ubuntu/BRASS# ./setup.sh install
(...)
g++  -o test/runtests test/runtests.o test/alignment.o test/header.o test/sam.o test/wire.o test/interval.o libcansam.a -lz
make: Leaving directory '/home/ubuntu/BRASS/cansam'
make: Entering directory '/home/ubuntu/BRASS/c++'
g++ -Wall -Wextra -g -I../cansam -O2   -c -o augment-bam.o augment-bam.cpp
augment-bam.cpp: In function 'int main(int, char**)':
augment-bam.cpp:158:46: error: 'BRASS_VERSION' was not declared in this scope
       std::cout << "augment-bam (Brass) " << BRASS_VERSION << "\n" << copyright;
                                              ^
In file included from augment-bam.cpp:49:0:
../cansam/cansam/sam/stream.h: At global scope:
../cansam/cansam/sam/stream.h:45:31: warning: 'sam::sam_format' defined but not used [-Wunused-variable]
 const std::ios_base::openmode sam_format
                               ^
<builtin>: recipe for target 'augment-bam.o' failed
make: *** [augment-bam.o] Error 1
make: Leaving directory '/home/ubuntu/BRASS/c++'
 failed.  See setup.log file for error messages. Failed to build brass (c++).
    Please check INSTALL file for items that should be installed by a package manager

Unreported Blat Dependency

Hi Guys,

I have noted that there is now a dependency on the blat utility. It might be worth mentioning on the front page as this utility has a dramatically different licence agreement to the licence under which BRASS itself is being released.

As it is automatically downloaded as part of the setup.sh script mentioning up front might save people from being torpedoed.

:)

BRASS filter generation fails with an uninformative error when *.bas not present

I was working through generating a PON and got a strange error when running the brassI_np_in.pl script:

> brassI_np_in.pl `pwd` 1 /dat/redacted.20k.bam
Starting: /opt/wtsi-cgp/bin/bamcollate2 outputformat=sam exclude=PROPER_PAIR,UNMAP,MUNMAP,SECONDARY,QCFAIL,DUP,SUPPLEMENTARY mapqthres=6 classes=F,F2 T=/home/ubuntu/redacted/tmpMap/bamcollate2 filename=/dat/redacted.20k.bam | /usr/bin/perl /opt/wtsi-cgp/bin/brassI_prep_bam.pl -b /dat/redacted.20k.bam.bas -np | /opt/wtsi-cgp/bin/bamsort inputformat=sam verbose=0 index=1 md5=1 tmpfile=/home/ubuntu/redacted/tmpMap/bamsort md5filename=/home/ubuntu/redacted/redacted.brm.bam.md5 indexfilename=/home/ubuntu/redacted/redacted.brm.bam.bai O=/home/ubuntu/redacted/redacted.brm.bam
ScramDecoder::readAlignment(): failed to read alignment without reaching EOF

/opt/wtsi-cgp/bin/../lib/libmaus2.so.2(libmaus2::util::StackTrace::StackTrace()+0x54)[0x7f42b94b2a54]
/opt/wtsi-cgp/bin/bamsort(libmaus2::exception::LibMausException::LibMausException()+0x20)[0x454550]
/opt/wtsi-cgp/bin/bamsort(libmaus2::bambam::ScramDecoder::readAlignmentInternal(bool)+0x2b3)[0x47b7b3]
/opt/wtsi-cgp/bin/bamsort(bamsort(libmaus2::util::ArgInfo const&)+0x2519)[0x44e309]
/opt/wtsi-cgp/bin/bamsort(main+0x177e)[0x44724e]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f42b7da2f45]
/opt/wtsi-cgp/bin/bamsort()[0x44819f]

WARNING: SAM header designates more than one PG tree root by PP tags.
WARNING: PG line with id bwamem.2 has multiple children referencing it by PP tags.
WARNING: PG line with id bwamem.3 has multiple children referencing it by PP tags.
WARNING: PG line with id bwamem.4 has multiple children referencing it by PP tags.
WARNING: PG line with id bwamem.8 has multiple children referencing it by PP tags.

I grabbed the command line invocations from the first line, started running them individually and found that the second command fails because there's no .bas file:

/opt/wtsi-cgp/bin/bamcollate2 outputformat=sam exclude=PROPER_PAIR,UNMAP,MUNMAP,SECONDARY,QCFAIL,DUP,SUPPLEMENTARY mapqthres=6 classes=F,F2 T=/home/ubuntu/redacted/tmpMap/bamcollate2 filename=/dat/redacted.20k.bam | /usr/bin/perl /opt/wtsi-cgp/bin/brassI_prep_bam.pl -b /dat/redacted.20k.bam.bas -np
File not found: /dat/redacted.20k.bam.bas

Creating a .bas file fixes the issue.

Is it possible to propagate the underlying error better?

pstream.h is missing

g++ -Wall -Wextra -g -I../cansam -O2   -c -o augment-bam.o augment-bam.cpp
augment-bam.cpp:32:30: error: pstreams/pstream.h: No such file or directory

Am I missing a package which provides it?

Blat path not exist

Good Morning:
I was running brass.pl with all the necessary files (hopefully). It worked fine for the first few hours but then quit. The log file shows:

Running BLAT on reads against NCBI bacterial database...
FAILED: /opt/wtsi-cgp/bin/blat .....

It seems that the program cannot find "blat" at that path, and there is no such path on our system. Could you please let me know how to address this issue, or how to change the path manually so that the program can find the blat installed on our system?
Thanks for the help

Homozygous deletion not detected by BRASS

Dear all,

Hoping you are having a great week. I am running BRASS to identify structural variants in paired tumour and normal WGS data. Before running BRASS, I used IGV to check a region of around 86 kb on chr14 that I already knew was deleted in both alleles. No reads mapped to this region in the tumour sample, but it was fully covered at high depth in the germline sample. Unfortunately, it was not annotated by BRASS/GRASS in the vcf file. Do you know why this is happening?

Thanks in advance for your help!

Yurany

Option 'normals' has not been defined

What does this error mean?

$ brass-group -I extremedepth.bed -F brassRepeats.bed.gz brass/Adcc11T_Adcc11N/tmpBrass/Adcc11T.brm.bam brass/Adcc11T_Adcc11N/tmpBrass/Adcc11N.brm.bam | perl-static brassI_pre_filter.pl -i - -t Adcc11T -o brass/Adcc11T_Adcc11N/Adcc11T_vs_Adcc11N.groups
Option 'normals' has not been defined.

CN filtering locked to hg19

Changes required to correctly set this value when calling this package.

This also limits the possibility of moving to new builds. It was detected when testing with GRCh38 and has been reported to the corresponding author:

Question: Supporting grch38 and making the species agnostic pcf function in
copynumber package

Would you please include grch38 and other genomes into copynumber package or
make it species agnostic by supplying user defined chromosome bands.

Currently pcf function only supports following genome builds.

if (!assembly %in% c("hg19", "hg18", "hg17", "hg16", "mm7","mm8", "mm9")) {
  stop("assembly must be one of hg19, hg18, hg17 or hg16",call. = FALSE)
}

"keys on reference" warning prevents installation

Installing BRASS 6.0.5 using Perl 5.22.0 (using the setup.sh script), I received this error:

PERL_DL_NONLAZY=1 "/shared/ucl/apps/perl/perlbrewroot/perls/perl-5.22.0/bin/perl" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef *Test::Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/1_pm_compile.t ....... 3/? 
#   Failed test 'use Bio::Brass::Merge;'
#   at t/1_pm_compile.t line 67.
#     Tried to use 'Bio::Brass::Merge'.
#     Error:  keys on reference is experimental at /scratch/scratch/uccaiki/CancerIT-login05/BRASS-6.0.5/perl/blib/lib/Bio/Brass/Merge.pm line 159.
# Compilation failed in require at t/1_pm_compile.t line 67.
# BEGIN failed--compilation aborted at t/1_pm_compile.t line 67.
Bailout called.  Further testing stopped:  Unable to 'use' module Bio::Brass::Merge
# Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 255 just after 6.
FAILED--Further testing stopped: Unable to 'use' module Bio::Brass::Merge

While this is normally a warning, it's been upgraded to an error by use warnings FATAL => 'all';.

If I'm reading this correctly, the Perldoc on the keys function suggests this should be removed in any case, since imminent versions of Perl will not let this work:

Starting with Perl 5.14, an experimental feature allowed keys to take a scalar expression. This experiment has been deemed unsuccessful, and was removed as of Perl 5.24.

The relevant line is:

my $most_reads = (sort {$a<=>$b} keys $best{$most_samples})[-1];

I think this is fixable just by using %{} to convert the reference to a proper hash?

my $most_reads = (sort {$a<=>$b} keys %{ $best{$most_samples} })[-1];

This seemed to remove the warning/error and let it install, anyway: I haven't checked the result.
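
Since the same pattern is reported in more than one file on this page, a rough search of the Perl sources may help locate other occurrences. The following is only a sketch: find_keys_on_reference is a hypothetical helper, and the pattern is a heuristic that flags keys, values or each followed directly by a scalar, so it can produce false positives (for example in comments).

```shell
# Hypothetical helper (not part of BRASS): list lines under DIR where
# keys/values/each is applied directly to a scalar (e.g. `keys $ref`),
# the construct removed in Perl 5.24. Heuristic only; review each hit.
find_keys_on_reference() {
  grep -rnE '(^|[^[:alnum:]_])(keys|values|each)[[:space:]]+\$' "$1"
}
```

Running find_keys_on_reference perl/ from the top of a source checkout (the directory name is taken from the paths in the error messages above) prints file, line number and matching line for each hit.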

Clarification: Penalty for finding aberrant pairs for PON separately vs jointly

Due to wacky limitations on the way the cloud analysis system I'm using works, it's impossible for me to run the first step in PON filter generation (brassI_np_in.pl) jointly across all samples at once. At best I could run subsets of maybe 5-10 out of a total 393 samples.

Is there a penalty to doing this step separately for each normal BAM? My intuition is that this stage is just collecting aberrant read pairs, which should be independent across samples. My understanding of the wiki page is that this is the case since the next stage is to actually merge the brm.bam files.

building 'brass/c++' on Mac OS

Sorry for posting multiple issues. I managed to solve everything, but I thought it could help others who try to do the same. I am trying to build BRASS on OS X El Capitan 10.11.4.
The problem arose while trying to build 'brass' with sudo ./setup.sh /installation_path.
make -C cansam works after adding pstreams/pstream.h (it still wants it to be in the same folder although it's not present in the archive that setup.sh unzips)
make -C c++ goes fine until g++ -Wall -Wextra -g -I../cansam -O2 -c -o rearrgroup.o rearrgroup.cpp, where the default Mac compiler clang++ reports multiple errors of this type:

error: invalid operands to binary expression

After that I could only proceed by switching to non-Mac g++ compiler, and was still getting multiple errors of this type:

Undefined symbols for architecture x86_64

Then I tried copying all the dependencies from brass-group.cpp and rearrgroup.cpp into feature.h and rearrgroup.h, and after that the whole compilation of c++ part went fine.
So I think it may make sense to copy these dependencies into .h files from the very beginning, if installing this package on Mac.

perl test failed

Compilation test failed. Looks similar to the earlier one.

perl -c bin/brassI_prep_bam.pl
Type of arg 1 to keys must be hash (not hash element) at bin/brassI_prep_bam.pl line 185, near "};"
bin/brassI_prep_bam.pl had compilation errors.

Line: 185

      @bas_rgs = keys $bas_ob->{'_data'};

error and quit for brass run

I was running brass.pl but the process quit with an error like:

"/usr/bin/time teem/tmpBrass/logs/Sanger_CGP_Brass_Implement_input.1.sh 1> teem/tmpBrass/logs/Sanger_CGP_Brass_Implement_input.1.out 2> teem/tmpBrass/logs/Sanger_CGP_Brass_Implement_input.1.err" unexpectedly returned exit value 1 at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 270.
at /opt/wtsi-cgp/lib/perl5/PCAP/Threaded.pm line 268

my command line is like:

brass.pl -o teem -t Tumor.realigned.md.bam -n Normal.realigned.md.bam -d depth_final.bed -g chr_genome.fa -s human -as GRCh37 -pr WGS -gc vagrent.human.GRCh37.homo_sapiens_91_37.cache.gz -vi viral.genomic.fa -mi all_bacteria.fa -b gcBins.bed -cb cytoband.txt -ct centTelo.tsv

for the sh file generated, seems it added option like -p T -i:
#!/bin/bash
set -eux
bash -c 'set -o pipefail; /opt/wtsi-cgp/bin/samtools view -F 3854 -q 6 -u Tumor.realigned.md.bam | /opt/wtsi-cgp/bin/bedtools intersect -ubam -v -abam stdin -b depth_final.bed | /opt/wtsi-cgp/biobambam2/bin/bamcollate2 outputformat=sam exclude= classes=F,F2 T=/teem/tmpBrass/bamcollate2_1 | /usr/bin/perl /opt/wtsi-cgp/bin/brassI_prep_bam.pl -b Tumor.realigned.md.bam.bas -p T -i | /opt/wtsi-cgp/biobambam2/bin/bamsort tmpfile=/teem/tmpBrass/bamsort_1 inputformat=sam verbose=0 index=1 md5=1 md5filename=/teem/tmpBrass/Tumor.brm.bam.md5 indexfilename=/teem/tmpBrass/Tumor.brm.bam.bai O=/scratch/wangyong/all/teem/tmpBrass/Tumor.brm.bam'

The error file is:

+ bash -c 'set -o pipefail; /opt/wtsi-cgp/bin/samtools view -F 3854 -q 6 -u Tumor.realigned.md.bam | /opt/wtsi-cgp/bin/bedtools intersect -ubam -v -abam stdin -b /depth_final.bed | /opt/wtsi-cgp/biobambam2/bin/bamcollate2 outputformat=sam exclude= classes=F,F2 T=/teem/tmpBrass/bamcollate2_1 | /usr/bin/perl /opt/wtsi-cgp/bin/brassI_prep_bam.pl -b /Tumor.realigned.md.bam.bas -p T -i | /opt/wtsi-cgp/biobambam2/bin/bamsort tmpfile=/teem/tmpBrass/bamsort_1 inputformat=sam verbose=0 index=1 md5=1 md5filename=/teem/tmpBrass/Tumor.brm.bam.md5 indexfilename=/teem/tmpBrass/Tumor.brm.bam.bai O=teem/tmpBrass/Tumor.brm.bam'
Option i requires an argument
ScramDecoder::readAlignment(): failed to read alignment without reaching EOF

/opt/wtsi-cgp/biobambam2/bin/../lib/libmaus2.so.2(libmaus2::util::StackTrace::StackTrace()+0x4c)[0x2aaaaadb2e3c]
/opt/wtsi-cgp/biobambam2/bin/bamsort(libmaus2::exception::LibMausException::LibMausException()+0x20)[0x420410]
/opt/wtsi-cgp/biobambam2/bin/bamsort()[0x448f53]
/opt/wtsi-cgp/biobambam2/bin/bamsort()[0x4191a1]
/opt/wtsi-cgp/biobambam2/bin/bamsort()[0x4124d5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x2aaaae52e830]
/opt/wtsi-cgp/biobambam2/bin/bamsort()[0x412be2]

WARNING: SAM header designates more than one PG tree root by PP tags.
Command exited with non-zero status 1
1.72user 0.26system 0:01.43elapsed 138%CPU (0avgtext+0avgdata 45492maxresident)k
11440inputs+32outputs (88major+12763minor)pagefaults 0swaps

It shows "Option i requires an argument". I am not sure how this option gets into the command line or how to solve the problem. Thanks for your help. By the way, to make the content above less crowded, I have manually removed the long absolute paths from most (if not all) of my input files.

regards
Yonghong

Values for optional field not being handled

Need to handle absence of sample stats file from ascatNgs:

        -sampstat  -ss  ASCAT sample statistics file or file containing
                          NormalContamination 0.XXXXX [0.25]
                          Ploidy X.XXX [2.0]

If the file is absent, the indicated default values should be applied.
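
The fallback described above can be sketched in shell. This is an illustration of the requested behaviour, not the fix that went into BRASS; sampstat_value is a hypothetical helper, and it assumes the file holds whitespace-separated "key value" lines as shown in the usage text.

```shell
# Hypothetical helper (not part of BRASS): print the value that follows KEY
# in FILE, or DEFAULT when the file is missing, unreadable or lacks the key.
# Assumes one "Key value" pair per line, as in the usage text above.
sampstat_value() {
  key=$1
  default=$2
  file=$3
  value=""
  if [ -n "$file" ] && [ -r "$file" ]; then
    value=$(awk -v k="$key" '$1 == k { print $2; exit }' "$file")
  fi
  if [ -n "$value" ]; then
    echo "$value"
  else
    echo "$default"
  fi
}
```

With the documented defaults this would be called as sampstat_value NormalContamination 0.25 "$file" and sampstat_value Ploidy 2.0 "$file".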

pstreams/pstream.h: No such file or directory

I don't have root access and I am using the build from here https://github.com/cancerit/BRASS/archive/v4.0.5.tar.gz.
I cloned pstreams using git clone git://git.code.sf.net/p/pstreams/code pstreams in /path/to/BRASS-4.0.5/ because setup.log says it is searching for pstreams/pstream.h. But it is still unable to find the pstream.h file.

make: Entering directory `/ifs/work/leukgen/opt/BRASS-4.0.5/c++'
g++ -Wall -Wextra -g -I../cansam -O2   -c -o augment-bam.o augment-bam.cpp
augment-bam.cpp:44:30: error: pstreams/pstream.h: No such file or directory
augment-bam.cpp: In function ‘int main(int, char**)’:
augment-bam.cpp:197: error: ‘redi’ has not been declared
augment-bam.cpp:197: error: expected ‘;’ before ‘processbuf’
augment-bam.cpp:198: error: ‘processbuf’ was not declared in this scope
../cansam/cansam/sam/stream.h: At global scope:
../cansam/cansam/sam/stream.h:45: warning: ‘sam::sam_format’ defined but not used
make: *** [augment-bam.o] Error 1
make: Leaving directory `/ifs/work/leukgen/opt/BRASS-4.0.5/c++'

bottlenecks: investigate

There are a couple of bottlenecks in the filter step: abs_bkp and remap_micro.

These are only a problem in very noisy samples (e.g. 400k+ events on entry to abs_bkp), but they can push the runtime into weeks or months, so they need to be investigated.

It is likely that the filter step needs to be broken up, as its individual elements have variable resource requirements in this case.

ssearch36 dependency not documented or installed

Need to add install and dependency documentation for ssearch36.

Need this:

curl -L -o tmp.tar.gz --retry 10 https://github.com/wrpearson/fasta36/releases/download/v36.3.8d_13Apr16/fasta-36.3.8d-linux64.tar.gz
mkdir -p /tmp/downloads/fasta
tar -C /tmp/downloads/fasta --strip-components 2 -zxf tmp.tar.gz
cp /tmp/downloads/fasta/bin/ssearch36 $OPT/bin/.
rm -rf /tmp/downloads/fasta

Where the user/setup script should set $OPT to the base of their install area.
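Until the install step is documented, a pre-flight check can report a missing `ssearch36` before a long run fails. This is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the tools from the list that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(["ssearch36"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
```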

Brass produces output but returns code 255

I ran brass.pl on the hcc test data provided here and the CGP bundle for BRASS >= v6. I get the following files back:

HCC1143.insert_size_distr 
HCC1143_vs_HCC1143_BL.inversions.pdf
HCC1143_vs_HCC1143_BL.r2
HCC1143_vs_HCC1143_BL.groups
HCC1143_vs_HCC1143_BL.is_fb_artefact.txt
HCC1143_vs_HCC1143_BL.r3
HCC1143_vs_HCC1143_BL.groups.filtered.bedpe
HCC1143_vs_HCC1143_BL.ngscn.abs_cn.bg
HCC1143_vs_HCC1143_BL.r4
HCC1143_vs_HCC1143_BL.groups.filtered.bedpe.preclean
HCC1143_vs_HCC1143_BL.ngscn.diagnostic_plots.pdf
tmpBrass
HCC1143_vs_HCC1143_BL.groups.filtered.bedpenohead
HCC1143_vs_HCC1143_BL.ngscn.segments.abs_cn.bg

but the return code that gets sent to the workflow engine is 255, indicating some sort of failure.

Are there any docs I missed on the correct full set of outputs? I'm having trouble debugging what failed because there's no stdout/stderr output.
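When nothing reaches stdout/stderr, the per-step logs are one place to look. The sketch below collects the last few lines of every non-empty `*.err` file in a log directory. It assumes the logs live under `tmpBrass/logs`, which matches the path quoted in another report on this page but may differ between versions; `tail_error_logs` is a hypothetical helper.

```python
from pathlib import Path
from typing import Dict, List


def tail_error_logs(log_dir: str, n: int = 5) -> Dict[str, List[str]]:
    """Map each non-empty *.err file in log_dir to its last n lines."""
    tails: Dict[str, List[str]] = {}
    for err_file in sorted(Path(log_dir).glob("*.err")):
        lines = err_file.read_text(errors="replace").splitlines()
        if lines:
            tails[err_file.name] = lines[-n:]
    return tails


if __name__ == "__main__":
    # Assumed location of the step logs inside the output directory.
    for name, lines in tail_error_logs("tmpBrass/logs").items():
        print(f"== {name}")
        print("\n".join(lines))
```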

Unstable results - multiple possible normal panel intersections

It is possible for the normal panel to include multiple possible hits which overlap with an event.

It is also possible for these hits to have the same number of contributing samples and, within those, the same number of total contributing reads.

The only way to cope with this is to order the data and select the first entry after binning by:

  1. Total samples contributing
  2. Total reads from contributing samples
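The ordering above can be sketched as follows. Two points are assumptions rather than stated behaviour: that higher counts are preferred, and that the hit coordinates are used as a final tie-breaker so that the choice never depends on input order. `PanelHit` and `pick_hit` are hypothetical names.

```python
from typing import NamedTuple, Sequence


class PanelHit(NamedTuple):
    """Hypothetical record for one normal-panel entry overlapping an event."""
    chrom: str
    start: int
    end: int
    samples: int  # total samples contributing
    reads: int    # total reads from contributing samples


def pick_hit(hits: Sequence[PanelHit]) -> PanelHit:
    """Select one hit reproducibly, whatever order the hits arrive in."""
    # Assumption: most samples first, then most reads; coordinates break
    # any remaining tie so the result is stable between runs.
    return min(hits, key=lambda h: (-h.samples, -h.reads, h.chrom, h.start, h.end))


a = PanelHit("1", 100, 200, samples=3, reads=40)
b = PanelHit("1", 150, 250, samples=3, reads=40)
print(pick_hit([a, b]) == pick_hit([b, a]))
# prints True
```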

get_rg_cns.R fails when sample does not have any reads in the segment

I've investigated by stepping through the R script, and this is occurring at this command:

bedtools coverage -d -abam gg -b get_rg_cns_tmp.bam.subset.segs | bedtools groupby -g 1,2,3 -c 5 -o mean | sort -k2,2n

There are no reads in the file get_rg_cns_tmp.bam.subset.

tmpBrass/logs/Sanger_CGP_Brass_Implement_filter.rg_cns.err:

+ /exports/igmm/software/pkg/el7/apps/R/3.4.1/bin/Rscript /exports/igmm/software/pkg/el7/apps/BRASS/6.1.2-1/lib/perl5/auto/share/module/Sanger-CGP-Brass-Implement/Rscripts/get_rg_cns.R /gpfs/igmmfs01/eddie/HGS-OvarianCancerA-SGP-WGS/variants/structural/brass/output/WW00246a/WW00246a_vs_WW00246b.r5 /gpfs/igmmfs01/eddie/HGS-OvarianCancerA-SGP-WGS/variants/structural/brass/output/WW00246a/WW00246a_vs_WW00246b.ngscn.abs_cn.bg /gpfs/igmmfs01/eddie/HGS-OvarianCancerA-SGP-WGS/variants/structural/brass/output/WW00246a/WW00246a_vs_WW00246b.ngscn.segments.abs_cn.bg /exports/igmm/eddie/HGS-OvarianCancerA-SGP-WGS/upload/2017-06-28.tumor-WW00246a/WW00246a/WW00246a-ready.bam 0.75 /exports/igmm/eddie/HGS-OvarianCancerA-SGP-WGS/variants/structural/brass/centtel.tsv Y Y /gpfs/igmmfs01/eddie/HGS-OvarianCancerA-SGP-WGS/variants/structural/brass/output/WW00246a/tmpBrass
Using following settings:
MIN_DIST_OF_CN_SEG_BKPT_TO_RG = 20000
MAX_GET_READS_EXTEND_DIST = 10000
MIN_WINDOW_BIN_COUNT = 10

*****
***** ERROR: Requested column 5, but database file - only has fields 1 - 0.
Error in res[, 4] : subscript out of bounds
Execution halted
2675.70user 50.67system 30:39.84elapsed 148%CPU (0avgtext+0avgdata 817604maxresident)k
0inputs+7552outputs (0major+18587294minor)pagefaults 0swaps
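The failure comes from indexing the fourth column of a result that has no rows. The guard below illustrates the idea in Python rather than in the R script itself: parse the pipeline output defensively and return an empty list when there is nothing to parse, so the caller can decide how to treat a segment with no reads. The tab-separated layout (chromosome, start, end, mean coverage) is assumed from the `groupby -g 1,2,3 -c 5 -o mean` call, and `parse_groupby_output` is a hypothetical helper.

```python
from typing import List, Tuple

Row = Tuple[str, int, int, float]


def parse_groupby_output(text: str) -> List[Row]:
    """Parse 'chrom<TAB>start<TAB>end<TAB>mean' lines; empty input gives []."""
    rows: List[Row] = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines instead of failing
        rows.append((fields[0], int(fields[1]), int(fields[2]), float(fields[3])))
    # Same ordering as the trailing numeric sort on column 2.
    return sorted(rows, key=lambda row: row[1])


rows = parse_groupby_output("")
if not rows:
    print("no coverage rows: segment has no reads")
```

Whether an empty segment should be reported as zero coverage or skipped is left open here.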

libInstall.R - needs updating if sticking with R3.1.3

VGAM from 1.0-4 requires R3.4.0+, see here.

The script needs updating along these lines:

# Target library directory is passed as the first command-line argument
instLib = commandArgs(TRUE)[1]

r = getOption("repos") # hard code the UK repo for CRAN
r["CRAN"] = "http://cran.uk.r-project.org"
options(repos = r)
rm(r)
source("http://bioconductor.org/biocLite.R")

# Install any packages not already present, then load them all
ipak <- function(pkg){
  new.pkg <- pkg[!(pkg %in% installed.packages()[, "Package"])]
  if (length(new.pkg))
    biocLite(new.pkg, ask=FALSE, lib=instLib)
  sapply(pkg, library, character.only = TRUE)
}

biocPackages <- c("data.table", "gam")
ipak(biocPackages)

# VGAM 1.0-4 and later need R 3.4.0+, so pin 1.0-3 from a local source tarball
install.packages("VGAM_1.0-3.tar.gz", repos=NULL, type="source", lib=instLib)

biocPackages <- c("stringr", "poweRlaw", "zlibbioc", "RColorBrewer")
ipak(biocPackages)

This includes a more robust way to add libraries.

Visualising SV vcfs + Calling Fusions.

Hi,

Thank you for processing the ICGC data. I have obtained about 250 patient vcfs from ICGC and I would like to call a specific fusion, caused by a translocation, based on the delly output.

  1. What would be the best method for this kind of query? I thought of visualising the calls and deciding by eye, but this raised a visualisation problem: neither the bedpe output nor the vcf output helped. (IGV and other genome browsers work, but it is really hard to assess translocations in them, so I wanted to draw a circos plot.)

  2. I have read about using snpEff or VEP (I am not sure about this). As far as I know, snpEff does not support translocations, so it won't work for my case.

  3. Lastly, forgive my ignorance about the following terminology: bedpe and reedname. Could you point me to at least one source about these two file extensions? (For example, what is the difference between bedpe and vcf? I know what a bed file is, but I couldn't link these together in the context of structural variation.)

Sorry for asking a bunch of questions. My graduate project is based on this data, so I am trying to process this information as well as possible.

Thank you for your patience,

Best,

Tunc/.

Help needed, velvet jobs being killed due to exonerate error

Hi,

I am stuck at the assembly step, where exonerate is invoked and the iterations terminate with the error shown below:

Aligning with --model affine:local for vertices: 3 at /opt/apps/bioinformatics/brass/5.3.3_threaded/lib/perl5/Bio/Brass/VelvetGraph.pm line 189.
exonerate --verbose 0 --showalignment no --showvulgar no --showcigar no --querytype dna --targettype dna --ryo %qas*Q %qab %qae %qS\n%tas*T %tab %tae %tS %ti\n** %qi %s\n --query /tmp/brassAssembly_IoE7e1/tmpHFkCrf/247-6cjOsk/65/brassAssExonerateHPtSjls.fa --target /tmp/brassAssembly_IoE7e1/tmpHFkCrf/247-6cjOsk/65/brassAssExonerateRef8lFNBiA.fa --percent 80 --model affine:local
Bio::Brass::Alignment::_run_exonerate(): Exonerate error: itteration: 1 (of 5), error code: 139, msg: at /opt/apps/bioinformatics/brass/5.3.3_threaded/lib/perl5/Bio/Brass/Alignment.pm line 384.
Bio::Brass::Alignment::_run_exonerate(): Exonerate error: itteration: 2 (of 5), error code: 139, msg: at /opt/apps/bioinformatics/brass/5.3.3_threaded/lib/perl5/Bio/Brass/Alignment.pm line 384.
Bio::Brass::Alignment::_run_exonerate(): Exonerate error: itteration: 3 (of 5), error code: 139, msg: at /opt/apps/bioinformatics/brass/5.3.3_threaded/lib/perl5/Bio/Brass/Alignment.pm line 384.
Bio::Brass::Alignment::_run_exonerate(): Exonerate error: itteration: 4 (of 5), error code: 139, msg: at /opt/apps/bioinformatics/brass/5.3.3_threaded/lib/perl5/Bio/Brass/Alignment.pm line 384.
Bio::Brass::Alignment::_run_exonerate(): Exonerate error: killed at itteration: 5 (of 5), error code: 139, msg: at /opt/apps/bioinformatics/brass/5.3.3_threaded/lib/perl5/Bio/Brass/Alignment.pm line 384.
2.52user 1.52system 0:24.18elapsed 16%CPU (0avgtext+0avgdata 132268maxresident)k
0inputs+31032outputs (0major+116882minor)pagefaults 0swaps

Complete error file attached.
Sanger_CGP_Brass_Implement_assemble.3.err.txt

Any ideas/suggestions as to what is happening would be greatly appreciated!
Thanks!
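As a starting point for diagnosis, the error code 139 in the log above can be decoded. Under the usual shell convention an exit code above 128 means the process was terminated by signal `code - 128`. The log does not say whether the value is a shell-style exit code or a raw wait status; for 139 both readings point to signal 11. The helper name below is hypothetical.

```python
import signal
from typing import Optional


def signal_from_exit_code(code: int) -> Optional[str]:
    """Return the signal name implied by a shell-style exit code, if any."""
    if code <= 128:
        return None  # ordinary exit status, not a signal
    try:
        return signal.Signals(code - 128).name
    except ValueError:
        return None  # no signal with that number on this platform


print(signal_from_exit_code(139))
# prints SIGSEGV
```

A segmentation fault on every retry suggests the problem lies in the exonerate binary or its input rather than in a transient resource limit, although the log alone does not confirm this.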
