him72 / chimerascan


Automatically exported from code.google.com/p/chimerascan

License: GNU General Public License v3.0

Python 95.96% C 0.91% JavaScript 1.90% CSS 0.08% HTML 1.16%

chimerascan's Issues

Use breakpoint homology to refine junction spanning read counts

Thanks to the deFuse project for this nice idea.

For each candidate chimera, compare the sequence of the candidate 3' partner 
with the sequence of the normal gene.  Count the number of bases of exact 
homology at the junction and include this information when remapping to find 
junction spanning reads.

When tallying the junction spanning reads, only count reads that span further 
than the regions of homology in the 5'/3' partners.
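
A minimal Python sketch of the homology filter described above. The function names, the half-open read coordinates on the breakpoint sequence, and the separate left/right homology lengths are assumptions for illustration; this is not chimerascan's or deFuse's actual code.

```python
def count_homology(normal_seq, partner_seq):
    """Count leading bases that are identical between two sequences.

    normal_seq:  bases the unrearranged gene would have past the junction
    partner_seq: bases the candidate partner contributes past the junction
    (both arguments are hypothetical inputs, read away from the junction)
    """
    n = 0
    for a, b in zip(normal_seq.upper(), partner_seq.upper()):
        if a != b:
            break
        n += 1
    return n


def spans_beyond_homology(read_start, read_end, breakpoint,
                          homology_left, homology_right):
    """Return True if the read [read_start, read_end) covers at least one
    base outside the homologous region on BOTH sides of the breakpoint."""
    return (read_start < breakpoint - homology_left and
            read_end > breakpoint + homology_right)


# Example: the first three bases past the junction are identical,
# so a read needs more than 3 bases of anchor on that side to count.
right = count_homology("ACGTT", "ACGAA")
print(right, spans_beyond_homology(90, 110, 100, 2, right))
```

Reads that reach past the breakpoint only by the homologous bases cannot distinguish the chimera from the normal transcript, which is why they are excluded from the tally here.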

Implementing this additional filter is trivial and should account for a lot of 
the candidates with large numbers of junction-spanning reads with a small 
amount of "anchor".

Not high priority because we have other means of filtering these reads.

Original issue reported on code.google.com by [email protected] on 5 Feb 2011 at 4:57

Fragment size in "detecting discordant reads" step

Dear Chimerascan authors,

I have paired-end reads with a ~500 bp library size (as set by the wet lab) and 
104 bp Illumina reads. With this design, I was wondering what the fragment 
size will be for the discordant read detection. Will that be 500 bp, or 
something else?

Also, should I specify the insert size, or will the program automatically 
estimate the insert size based on the initial alignments?

Also, is there any way you could include IGV screenshots 
showing a successful chimera detected by the algorithm?

I am running chimerascan 0.4.3 on 64-bit linux

Thank you.

------------

Step 6: Discover discordant reads

The realigned reads are searched for evidence of discordant reads. A discordant 
read is currently defined as follows:

   1. The fragment does not align to the genome within user-specified fragment size range (default: 1000bp)
   2. The pair does not align to a single transcript
   3. The pair does not align to different transcripts that share exonic sequences on the same strand. In other words, reads that map to different isoforms of the same gene are excluded, as it is not the goal of chimerascan to discover unannotated splicing patterns within a single gene.
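
The three criteria quoted above can be sketched as a single predicate. This is only an illustration: the function name, the arguments, and the `share_exons` callback are hypothetical, and the fragment size range is treated as a simple maximum, which is an assumption.

```python
def is_discordant(genomic_span, transcripts_1, transcripts_2,
                  share_exons, max_fragment_length=1000):
    """Apply the three documented criteria to one read pair.

    genomic_span:   fragment length on the genome, or None if the mates
                    do not map to the same chromosome (hypothetical input)
    transcripts_1:  set of transcript IDs mate 1 aligns to
    transcripts_2:  set of transcript IDs mate 2 aligns to
    share_exons:    callable(t1, t2) -> True if the two transcripts share
                    exonic sequence on the same strand (hypothetical helper)
    """
    # 1. The fragment aligns to the genome within the allowed size: concordant.
    if genomic_span is not None and genomic_span <= max_fragment_length:
        return False
    # 2. Both mates align to a single transcript: concordant.
    if transcripts_1 & transcripts_2:
        return False
    # 3. The mates hit different isoforms of the same gene: excluded.
    for t1 in transcripts_1:
        for t2 in transcripts_2:
            if share_exons(t1, t2):
                return False
    return True
```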

Original issue reported on code.google.com by [email protected] on 12 Oct 2011 at 9:30

crash after "Extracting single-mapped reads that may span breakpoints"

What steps will reproduce the problem?
1. Running on one of my pairs of fastq files always seems to crash in the same 
place
2.
3.

What is the expected output? What do you see instead?
I assume it is crashing, but I don't know why.  Other samples succeeded using 
the same configuration. There is no more information other than the output ends 
with:
2013-10-01 12:16:09,037 - root - INFO - Extracting single-mapped reads that may 
span breakpoints
2013-10-01 12:16:09,039 - root - DEBUG - Matching single-mapped frags to 5' 
chimeras

What version of the product are you using? On what operating system?
chimerascan-0.4.3-1
on Gnu Linux

Please provide any additional information below.
Contents of my tmp directory
3.7G Sep 27 19:28 reads_2.fq
3.7G Sep 27 19:29 reads_1.fq
602M Sep 27 21:07 unaligned_1.fq
602M Sep 27 21:07 unaligned_2.fq
143K Sep 27 21:07 maxmulti_1.fq
143K Sep 27 21:07 maxmulti_2.fq
1.3G Sep 28 00:09 realigned_reads.bam
749M Sep 28 03:52 gene_paired_reads.bam
522M Sep 28 03:52 genome_paired_reads.bam
262M Sep 28 03:52 unmapped_reads.bam
885K Sep 28 03:52 complex_reads.bam
 92M Sep 28 03:55 discordant_reads.bedpe
 92M Sep 28 03:56 discordant_reads.srt.bedpe
5.4G Sep 28 05:18 encompassing_chimeras.txt
1.9G Sep 28 05:34 encompassing_chimeras.filtered.txt
1.9G Sep 28 05:41 encompassing_chimeras.breakpoint_sorted.txt
 58M Sep 28 05:47 breakpoints.fa
 61M Sep 28 05:48 breakpoints.txt
 14M Sep 28 05:48 breakpoints.4.ebwt
2.5M Sep 28 05:48 breakpoints.3.ebwt
 27M Sep 28 05:49 breakpoints.1.ebwt
6.8M Sep 28 05:49 breakpoints.2.ebwt
 27M Sep 28 05:52 breakpoints.rev.1.ebwt
6.8M Sep 28 05:52 breakpoints.rev.2.ebwt
718K Sep 28 05:56 encomp_spanning_reads.fq
 58M Sep 28 06:01 unaligned_spanning_reads.fq
410M Sep 28 06:08 singlemap_reads.srt.bam
4.3M Sep 28 06:08 singlemap_reads.srt.bam.bai
   0 Oct  1 14:42 tmp_singlemap_seqs.txt

Original issue reported on code.google.com by [email protected] on 1 Oct 2013 at 7:00

Dynamic read trimming of poor quality reads

We have observed certain sets of reads that have poor quality bases at the 3' 
end.  This can cause 3' end segments to be unmapped, and place a greater burden 
on the segmented alignment phase.  Not only is this slower, but it will 
inevitably lead to a greater number of false positives (garbage in => garbage 
out).

Dynamic read trimming is a viable solution here.  The first goal is to follow 
the BWA trimming paradigm. 
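
For reference, a Python sketch of 3' quality trimming in the style of BWA's `-q` option: walk in from the 3' end, accumulate `threshold - quality`, and cut where that running sum peaks. The function name and the `min_length` parameter are assumptions for this sketch, and qualities are taken to be already decoded to integer Phred scores.

```python
def bwa_style_trim_length(quals, threshold, min_length=1):
    """Return how many 5' bases to keep after 3' quality trimming.

    quals:      list of integer Phred scores, 5' to 3'
    threshold:  trimming quality (the role played by BWA's -q value)
    min_length: never trim the read shorter than this (assumed parameter)
    """
    keep = len(quals)          # default: no trimming
    running = best = 0
    for i in range(len(quals) - 1, min_length - 1, -1):
        running += threshold - quals[i]
        if running < 0:        # good bases now outweigh the bad ones seen so far
            break
        if running > best:     # best cut point found so far is just before base i
            best = running
            keep = i
    return keep


quals = [30, 30, 30, 2, 2]
print(bwa_style_trim_length(quals, threshold=20))  # keeps the first 3 bases
```

Because this produces reads of different lengths, it depends on the variable-length read support mentioned in the note that follows.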

Note that this feature cannot be enabled until variable length reads are fully 
supported throughout the pipeline.  There are still areas where fixed length 
reads are assumed and shortcuts are taken.

Original issue reported on code.google.com by [email protected] on 10 Feb 2011 at 4:20

Interconvert between transcriptomic and genomic coordinates

Develop fast scripts to interconvert between transcriptomic and genomic 
coordinates.

This will facilitate building a purely genomic BAM file with mapping results 
for viewing using genome browsers
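
A small Python sketch of what such a conversion could look like. It assumes 0-based coordinates and exons given as half-open `(start, end)` genomic intervals sorted by genomic position; the function names are hypothetical and not part of chimerascan.

```python
def transcript_to_genome(pos, exons, strand):
    """Map a 0-based transcript position to a 0-based genomic position."""
    total = sum(end - start for start, end in exons)
    if not 0 <= pos < total:
        raise ValueError("position outside transcript")
    if strand == "-":
        # Transcript position 0 is the last genomic base on the minus strand.
        pos = total - 1 - pos
    for start, end in exons:
        length = end - start
        if pos < length:
            return start + pos
        pos -= length


def genome_to_transcript(gpos, exons, strand):
    """Map a genomic position to a transcript position, or None if intronic."""
    total = sum(end - start for start, end in exons)
    offset = 0
    for start, end in exons:
        if start <= gpos < end:
            tpos = offset + (gpos - start)
            return tpos if strand == "+" else total - 1 - tpos
        offset += end - start
    return None


exons = [(100, 110), (200, 220)]
print(transcript_to_genome(10, exons, "+"), genome_to_transcript(219, exons, "-"))
```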

Original issue reported on code.google.com by [email protected] on 2 Dec 2011 at 5:29

bowtie-build failed to create alignment index

What steps will reproduce the problem?
pgm=~/bin/Chimerascan/chimerascan-0.4.5/bin/chimerascan_index.py
genome=/users/rg/projects/references/Genome/H.sapiens/hg19/Homo_sapiens.GRCh37.chromosomes.chr.M.fa
module purge
module load Bowtie/0.12.7-goolf-1.4.10-no-OFED 
time python $pgm $genome hg19.ucsc_genes.txt hg19_ucsc_genes_Index 2> hg19_ucsc_genes_Index/chimerascan_index_ucscgenes.err > hg19_ucsc_genes_Index/chimerascan_index_ucscgenes.out

What is the expected output? What do you see instead?
The chimerascan index. I do not see it; I only see these files:
ll hg19_ucsc_genes_Index/
total 4.6G
-rw-r--r-- 1 sdjebali CRG_Lab_Roderic_Guigo  17M Jan  9 15:44 gene_features.txt
-rw-r--r-- 1 sdjebali CRG_Lab_Roderic_Guigo 3.2G Jan  9 15:44 align_index.fa
-rw-r--r-- 1 sdjebali CRG_Lab_Roderic_Guigo 2.7M Jan  9 15:45 align_index.fa.fai
-rw-r--r-- 1 sdjebali CRG_Lab_Roderic_Guigo 730M Jan  9 15:46 align_index.4.ebwt
-rw-r--r-- 1 sdjebali CRG_Lab_Roderic_Guigo 651K Jan  9 15:46 align_index.3.ebwt
-rw-r--r-- 1 sdjebali CRG_Lab_Roderic_Guigo    0 Jan  9 15:46 align_index.2.ebwt


What version of the product are you using? On what operating system?
chimerascan-0.4.5 on Linux ant-login5.linux.crg.es 2.6.32-504.1.3.el6.x86_64 #1 
SMP Tue Nov 11 14:19:04 CST 2014 x86_64 x86_64 x86_64 GNU/Linux


Please provide any additional information below.
I ran the chimerascan indexer with the ucsc gene file provided here, the hg19 
genome without haplotypes and Bowtie 0.12.7, and I get the following error 
message after several minutes:

[fai_load] build FASTA index.
2015-01-09 15:45:21,906 - root - INFO - Building bowtie index
2015-01-09 16:04:02,486 - root - ERROR - bowtie-build failed to create 
alignment index

together with a core.

Any idea why?
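
One way to confirm from a script that the index is incomplete, as the listing above suggests (a zero-byte `align_index.2.ebwt` and no `.1.ebwt` or `.rev.*.ebwt` files): a small Python check over the six files a Bowtie 1 small index normally consists of. The function name is hypothetical and this only reports the state of the files; it does not explain why bowtie-build dumped core.

```python
import os

# The six files of a Bowtie 1 (small) index, as also seen for
# breakpoints.*.ebwt in other reports on this page.
BOWTIE1_SUFFIXES = (".1.ebwt", ".2.ebwt", ".3.ebwt", ".4.ebwt",
                    ".rev.1.ebwt", ".rev.2.ebwt")


def incomplete_index_files(prefix):
    """Return (path, problem) pairs for missing or empty index files."""
    problems = []
    for suffix in BOWTIE1_SUFFIXES:
        path = prefix + suffix
        if not os.path.isfile(path):
            problems.append((path, "missing"))
        elif os.path.getsize(path) == 0:
            problems.append((path, "empty"))
    return problems


for path, problem in incomplete_index_files("hg19_ucsc_genes_Index/align_index"):
    print(problem, path)
```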

Original issue reported on code.google.com by [email protected] on 9 Jan 2015 at 5:03

Different length reads for R1 and R2

The sequencing library I am dealing with requires trimming of 15 bases on the 
reverse read only, so I end up with 100 base Fwd read and an 85 base Rev read. 
When I run chimerascan with these files it gives me an error that the pairs are 
different lengths. I decided maybe that was just a sanity check, so I commented 
out the check for equal length pairs and re-ran it. It seems to run ok. Is 
there any issue with this? Do reads have to be the same length?

I did notice that when looking for discordant reads there were some errors 
saying something like:
2014-10-16 16:15:12,503 - root - WARNING - Could not extract sequence of length 
>101 from 5' partner at gene_uc031ret.1:0-42, only retrieved sequence of length 
42
but I am not sure this is related?

Original issue reported on code.google.com by [email protected] on 20 Oct 2014 at 6:57

Indel and rearrangement calling

Single reads mapped as multiple segments may reveal indels and small 
rearrangements.  These can be detected by finding reads with split mappings to 
the same gene, but where the segment mappings are not contiguous.

It would be a relatively small task to add checks for these types of 
aberrations and report them as a complementary module to fusion discovery
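
A rough Python sketch of the contiguity check this would need. The segment representation `(read_start, ref_start, length)`, the function name, and the three event labels are all assumptions made for illustration.

```python
def find_noncontiguous_segments(segments):
    """Flag adjacent segments of one read whose mappings are not contiguous.

    segments: list of (read_start, ref_start, length) tuples, all mapped to
              the same gene and strand, sorted by read_start (assumed layout)
    Returns a list of (kind, read_gap, ref_gap) tuples.
    """
    events = []
    for (r1, g1, n1), (r2, g2, _n2) in zip(segments, segments[1:]):
        read_gap = r2 - (r1 + n1)   # read bases between the two segments
        ref_gap = g2 - (g1 + n1)    # reference bases between the two mappings
        if ref_gap == read_gap:
            continue                # contiguous: nothing to report
        if ref_gap < 0:
            kind = "rearrangement"  # next segment maps upstream of the previous one
        elif ref_gap > read_gap:
            kind = "deletion"       # reference bases missing from the read
        else:
            kind = "insertion"      # read bases absent from the reference
        events.append((kind, read_gap, ref_gap))
    return events


print(find_noncontiguous_segments([(0, 100, 25), (25, 135, 25)]))
```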

Original issue reported on code.google.com by [email protected] on 18 Jan 2011 at 12:57

chimerascan_html_table.py problem

What steps will reproduce the problem?

3.

What is the expected output? What do you see instead?

html file should be created. However, I got:

=======
/usr/local/bin/chimerascan_html_table.py: line 5: 
Created on Feb 12, 2011

@author: mkiyer
: command not found

=======


What version of the product are you using? On what operating system?
v0.3.3. The OS is sles11 sp1

Please provide any additional information below.

standard test following the wiki page

Original issue reported on code.google.com by [email protected] on 25 Feb 2011 at 6:04

Support for variable read lengths

Add support for variable read lengths.  Most of this functionality is already 
built-in, but need to develop unit-tests and try this.

Variable read length support will allow trimming based on read qualities in the 
initial read parsing stage

Original issue reported on code.google.com by [email protected] on 14 Jan 2011 at 6:40

File "/usr/bin/chimerascan_run.py", line 24, in <module>

What steps will reproduce the problem?
I created the index and ran the command below earlier:
./chimerascan_run.py 
/MGMSTAR1/SHARED/STAFF/KIRAN.P/chimerascan0.4.5/chimerascan/ 
/MGMSTAR1/SHARED/ANALYSIS/NGS_P1216/Tobe_shipped/Transcriptome_PairedEnd/RAW_FASTQ/MDB.RI.T40_R1.fastq.gz 
/MGMSTAR1/SHARED/ANALYSIS/NGS_P1216/Tobe_shipped/Transcriptome_PairedEnd/RAW_FASTQ/MDB.RI.T40_R2.fastq.gz 
/MGMSTAR1/SHARED/ANALYSIS/NGS_P1216/Fusions/ChimeraScan/MDB.RI.T40
Now I get the error below when I run the same command, with or without python at the beginning.



What is the expected output? What do you see instead?

Traceback (most recent call last):
  File "/usr/bin/chimerascan_run.py", line 24, in <module>
    from chimerascan import __version__
ImportError: No module named chimerascan

What version of the product are you using? On what operating system?

chimerascan-0.4.5

What should be done to fix this error? Am I doing something wrong?

Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 12 Aug 2015 at 4:46

Crashed with error "'bowtie-build' failed to create breakpoint index"

What steps will reproduce the problem?
Running chimerascan-0.4.5 with command line arguments with Paired End RNA-Seq 
data

What is the expected output? What do you see instead?
A <filename>.bedpe output file was expected.
I see the error "'bowtie-build' failed to create breakpoint index" instead.

What version of the product are you using? On what operating system?
chimerascan-0.4.5, OS is Linux compute000 2.6.32-220.el6.x86_64 #1 SMP Wed Nov 
9 08:03:13 EST 2011 x86_64 x86_64 x86_64 GNU/Linux

Please provide any additional information below.
I am not sure why it failed, because I ran it previously for another sample and 
chimerascan ran for 3-4 days but did give me a resulting .bedpe file.

Original issue reported on code.google.com by [email protected] on 9 Apr 2014 at 3:22

run from gzipped fastq files

add an option to run chimerascan from gzipped fastq files
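
A minimal sketch of how transparent gzip handling could look, written for Python 3 (chimerascan itself targets Python 2). The helper name and the choice to detect compression by file extension are assumptions.

```python
import gzip


def open_fastq(path):
    """Open a FASTQ file for text reading, decompressing if it ends in .gz."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def count_reads(path):
    """Count FASTQ records, assuming the usual four lines per record."""
    with open_fastq(path) as handle:
        return sum(1 for _ in handle) // 4
```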

Original issue reported on code.google.com by [email protected] on 17 Feb 2011 at 7:40

easy to fix linker error on some systems including macosx

There is a linker error on some systems.

All that is required to fix it is adding the static keyword to the following 
inlined function:

https://code.google.com/p/chimerascan/source/browse/tags/chimerascan_v0.4.3.1/chimerascan/pysam/samtools/ksort.h#144

Original issue reported on code.google.com by [email protected] on 16 May 2014 at 10:50

IOError: [Errno 24] Too many open files:

What steps will reproduce the problem?
2014-03-11 22:31:04,573 - root - INFO - Sorting BEDPE file
Traceback (most recent call last):
  File "/usr/local/bin/chimerascan_run.py", line 1002, in <module>
    main()
  File "/usr/local/bin/chimerascan_run.py", line 999, in main
    sys.exit(run_chimerascan(runconfig))
  File "/usr/local/bin/chimerascan_run.py", line 659, in run_chimerascan
    tmp_dir=tmp_dir)
  File "/usr/local/lib64/python2.6/site-packages/chimerascan/pipeline/discordant_reads_to_bedpe.py", line 92, in sort_bedpe
    tempdirs=tempdirs)
  File "/usr/local/lib64/python2.6/site-packages/chimerascan/lib/batch_sort.py", line 46, in batch_sort
    output_chunk = open(os.path.join(tempdir,'%06i'%len(chunks)),'w+b',64*1024)
IOError: [Errno 24] Too many open files: 
'/home/calogero/Documents/data/fusion_comparison/positive.tests/MCF-7/illu/1st/tmp/001020'

It seems that there is a limit on the number of files that Python can open.
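
Errno 24 (EMFILE) is the operating system's per-process limit on open file descriptors rather than a limit built into Python. A hedged sketch of how a script could inspect and raise the soft limit on Unix before sorting; the function name is hypothetical, and raising the limit in the shell with `ulimit -n` before launching the run is an alternative.

```python
import resource


def raise_open_file_limit(desired):
    """Try to raise the soft open-file limit to `desired` (capped at the
    hard limit) and return the soft limit in effect afterwards."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        desired = min(desired, hard)
    # Only ever raise the limit; leave an unlimited soft limit alone.
    if soft != resource.RLIM_INFINITY and desired > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (desired, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


print(resource.getrlimit(resource.RLIMIT_NOFILE))
```

The traceback shows the failure at temporary chunk `001020`, which suggests the sort keeps one file open per chunk, so the limit would need to exceed the number of chunks plus the files the process already holds.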


What is the expected output? What do you see instead?
No output; the program stops because of the above error.


What version of the product are you using? On what operating system?
chimerascan-0.4.5 on Suse Enterprise



Please provide any additional information below.
I attached the nohup.out file of the run.
I also add the commands used to run chimerascan
nohup chimerascan_run.py /home/calogero/bin/chimerascan-0.4.5/index -v -p 64 
--filter-unique-frags=2  --library-type=fr-firststrand 
/home/calogero/Documents/data/fusion_comparison/positive.tests/MCF-7/illu/MCF7w4e8spikesmrna_S1_L001_R1_001.fastq 
/home/calogero/Documents/data/fusion_comparison/positive.tests/MCF-7/illu/MCF7w4e8spikesmrna_S1_L002_R1_001.fastq 
/home/calogero/Documents/data/fusion_comparison/positive.tests/MCF-7/illu/1st &

nohup chimerascan_run.py /home/calogero/bin/chimerascan-0.4.5/index -v -p 64 
--filter-unique-frags=2  --library-type=fr-secondstrand 
/home/calogero/Documents/data/fusion_comparison/positive.tests/MCF-7/illu/MCF7w4e8spikesmrna_S1_L001_R1_001.fastq 
/home/calogero/Documents/data/fusion_comparison/positive.tests/MCF-7/illu/MCF7w4e8spikesmrna_S1_L002_R1_001.fastq 
/home/calogero/Documents/data/fusion_comparison/positive.tests/MCF-7/illu/2nd &

Original issue reported on code.google.com by [email protected] on 12 Mar 2014 at 6:46

Attachments:

nohup.out

Provide chimeric reads as a BAM file

Provide a BAM file in the final output containing the chimeric reads.

Will enable visualization of reads in genome browsers

Original issue reported on code.google.com by [email protected] on 2 Dec 2011 at 5:24

Improve localization of 5' and 3' exons

Use mismatch information (e.g. the MD tag in SAM) to trim read alignments that 
have mismatches in the first/last bases.  This splash occurs when aligning 
chimeric reads that span a junction by <= 2 bases (or the number of allowed 
mismatches in the alignment tool).

Trim the beginning/end of the joined reads to eliminate these leading/trailing 
mismatches and improve alignment. 

The current solution is to simply add/subtract 2 from read alignments during 
the chimera nomination phase.
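
As an illustration of the MD-tag approach, the Python sketch below expands an MD string into per-base match flags and reports how many bases to trim at each end when a mismatch falls within a small window. The function names and the `window` parameter are assumptions; the sketch also ignores insertions and soft clipping, which the MD tag does not describe, so a real implementation would need the CIGAR string as well.

```python
import re

# MD tokens: a run of matches, a deletion (^BASES), or a single mismatch base.
_MD_TOKEN = re.compile(r"(\d+)|(\^[A-Za-z]+)|([A-Za-z])")


def md_to_match_flags(md):
    """Expand an MD tag into one True/False (match/mismatch) per aligned read base."""
    flags = []
    for run, _deletion, mismatch in _MD_TOKEN.findall(md):
        if run:
            flags.extend([True] * int(run))
        elif mismatch:
            flags.append(False)
        # Deleted reference bases consume no read bases, so they are skipped.
    return flags


def edge_trim(md, window=2):
    """Return (left, right): bases to trim so that no mismatch remains
    within `window` bases of either end of the original alignment."""
    flags = md_to_match_flags(md)
    left = right = 0
    for i in range(min(window, len(flags))):
        if not flags[i]:
            left = i + 1
        if not flags[-1 - i]:
            right = i + 1
    return left, right


print(edge_trim("0A48T1"))
```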

Original issue reported on code.google.com by [email protected] on 17 Jan 2011 at 6:55

Test performance with BWA aligner

Test chimera detection performance with BWA as the initial alignment tool

Original issue reported on code.google.com by [email protected] on 2 Aug 2011 at 3:01

Detailed explanation of chimerascan output

Hello,
      Where can I get a detailed explanation of the header of the chimerascan BEDPE file, e.g. the chimera classes based on the orientation of the genes?


Thanks and Regards,
Pawan

Original issue reported on code.google.com by [email protected] on 20 Aug 2013 at 2:41

csamtools error

Hello,

I am getting an error message when running the indexer:

Traceback (most recent call last):
  File "bin/chimerascan_index.py", line 32, in <module>
    import chimerascan.pysam as pysam
  File "/home/carmenise/chimerascan-0.4.5/chimerascan/pysam/__init__.py", line 1, in <module>
    from csamtools import *
ImportError: No module named csamtools

What should I do to fix it ?

Thank you

Original issue reported on code.google.com by [email protected] on 10 Jun 2013 at 1:06

Add sequences for Poly-A+ tails

Add A's to the end of each transcript in order to align poly-a tail reads and 
improve sensitivity.
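
A short Python sketch of the idea, operating on FASTA lines. The tail length of 50 is an arbitrary placeholder, the function name is hypothetical, and the sketch assumes the transcript sequences are stored in sense orientation so that the tail belongs at the 3' end.

```python
def append_poly_a(fasta_lines, tail_length=50):
    """Yield FASTA lines with `tail_length` A's appended to each record.

    Each record is emitted as a header line plus one unwrapped sequence line.
    """
    header, seq = None, []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header
                yield "".join(seq) + "A" * tail_length
            header, seq = line, []
        elif line:
            seq.append(line)
    if header is not None:
        yield header
        yield "".join(seq) + "A" * tail_length


for out in append_poly_a([">tx1", "ACG", "T", ">tx2", "GG"], tail_length=3):
    print(out)
```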

Original issue reported on code.google.com by [email protected] on 4 Jul 2011 at 2:41

Sudden crash during runtime

What steps will reproduce the problem?
1. Run analysis using version "0.5.0" from the SVN repo; pysam version "0.7.5"
2. Use default settings and "-p 4" on a 16 core machine

What is the expected output? What do you see instead?
Expected: results, instead: an error message in realigned_reads.log:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/chimerascan/pipeline/sam_to_bam_pesr.py", line 68, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/chimerascan/pipeline/sam_to_bam_pesr.py", line 65, in main
    return sam_to_bam_pesr(args.input_sam_file, args.input_fastq_file, args.output_bam_file)
  File "/usr/local/lib/python2.7/dist-packages/chimerascan/pipeline/sam_to_bam_pesr.py", line 34, in sam_to_bam_pesr
    assert r.qname == fqrec.qname
AssertionError
Error while flushing and closing output
terminate called after throwing an instance of 'int'


What version of the product are you using? On what operating system?
* chimerascan "0.5.0" (from SVN repo)
* pysam "0.7.5" (from tar.gz source)
* Debian, saucy

Please provide any additional information below.
The tool was running for quite some time:

Warning: -M is deprecated.  Use -D and -R to adjust effort instead.
2013-11-04 15:53:41,981 - root - DEBUG - Reading transcript features
2013-11-04 15:53:42,861 - root - DEBUG - Creating genome SAM header
2013-11-04 15:53:45,063 - root - DEBUG - Creating transcript to genome map
2013-11-04 15:53:45,363 - root - DEBUG - Converting transcriptome to genome BAM
61460496 reads; of these:
  61460496 (100.00%) were paired; of these:
    21692002 (35.29%) aligned concordantly 0 times
    20838036 (33.90%) aligned concordantly exactly 1 time
    18930458 (30.80%) aligned concordantly >1 times
64.71% overall alignment rate
2013-11-04 22:07:04,670 - root - DEBUG - Paired fragments: 39768494
2013-11-04 22:07:04,748 - root - DEBUG - Unpaired fragments: 21692002
2013-11-04 22:07:04,758 - root - DEBUG - Found 122920992 fragments



I hope you can fix this issue; it occurs in all my samples.

Best regards,

Youri

Original issue reported on code.google.com by [email protected] on 5 Nov 2013 at 8:47

Detail about output file

Hello,

Where can I find a detailed description of all the files in the tmp folder?


Thanks,

Best regards

Francesca

Original issue reported on code.google.com by [email protected] on 9 Jan 2014 at 9:51

CHIMERASCAN 0.4.5 runs for hours then gets killed by my system

What steps will reproduce the problem?
1. #CHIMERASCAN-0.4.5 setup
export PYTHONPATH=/shared/app/chimerascan-0.4.5/lib64/python2.6/site-packages:$PYTHONPATH
export PATH=/shared/app/chimerascan-0.4.5/chimerascan/:$PATH
python /shared/app/chimerascan-0.4.5/chimerascan/chimerascan_run.py -p 8 
/shared/app/BOWTIE/indexes/CHIMERASCAN_INDEXES $a1 $a2 
/home/hazards/Project_DW/Sample_99/Nov14_CHIMERASCAN_OUT
2. $a1 $a2  are variables referring to specific read pair fastq files
3. I am running the program on human lung derived RNASeq fastq generated by an 
Illumina sequencer

What is the expected output? What do you see instead?
chimeras.bedpe was expected

Here's what I see:
.
.
.
2013-11-15 09:18:55,808 - root - WARNING - Could not extract sequence of length 
>101 from 3' partner at gene_uc002ect.2:0-101, only retrieved sequence of 
length 94
2013-11-15 09:18:56,438 - root - WARNING - Could not extract sequence of length 
>101 from 3' partner at gene_uc010vft.1:0-101, only retrieved sequence of 
length 90
2013-11-15 09:18:56,771 - root - WARNING - Could not extract sequence of length 
>101 from 5' partner at gene_uc011mtj.1:0-96, only retrieved sequence of length 
96
2013-11-15 09:19:06,915 - root - WARNING - Could not extract sequence of length 
>101 from 3' partner at gene_uc010zpm.1:1741-1842, only retrieved sequence of 
length 99
2013-11-15 09:19:09,269 - root - INFO - Filtering encompassing chimeras with 
few supporting reads
2013-11-15 09:57:34,874 - root - INFO - Extracting breakpoint sequences from 
chimeras
2013-11-15 10:16:30,598 - root - INFO - Building bowtie index of breakpoint 
spanning sequences
2013-11-15 10:35:32,445 - root - INFO - Extracting encompassing reads that may 
extend past breakpoints
2013-11-15 10:51:26,686 - root - INFO - Separating unmapped and single-mapping 
reads that may span breakpoints
2013-11-15 11:05:38,719 - root - INFO - Extracting single-mapped reads that may 
span breakpoints
/home/hazards/.lsbatch/1384447987.22825.shell: line 42: 10199 Killed            
      python /shared/app/chimerascan-0.4.5/chimerascan/chimerascan_run.py -p 8 
/shared/app/BOWTIE/indexes/CHIMERASCAN_INDEXES 

What version of the product are you using? On what operating system?
Chimerascan 0.4.5
Bowtie 1.0.0
Python 2.6.6
RHEL 6.2 Linux 
LSF 7.2


Please provide any additional information below.
The log files suggest that the initial runs complete through

isize_dist.txt, breakpoint_bowtie_index.log, and tmp_singlemap_seqs.txt

-rw-r--r-- 1 hazards hazards       1446 Nov 15 07:23 runconfig.xml
-rw-r--r-- 1 hazards hazards 5449339058 Nov 15 09:33 aligned_reads.bam
-rw-r--r-- 1 hazards hazards 7386269125 Nov 15 10:45 sorted_aligned_reads.bam
-rw-r--r-- 1 hazards hazards    9578704 Nov 15 10:48 
sorted_aligned_reads.bam.bai
-rw-r--r-- 1 hazards hazards       6613 Nov 15 10:48 isize_dist.txt
drwxr-xr-x 2 hazards hazards        104 Nov 16 20:22 log
drwxr-xr-x 2 hazards hazards       8192 Nov 16 21:38 tmp


Sample_86/Nov14_CHIMERASCAN_OUT/log:
total 24
-rw-r--r-- 1 hazards hazards   465 Nov 15 09:33 bowtie_alignment.log
-rw-r--r-- 1 hazards hazards   457 Nov 15 15:07 bowtie_trimmed_realignment.log
-rw-r--r-- 1 hazards hazards 12663 Nov 16 20:53 breakpoint_bowtie_index.log

Sample_86/Nov14_CHIMERASCAN_OUT/tmp:
total 110869964
-rw-r--r-- 1 hazards hazards  5797564944 Nov 15 08:03 reads_2.fq
-rw-r--r-- 1 hazards hazards  5797564944 Nov 15 08:03 reads_1.fq
-rw-r--r-- 1 hazards hazards  1792743679 Nov 15 09:32 unaligned_1.fq
-rw-r--r-- 1 hazards hazards  1792743679 Nov 15 09:32 unaligned_2.fq
-rw-r--r-- 1 hazards hazards      193024 Nov 15 09:32 maxmulti_1.fq
-rw-r--r-- 1 hazards hazards      193024 Nov 15 09:32 maxmulti_2.fq
-rw-r--r-- 1 hazards hazards  4876081081 Nov 15 15:07 realigned_reads.bam
-rw-r--r-- 1 hazards hazards  3120178382 Nov 15 23:38 gene_paired_reads.bam
-rw-r--r-- 1 hazards hazards  2072258813 Nov 15 23:38 genome_paired_reads.bam
-rw-r--r-- 1 hazards hazards   635025773 Nov 15 23:38 unmapped_reads.bam
-rw-r--r-- 1 hazards hazards      934624 Nov 15 23:38 complex_reads.bam
-rw-r--r-- 1 hazards hazards   295327399 Nov 15 23:45 discordant_reads.bedpe
-rw-r--r-- 1 hazards hazards   295327399 Nov 15 23:45 discordant_reads.srt.bedpe
-rw-r--r-- 1 hazards hazards 49046649769 Nov 16 17:34 encompassing_chimeras.txt
-rw-r--r-- 1 hazards hazards 17552465613 Nov 16 19:26 
encompassing_chimeras.filtered.txt
-rw-r--r-- 1 hazards hazards 17552465613 Nov 16 19:47 
encompassing_chimeras.breakpoint_sorted.txt
-rw-r--r-- 1 hazards hazards   603862877 Nov 16 20:22 breakpoints.fa
-rw-r--r-- 1 hazards hazards   652818329 Nov 16 20:22 breakpoints.txt
-rw-r--r-- 1 hazards hazards   140401371 Nov 16 20:23 breakpoints.4.ebwt
-rw-r--r-- 1 hazards hazards    25023302 Nov 16 20:23 breakpoints.3.ebwt
-rw-r--r-- 1 hazards hazards   234699054 Nov 16 20:39 breakpoints.1.ebwt
-rw-r--r-- 1 hazards hazards    70200692 Nov 16 20:39 breakpoints.2.ebwt
-rw-r--r-- 1 hazards hazards   234699054 Nov 16 20:53 breakpoints.rev.1.ebwt
-rw-r--r-- 1 hazards hazards    70200692 Nov 16 20:53 breakpoints.rev.2.ebwt
-rw-r--r-- 1 hazards hazards     3878595 Nov 16 21:27 encomp_spanning_reads.fq
-rw-r--r-- 1 hazards hazards   212324326 Nov 16 21:32 
unaligned_spanning_reads.fq
-rw-r--r-- 1 hazards hazards   649644709 Nov 16 21:38 singlemap_reads.srt.bam
-rw-r--r-- 1 hazards hazards     5238976 Nov 16 21:38 
singlemap_reads.srt.bam.bai
-rw-r--r-- 1 hazards hazards           0 Nov 16 21:38 tmp_singlemap_seqs.txt

I run the program on a local research cluster whose nodes have either 16Mb RAM 
or 24Gb RAM. Restricting to the 24Gb RAM machines has no effect, i.e. the program 
still runs and gets killed. I would assume that there is a size limitation, but 
some files as large as 5-7 Gb have completed while others fail as 
described. Re-running the jobs that fail repeats the failure.

Original issue reported on code.google.com by [email protected] on 18 Nov 2013 at 9:59

Improvement when joining segments

Assume four segments: 1, 2, 3, and 4.  If segment 2 is unmapped due to multimapping, the 
joiner will output split reads for segment 1, segment 2, and segments 3-4.  If 
we check segment 2 for the multimapping (XM) tag in bowtie and allow the 
segment joining logic to skip segments, we could potentially recover segment 2 
(infer its position based on the positions of the other segments) and not need 
to report a split.
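
A sketch of the inference step in Python, assuming fixed-length segments on the forward strand and a per-segment flag derived from bowtie's XM tag. The function name and data layout are hypothetical.

```python
def infer_skipped_segments(positions, seg_len, multimapped):
    """Fill in positions of multimapped segments from their neighbours.

    positions:   reference start of each segment, or None if unmapped
    seg_len:     segment length (fixed-length segments assumed)
    multimapped: per-segment flag, True if the segment was suppressed
                 for multimapping
    """
    filled = list(positions)
    for i in range(1, len(filled) - 1):
        if filled[i] is None and multimapped[i]:
            prev, nxt = filled[i - 1], filled[i + 1]
            # Only infer when the flanking segments are exactly two
            # segment lengths apart, i.e. consistent with one unsplit read.
            if prev is not None and nxt is not None and nxt - prev == 2 * seg_len:
                filled[i] = prev + seg_len
    return filled


print(infer_skipped_segments([100, None, 150, 175], 25,
                             [False, True, False, False]))
```

If the flanking segments are not consistent with a single contiguous alignment, the segment is left unmapped and the read would still be reported as split.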

Original issue reported on code.google.com by [email protected] on 15 Jan 2011 at 8:21

new feature? possibility of running the filter chimeras as a stand alone step?

Hi, this is not really an issue; it is rather a feature request.

I am wondering if it is possible to run the filtering (step 12) as a standalone 
step. We would like to play with the filtering part of the parameters a bit but 
it seems like a waste of computation if we just rerun the whole pipeline.

Thanks,
Yupu

Original issue reported on code.google.com by [email protected] on 20 Sep 2012 at 6:55
