Comments (9)
Looking at my older runs of WASP, it seems that reads were output multiple times in the fastq files in the past as well. I guess the bug here may be that this read pair doesn't make it through the filtering step even though it aligns to the same spot. I've added files to the zip that show this.
from wasp.
Hi Chris, thanks for the bug report, looking into this now...
from wasp.
I think I found the problem and have committed a fix here:
b1e8219
Thanks again for the bug report and let us know if you have any further issues.
from wasp.
Thanks Graham. It seems the results of the mapping pipeline (e.g. whether a read pair is kept or not) weren't affected by this bug, right?
from wasp.
I am not 100% certain, but unfortunately I think that it could have
affected which paired end reads are filtered. I think that some PE reads
may have dropped out of the pipeline even though they could have been kept.
The other outstanding issue is that Step #5 (rmdup) does not currently
support PE reads. I am working to fix this now.
On Thu, Jun 25, 2015 at 12:50 PM, Christopher DeBoever wrote:
Thanks Graham. It seems the results (e.g. whether a read pair is kept or
not) from the mapping pipeline weren't affected by this bug right?
Reply to this email directly or view it on GitHub
#18 (comment).
from wasp.
It turns out the 'fix' I made was not correct and has created some issues with the PE reads. I have reverted to the old version and I am working on fixing the original issue (which was minor by comparison).
from wasp.
Sounds good, I was actually looking at the code last week although so far
I've mostly just added comments. I'm hoping to start refactoring a bit
tomorrow and adding in some unit tests.
On Mon, Jul 27, 2015 at 2:39 PM, Graham McVicker wrote:
It turns out the 'fix' I made was not correct and has created some issues
with the PE reads. I have reverted to the old version and I am working on
fixing the original issue (which was minor by comparison).
from wasp.
I've been able to clean up the code a bit and add a lot of documentation and some tests (0170a01). I actually looked into this bug and it turns out it's not a bug. The two reads both overlap the SNP so the three possible read pairs are output. I added a test for the data I provided initially.
I can make a pull request, but I was also wondering if we could add an option to specify that the input bam file is already coordinate sorted? I can add that in before I make the pull request.
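To illustrate why three read pairs come out when both mates overlap the SNP, here is a minimal sketch. It assumes each mate has one original and one allele-swapped version, and that every combination of the two mates is written out except the unchanged original pair. The function name `allele_swapped_pairs` and the sequences are hypothetical and are not taken from the WASP code.

```python
from itertools import product


def allele_swapped_pairs(read1_versions, read2_versions):
    """Return every mate combination except the original/original pair.

    Each argument lists the sequences for one mate, with the original
    sequence first and the allele-swapped sequence(s) after it.
    This is an illustrative assumption, not WASP's actual implementation.
    """
    original = (read1_versions[0], read2_versions[0])
    return [pair for pair in product(read1_versions, read2_versions)
            if pair != original]


# Hypothetical mates that both cover the SNP.
read1 = ["ACGTA", "ACGTG"]  # original, then allele-swapped
read2 = ["TTAGC", "TTGGC"]  # original, then allele-swapped

pairs = allele_swapped_pairs(read1, read2)
for r1, r2 in pairs:
    print(r1, r2)
```

Two versions per mate give four combinations, and dropping the original pair leaves three, which would match the three read pairs seen in the fastq files.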
from wasp.
Hi Chris,
Those changes and tests look great. You are welcome to add an option to
indicate that the input bam is already sorted. Once you are ready to make a
pull request we can accept it.
Thanks a lot for your help!
Graham
On Mon, Aug 3, 2015 at 3:17 PM, Christopher DeBoever wrote:
I've been able to clean up the code a bit and add a lot of documentation
and some tests (0170a01).
I actually looked into this bug and it turns out it's not a bug. The two
reads both overlap the SNP so the three possible read pairs are output. I
added a test for the data I provided initially.
I can make a pull request, but I was also wondering if we could add an
option to specify that the input bam file is already coordinate sorted? I
can add that in before I make the pull request.
from wasp.