Comments (6)
Please search for the error on the internet before posting an issue.
https://www.biostars.org/p/186585/
Do ls -l on the genome data directory for the bwa index and .fa file. Also, please post a full log.
from chipseq_pipeline.
I am attaching the full nohup output and error files (I was not aware of the attach option; I have just started using GitHub). This time I ran with 1 replicate and 1 control replicate and got the same error.
Regarding the biostars thread, I had gone through it and some other biostars posts for a few days before I posted the issue. Accordingly, I had checked my fastq files for pairing of reads. This is why I mentioned in my original post that
"I checked that the fastq files (i.e. for both replicate and control replicate, i checked the pairing of reads, between read1 and read2, it looks ok)."
To be specific, I checked that all sequences are of the same length (100) and that all read1 and read2 IDs are paired correctly. I may still be missing something in that biostars thread; please point it out (I am fairly new to sequencing analysis and terminology).
Regarding the genome-data folder, I had come across that suggestion somewhere and checked that it has all the files (".fa", ".fai") and that bwa_index has the amb, ann, bwt, pac and sa files. Pasting the "ls -l" output here:
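The pairing check described above can be scripted. The following is only a minimal sketch of such a check, not something taken from chipseq_pipeline: the function names (`check_pairing`, `read_records`, `normalize_id`) are hypothetical, it assumes plain 4-line FASTQ records (no wrapped sequences), and it assumes mates share the same ID once a trailing `/1` or `/2` and anything after the first space are dropped.

```python
import gzip
from itertools import zip_longest


def open_fastq(path):
    """Open a FASTQ file as text, transparently handling .gz files."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def normalize_id(header):
    """Reduce a FASTQ header line to a mate-independent read ID (assumed convention)."""
    name = header.strip().split()[0]
    if name.startswith("@"):
        name = name[1:]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def read_records(handle):
    """Yield (read_id, sequence) for each 4-line FASTQ record."""
    while True:
        header = handle.readline()
        if not header:
            return
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line
        yield normalize_id(header), sequence


def check_pairing(r1_path, r2_path):
    """Walk both files in lockstep and summarize pairing problems."""
    summary = {"pairs": 0, "id_mismatches": 0, "unpaired": 0, "lengths": set()}
    with open_fastq(r1_path) as h1, open_fastq(r2_path) as h2:
        for rec1, rec2 in zip_longest(read_records(h1), read_records(h2)):
            if rec1 is None or rec2 is None:
                summary["unpaired"] += 1  # one file has more reads than the other
                continue
            summary["pairs"] += 1
            if rec1[0] != rec2[0]:
                summary["id_mismatches"] += 1
            summary["lengths"].update((len(rec1[1]), len(rec2[1])))
    return summary
```

For correctly paired 100 bp data, one would expect `id_mismatches` and `unpaired` to be 0 and `lengths` to be `{100}`. A clean result here only rules out mis-ordered or truncated files; it says nothing about memory, disk space or wall time.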
sb1@sb-hpz800:bwa_index $ ls -l
total 5290524
lrwxrwxrwx 1 sb1 sb1 15 Apr 10 10:23 male.hg19.fa -> ../male.hg19.fa
-rw-rw-r-- 1 sb1 sb1 6548 Apr 10 11:10 male.hg19.fa.amb
-rw-rw-r-- 1 sb1 sb1 944 Apr 10 11:10 male.hg19.fa.ann
-rw-rw-r-- 1 sb1 sb1 3095694072 Apr 10 11:10 male.hg19.fa.bwt
-rw-rw-r-- 1 sb1 sb1 773923497 Apr 10 11:10 male.hg19.fa.pac
-rw-rw-r-- 1 sb1 sb1 1547847040 Apr 10 11:24 male.hg19.fa.sa
sb1@sb-hpz800:hg19 $ ls -l
total 3997972
drwxr-xr-x 2 sb1 root 4096 Apr 9 17:04 ataqc
drwxrwxr-x 2 sb1 sb1 4096 Apr 10 11:24 bwa_index
-rw-rw-r-- 1 sb1 sb1 376 Apr 10 10:23 hg19.chrom.sizes
-rw-rw-r-- 1 sb1 sb1 3157608038 Apr 10 10:19 male.hg19.fa
-rw-rw-r-- 1 sb1 sb1 788 Apr 10 10:23 male.hg19.fa.fai
-rw-r--r-- 1 sb1 root 936272240 Jan 28 2010 male.hg19.fa.gz
drwxr-xr-x 2 sb1 root 4096 Apr 10 10:23 seq
-rw-r--r-- 1 sb1 root 4731 May 5 2011 wgEncodeDacMapabilityConsensusExcludable.bed.gz
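The same inspection of the index directory can be automated. This is a rough sketch under stated assumptions: the helper name `bwa_index_problems` is hypothetical, and it only checks that the five files seen in the listing above (amb, ann, bwt, pac, sa) exist next to the index prefix and are non-empty, plus whether the prefix itself is a dangling symlink. It does not validate the contents of the index.

```python
from pathlib import Path

# Suffixes of the index files shown in the bwa_index listing above.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def bwa_index_problems(index_prefix):
    """Return a list of human-readable problems found for a bwa index prefix."""
    prefix = Path(index_prefix)
    problems = []
    for suffix in BWA_INDEX_SUFFIXES:
        candidate = Path(str(prefix) + suffix)
        if not candidate.is_file():
            problems.append(f"{candidate}: missing")
        elif candidate.stat().st_size == 0:
            problems.append(f"{candidate}: empty")
    # A symlinked .fa (as in the listing) is fine as long as its target exists.
    if prefix.is_symlink() and not prefix.exists():
        problems.append(f"{prefix}: dangling symlink")
    return problems


if __name__ == "__main__":
    # Hypothetical relative path, mirroring the directory layout in the listing.
    for problem in bwa_index_problems("bwa_index/male.hg19.fa"):
        print(problem)
```

An empty list means the files are present, which matches what the listing already suggests for this setup.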
Oh, and at the end of the run I am getting a valid T0L1R1.PE2SE.sam.gz file. It looks complete, but no sam file or sai file is seen for the second read (R2). For rep1, I get all files including bam, tagAlign etc. for read 1, but nothing for read 2.
Please suggest how I can go about debugging this. Thanks again for patiently reading all this.
from chipseq_pipeline.
Did you have enough disk space on your temporary directories and working dir?
$ echo $TMP
$ echo $TMPDIR
$ df -h $TMP
$ df -h $TMPDIR
How much memory do you have on your system? If it's a cluster and you are submitting jobs to a cluster engine like SGE or SLURM, then use higher memory settings with -mem_bwa 30G. Let's see if this helps.
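When $TMP or $TMPDIR is unset, the df commands above have nothing to inspect. As a rough sketch (the function name `free_space_report` is hypothetical, and whether the pipeline actually writes its temporary files to any of these locations is an assumption), the check can fall back to the directory that Python's `tempfile` module would pick and to the current working directory:

```python
import os
import shutil
import tempfile


def free_space_report(extra_paths=()):
    """Return (label, path, free_GiB) for temp and working directories.

    free_GiB is None when the path does not exist.
    """
    paths = []
    for var in ("TMP", "TMPDIR"):
        value = os.environ.get(var)
        if value:  # skip variables that are unset or empty
            paths.append((f"${var}", value))
    paths.append(("tempfile.gettempdir()", tempfile.gettempdir()))
    paths.append(("working dir", os.getcwd()))
    for path in extra_paths:
        paths.append(("extra", path))

    report = []
    for label, path in paths:
        if os.path.isdir(path):
            free_gib = shutil.disk_usage(path).free / 1024**3
        else:
            free_gib = None
        report.append((label, path, free_gib))
    return report


if __name__ == "__main__":
    for label, path, free_gib in free_space_report():
        shown = "missing" if free_gib is None else f"{free_gib:.1f} GiB free"
        print(f"{label:24s} {path}: {shown}")
```

Run it from the pipeline's working directory so that the "working dir" entry reflects the file system where the output is written.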
from chipseq_pipeline.
Both $TMP and $TMPDIR are null (undefined) in my bash shell. But since the other replicates have run successfully, I guess that is not a problem. I think there is no storage issue.
[sb1@bladeamd-4 ssg-chipseq]$ df -h /data
Filesystem Size Used Avail Use% Mounted on
/dev/data 1.1P 453T 667T 41% /data
[sb1@bladeamd-4 ssg-chipseq]$ df -h /tmp
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos_bladeamd--4-root 100G 6.8G 94G 7% /
I am using a single blade server, sometimes also a workstation (same error in both). On the server
MemTotal: 65774440 kB
MemFree: 48467220 kB
MemAvailable: 52434928 kB
There may be 1 or 2 other users running jobs occasionally (not in my control). On the workstation, I am the only user.
MemTotal: 24671836 kB
MemFree: 21266696 kB
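To compare these figures with a setting such as mem_bwa 30G, it helps to convert the kB values to GiB. The sketch below does only that arithmetic; the helper names are hypothetical, and treating the pipeline's "30G" as 30 GiB is an assumption.

```python
import re

KIB_PER_GIB = 1024 * 1024


def parse_meminfo(text):
    """Parse 'Name: <number> kB' lines (as in /proc/meminfo) into a dict of kB values."""
    values = {}
    for match in re.finditer(r"^(\w+):\s+(\d+)\s+kB\s*$", text, flags=re.MULTILINE):
        values[match.group(1)] = int(match.group(2))
    return values


def kib_to_gib(kib):
    """Convert kB as reported by /proc/meminfo (1024-byte units) to GiB."""
    return kib / KIB_PER_GIB


def fits_in_memory(requested_gib, meminfo, field="MemTotal"):
    """True if the requested amount does not exceed the chosen meminfo field."""
    return requested_gib <= kib_to_gib(meminfo[field])


# Values copied from the server and workstation figures quoted above.
server = parse_meminfo(
    "MemTotal: 65774440 kB\nMemFree: 48467220 kB\nMemAvailable: 52434928 kB\n"
)
workstation = parse_meminfo("MemTotal: 24671836 kB\nMemFree: 21266696 kB\n")

if __name__ == "__main__":
    for name, info in (("server", server), ("workstation", workstation)):
        total = kib_to_gib(info["MemTotal"])
        print(f"{name}: {total:.1f} GiB total, 30G fits: {fits_in_memory(30, info)}")
```

With these numbers the server has about 62.7 GiB in total and the workstation about 23.5 GiB, so a 30G request exceeds the workstation's physical memory but not the server's.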
As you suggested, I tried mem_bwa 30G on the server and got the same error. What should the number of threads ("nth") be? This is the lscpu output on the server and workstation.
Server
CPU(s): 6
Thread(s) per core: 1
Core(s) per socket: 6
Workstation
CPU(s): 12
On-line CPU(s) list: 0-11
Thread(s) per core: 2
Core(s) per socket: 6
Another suspicion I have is that, for this replicate, the wall time is somehow being exceeded after the .sai for the first read is created. I am trying this now with nth=8 on the server:
"wt_bwa" : "71h",
"mem_bwa" : "30G",
Is there any way to skip the indexing already done for read 1 and proceed to read 2?
from chipseq_pipeline.
Sorry, it seems there are two unrelated issues here. I discovered that although my other runs completed without error, they only gave files for R1 (read 1). This one had probably failed because of a memory, storage or wall time issue. I am running with higher memory etc. as you suggested, and it has gone closer to completion (not yet completed), so I will close this issue. The primary issue now is why it is only processing the read 1 files separately, so I will probably post that issue under a different heading. Thanks.
from chipseq_pipeline.
FYI, the read1 (R1) fastq is processed separately (as single-ended) for cross-correlation analysis only, and it's not used for other downstream analyses like peak calling and IDR.
from chipseq_pipeline.