quackers's Introduction

quackers

First release: June 13, 2024

Parkinson Lab's in-house MG pipeline.

Setup: quackers needs two reference databases, which you have to download yourself.

CheckM: follow https://github.com/Ecogenomics/CheckM/wiki/Installation#how-to-install-checkm and get the data from https://data.ace.uq.edu.au/public/CheckM_databases/

GTDB-Tk: follow https://ecogenomics.github.io/GTDBTk/installing/index.html#installing-gtdbtk-reference-data WARNING: the download is 84GB, and you need to unzip it yourself.

Start from the sample config and set the 2 database entries to the paths where the databases live. Special note for GTDB-Tk: use release220, and have the config point to it.

Your host FASTA files need to be indexed by BWA before use; otherwise quackers will complain.
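The BWA requirement can be checked before launching a run. The following is a minimal sketch, not part of quackers: the function name `missing_bwa_index_files` is hypothetical, and it assumes the index was built with a plain `bwa index <fasta>` call, which writes its files next to the FASTA using the FASTA name as prefix.

```python
from pathlib import Path

# Suffixes that a default `bwa index <fasta>` run writes next to the FASTA.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(fasta_path):
    """Return the expected BWA index files that do not exist for this FASTA.

    Hypothetical helper: assumes the index prefix is the FASTA path itself.
    """
    fasta = str(fasta_path)
    expected = [Path(fasta + suffix) for suffix in BWA_INDEX_SUFFIXES]
    return [path for path in expected if not path.is_file()]
```

An empty list means all five index files are present; otherwise running `bwa index` on the host FASTA should create the missing ones.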

quackers's People

Contributors

billytaj


quackers's Issues

Paired host-filtered fastq files are uneven

After running step 0 and step 1 (0_clean_reads and 1_host_filter), the paired fastq files in the 1_host_filter export folder are not equal in file size. The pipeline continues regardless, and the mismatch causes failures later in the pipe.

This does not seem to be a memory issue, as files are still generated later in the pipe, and other steps (2, 3b, etc.) still run and generate bins. I am not sure how to proceed.
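File size alone is a weak signal, since the two mate files can legitimately differ in bytes. What Bowtie2 rejects in the log below is a difference in read count. A minimal sketch for comparing the counts directly, assuming standard four-line FASTQ records; the function names are hypothetical and not part of quackers:

```python
import gzip


def count_fastq_records(path):
    """Count FASTQ records, assuming exactly four lines per record."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        line_count = sum(1 for _ in handle)
    if line_count % 4 != 0:
        # A partial record usually means the file was truncated while writing.
        raise ValueError(f"{path}: {line_count} lines is not a multiple of 4")
    return line_count // 4


def check_pairs(pair_1, pair_2):
    """Return the shared read count, or raise if the two mate files disagree."""
    n1 = count_fastq_records(pair_1)
    n2 = count_fastq_records(pair_2)
    if n1 != n2:
        raise ValueError(f"read counts differ: {pair_1} has {n1}, {pair_2} has {n2}")
    return n1
```

Calling `check_pairs` on `pair_1.fastq` and `pair_2.fastq` in the export folder right after step 1 would surface the mismatch before assembly starts. It only compares counts; it does not verify that read names match between mates.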

`/export/pair_2.fastq
[main] Real time: 66797.744 sec; CPU: 66866.546 sec
/quackers_tools/SPAdes/share/spades/spades_pipeline/support.py:488: SyntaxWarning: invalid escape sequence '\d'
return [atoi(c) for c in re.split("(\d+)", text)]
Process Process-5:
Traceback (most recent call last):
File "/opt/conda/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/opt/conda/lib/python3.12/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/scratch/j/jparkin/neelnjay/toolbox/quackers/quackers-1.0.1/MetaPro_utilities.py", line 294, in create_and_launch_v2
sp.check_output(["sh", job_path])#, stderr = sp.STDOUT)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.12/subprocess.py", line 466, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['sh', '/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/assemble_p.sh']' returned non-zero exit status 2.
.
2024-08-05 22:18:43 - MEGAHIT v1.2.9
2024-08-05 22:18:43 - Using megahit_core with POPCNT and BMI2 support
2024-08-05 22:18:43 - Convert reads to binary library
2024-08-05 22:18:58 - b'INFO sequence/io/sequence_lib.cpp : 77 - Lib 0 (/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/1_host_filter/export/pair_1.fastq,/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/1_host_filter/export/pair_2.fastq): pe, 16248758 reads, 151 max length'
2024-08-05 22:18:58 - b'INFO utils/utils.h : 152 - Real: 15.4370\tuser: 14.4036\tsys: 2.9472\tmaxrss: 248644'
2024-08-05 22:18:58 - k-max reset to: 141
2024-08-05 22:18:58 - Start assembly. Number of CPU threads 80
2024-08-05 22:18:58 - k list: 21,29,39,59,79,99,119,141
2024-08-05 22:18:58 - Memory used: 182217052569
2024-08-05 22:18:58 - Extract solid (k+1)-mers for k = 21
2024-08-05 22:19:19 - Build graph for k = 21
2024-08-05 22:22:57 - Assemble contigs from SdBG for k = 21
2024-08-05 22:29:25 - Local assembly for k = 21
2024-08-05 22:30:15 - Extract iterative edges from k = 21 to 29
2024-08-05 22:31:24 - Build graph for k = 29
2024-08-05 22:31:42 - Assemble contigs from SdBG for k = 29
2024-08-05 22:33:39 - Local assembly for k = 29
2024-08-05 22:34:29 - Extract iterative edges from k = 29 to 39
2024-08-05 22:34:38 - Build graph for k = 39
2024-08-05 22:34:46 - Assemble contigs from SdBG for k = 39
2024-08-05 22:36:08 - Local assembly for k = 39
2024-08-05 22:37:14 - Extract iterative edges from k = 39 to 59
2024-08-05 22:37:18 - Build graph for k = 59
2024-08-05 22:37:24 - Assemble contigs from SdBG for k = 59
2024-08-05 22:38:10 - Local assembly for k = 59
2024-08-05 22:39:46 - Extract iterative edges from k = 59 to 79
2024-08-05 22:39:49 - Build graph for k = 79
2024-08-05 22:39:55 - Assemble contigs from SdBG for k = 79
2024-08-05 22:40:29 - Local assembly for k = 79
2024-08-05 22:42:15 - Extract iterative edges from k = 79 to 99
2024-08-05 22:42:18 - Build graph for k = 99
2024-08-05 22:42:21 - Assemble contigs from SdBG for k = 99
2024-08-05 22:42:49 - Local assembly for k = 99
2024-08-05 22:44:21 - Extract iterative edges from k = 99 to 119
2024-08-05 22:44:24 - Build graph for k = 119
2024-08-05 22:44:27 - Assemble contigs from SdBG for k = 119
2024-08-05 22:44:47 - Local assembly for k = 119
2024-08-05 22:46:23 - Extract iterative edges from k = 119 to 141
2024-08-05 22:46:25 - Build graph for k = 141
2024-08-05 22:46:31 - Assemble contigs from SdBG for k = 141
2024-08-05 22:46:51 - Merging to output final contigs
2024-08-05 22:46:52 - 253926 contigs, total 131157275 bp, min 200 bp, max 703751 bp, avg 516 bp, N50 493 bp
2024-08-05 22:46:52 - ALL DONE. Time elapsed: 1689.330930 seconds
Building a SMALL index
Renaming /scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_idx/contigs.3.bt2.tmp to /scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_idx/contigs.3.bt2
Renaming /scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_idx/contigs.4.bt2.tmp to /scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_idx/contigs.4.bt2
Renaming /scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_idx/contigs.1.bt2.tmp to /scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_idx/contigs.1.bt2
Renaming /scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_idx/contigs.2.bt2.tmp to /scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_idx/contigs.2.bt2
Renaming /scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_idx/contigs.rev.1.bt2.tmp to /scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_idx/contigs.rev.1.bt2
Renaming /scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_idx/contigs.rev.2.bt2.tmp to /scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_idx/contigs.rev.2.bt2
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LC_CTYPE = "C.UTF-8",
LANG = "en_CA.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Error, fewer reads in file specified with -1 than in file specified with -2
terminate called after throwing an instance of 'int'
Aborted (core dumped)
(ERR): bowtie2-align exited with value 134
Process Process-8:
Traceback (most recent call last):
File "/opt/conda/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/opt/conda/lib/python3.12/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/scratch/j/jparkin/neelnjay/toolbox/quackers/quackers-1.0.1/MetaPro_utilities.py", line 294, in create_and_launch_v2
sp.check_output(["sh", job_path])#, stderr = sp.STDOUT)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.12/subprocess.py", line 466, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['sh', '/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/clean_reads.sh']' returned non-zero exit status 134.
[E::aux_parse] Incomplete aux field
[W::sam_read1_sam] Parse error at line 16475769
samtools view: error reading file "/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/raw_bt2_out.sam"
Traceback (most recent call last):
File "/quackers_pipe/scripts/contig_reconcile.py", line 184, in <module>
sam_hits_dict, unique_hosts = sort_samfiles(sam_score_file)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/quackers_pipe/scripts/contig_reconcile.py", line 87, in sort_samfiles
with open(sam_path, "r") as sam_in:
^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/score_bt2.out'
[E::hts_open_format] Failed to open file "/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_sort.bam" : No such file or directory
ERROR: fail to open index BAM file '/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/7/2_contig_assemble/data/bt2_sort.bam'
Traceback (most recent call last):
File "/quackers_pipe/modded_scripts/concoct_coverage_table.py", line 100, in <module>
generate_input_table(args.bedfile, args.bamfiles, samplenames=samplenames)
File "/quackers_pipe/modded_scripts/concoct_coverage_table.py", line 46, in generate_input_table
raise Exception('Error with running samtools bedcov')
Exception: Error with running samtools bedcov
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 573, in _build_master
ws.require(__requires__)
File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 891, in require
needed = self.resolve(parse_requirements(requirements))
File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 782, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (numpy 2.0.0 (/usr/local/lib/python3.10/dist-packages), Requirement.parse('numpy<2,>=1.22.4'), {'pandas'})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/bin/concoct", line 4, in <module>
__import__('pkg_resources').run_script('concoct==1.1.0', 'concoct')
File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 3267, in <module>
def _initialize_master_working_set():
File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 3241, in _call_aside
f(*args, **kwargs)
File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 3279, in _initialize_master_working_set
working_set = WorkingSet._build_master()
File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 575, in _build_master
return cls._build_from_requirements(__requires__)
File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 588, in _build_from_requirements
dists = ws.resolve(reqs, Environment())
File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 782, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (numpy 2.0.0 (/usr/local/lib/python3.10/dist-packages), Requirement.parse('numpy<2,>=1.22.4'), {'pandas'})
Matplotlib created a temporary cache directory at /tmp/matplotlib-wig60wm8 because the default path (/home/j/jparkin/neelnjay/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing. `

2_contig_assembly Error

Running quackers produces errors in step 2 (2_contig_assembly). The clean_reads_reconcile.sh and clean_reads.sh scripts report 'perl error' and 'float split' errors. This may also be related to SAM reorganization or sorting, as the error file says 'sam_clean.py' is being run. Part of the error output is attached below.

The cause of this issue is difficult to trace, as the generated txt files are not a robust checkpoint.

A useful feature would be a check before moving to the next step that ends the run if the expected files were not generated. These checkpoints could all be written to one file rather than to separate checkpoint files.
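The suggested check could look roughly like the sketch below. This is an illustration under assumptions, not quackers code: `record_checkpoint`, `run_step`, the tab-separated log layout, and the notion of a per-step list of expected outputs are all hypothetical.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def record_checkpoint(step_name, returncode, expected_outputs, checkpoint_log):
    """Append one status line for a step and report whether it succeeded.

    A step counts as successful only if its script exited with 0 and every
    expected output file exists and is non-empty.
    """
    missing = [
        str(path)
        for path in expected_outputs
        if not Path(path).is_file() or Path(path).stat().st_size == 0
    ]
    ok = returncode == 0 and not missing
    status = "OK" if ok else "FAILED"
    missing_field = ",".join(missing) if missing else "-"
    # All steps append to the same file, so one log holds the whole run.
    with open(checkpoint_log, "a") as log:
        log.write(
            f"{datetime.now().isoformat()}\t{step_name}\t{status}"
            f"\texit={returncode}\tmissing={missing_field}\n"
        )
    return ok


def run_step(step_name, job_script, expected_outputs, checkpoint_log):
    """Run a step's shell script and end the run if the step did not succeed."""
    result = subprocess.run(["sh", str(job_script)])
    if not record_checkpoint(step_name, result.returncode, expected_outputs, checkpoint_log):
        raise RuntimeError(f"{step_name} failed; see {checkpoint_log}")
```

With a check like this after each step, a failed bowtie2 call would end the run at step 2 instead of surfacing later as missing `score_bt2.out` and `bt2_sort.bam` files.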

2_contig_assembly reconcile error:
[bam_sort_core] merging from 24 files and 1 in-memory blocks...
Traceback (most recent call last):
File "/quackers_pipe/scripts/contig_reconcile.py", line 184, in <module>
sam_hits_dict, unique_hosts = sort_samfiles(sam_score_file)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/quackers_pipe/scripts/contig_reconcile.py", line 101, in sort_samfiles
old_score = float(old_hit.split("|")[1])
^^^^^^^^^^^^^
AttributeError: 'float' object has no attribute 'split'
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 573, in _build_master
ws.require(__requires__)
File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 891, in require
needed = self.resolve(parse_requirements(requirements))
File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 782, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (numpy 2.0.0 (/usr/local/lib/python3.10/dist-packages), Requirement.parse('numpy<2,>=1.22.4'), {'pandas'})

2_contig_assembly clean reads error:
`/quackers_tools/SPAdes/share/spades/spades_pipeline/support.py:488: SyntaxWarning: invalid escape sequence '\d'
return [atoi(c) for c in re.split("(\d+)", text)]
Process Process-5:
Traceback (most recent call last):
File "/opt/conda/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/opt/conda/lib/python3.12/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/scratch/j/jparkin/neelnjay/toolbox/quackers/quackers-1.0.1/MetaPro_utilities.py", line 294, in create_and_launch_v2
sp.check_output(["sh", job_path])#, stderr = sp.STDOUT)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.12/subprocess.py", line 466, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['sh', '/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/2/2_contig_assemble/assemble_p.sh']' returned non-zero exit status 2.
...

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LC_CTYPE = "C.UTF-8",
LANG = "en_CA.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Error, fewer reads in file specified with -1 than in file specified with -2
terminate called after throwing an instance of 'int'
Aborted (core dumped)
(ERR): bowtie2-align exited with value 134
Process Process-8:
Traceback (most recent call last):
File "/opt/conda/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/opt/conda/lib/python3.12/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/scratch/j/jparkin/neelnjay/toolbox/quackers/quackers-1.0.1/MetaPro_utilities.py", line 294, in create_and_launch_v2
sp.check_output(["sh", job_path])#, stderr = sp.STDOUT)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.12/subprocess.py", line 466, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['sh', '/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/2/2_contig_assemble/clean_reads.sh']' returned non-zero exit status 134.
[E::sam_parse1] SEQ and QUAL are of different length
[W::sam_read1_sam] Parse error at line 9340514
samtools view: error reading file "/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/2/2_contig_assemble/data/raw_bt2_out.sam"
Traceback (most recent call last):
File "/quackers_pipe/scripts/contig_reconcile.py", line 184, in <module>
sam_hits_dict, unique_hosts = sort_samfiles(sam_score_file)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/quackers_pipe/scripts/contig_reconcile.py", line 87, in sort_samfiles
with open(sam_path, "r") as sam_in:
^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/2/2_contig_assemble/data/score_bt2.out'
[E::hts_open_format] Failed to open file "/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/2/2_contig_assemble/data/bt2_sort.bam" : No such file or directory
ERROR: fail to open index BAM file '/scratch/j/jparkin/neelnjay/hlhs/quackers_101_output/2/2_contig_assemble/data/bt2_sort.bam'
Traceback (most recent call last):
File "/quackers_pipe/modded_scripts/concoct_coverage_table.py", line 100, in <module>
generate_input_table(args.bedfile, args.bamfiles, samplenames=samplenames)
File "/quackers_pipe/modded_scripts/concoct_coverage_table.py", line 46, in generate_input_table
raise Exception('Error with running samtools bedcov')`
