Comments (5)
Hey Sam,
Good question! Yes, Snakemake works out which rules need to run from the target rule and from which input/output files are present or absent. It will not re-run the jobs for already assembled samples unless you delete/move the output file or use the Snakemake `-R` flag.
For example, let's say the target rule is assembly, so you would run something like `bash metaGEM.sh -t megahit -j 43 -c 32`. Let's look at the `megahit` rule in the Snakefile:
Lines 273 to 278 in d81186a
First of all, Snakemake will make sure that the inputs for this rule are present. So if the samples have not been quality filtered, then Snakemake would submit 43 qfilter jobs + 43 assembly jobs.
Now let's say that the samples have all been quality filtered in a previous run. Snakemake will then check whether the output of the target rule is present, i.e. it will search the `assemblies/` subfolders for files called `contigs.fasta.gz`. In your scenario you said you had 10 assemblies completed, so if they are present in the specified location, Snakemake would only submit 33 assembly jobs.
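To make that presence check concrete, here is a rough Python sketch of the idea. It is not Snakemake's actual implementation (which also looks at things such as file modification times); it only counts the samples whose expected output file is missing. The `assemblies/<sample>/contigs.fasta.gz` layout follows the description above, while the function name `pending_assemblies` and the sample IDs are made up for the example.

```python
from pathlib import Path
import tempfile


def pending_assemblies(samples, root="assemblies"):
    """Return the samples whose expected assembly output file is missing."""
    root = Path(root)
    return [s for s in samples if not (root / s / "contigs.fasta.gz").is_file()]


# Demo in a throwaway directory: 43 hypothetical samples, 10 already assembled.
with tempfile.TemporaryDirectory() as tmp:
    samples = [f"sample{i:02d}" for i in range(1, 44)]
    for done in samples[:10]:
        out = Path(tmp) / "assemblies" / done / "contigs.fasta.gz"
        out.parent.mkdir(parents=True)
        out.touch()

    todo = pending_assemblies(samples, root=Path(tmp) / "assemblies")
    print(len(todo))  # 33 assembly jobs would still be needed
```

Running something like this against your real `assemblies/` folder before the dry-run gives a quick independent count to compare with what Snakemake proposes to submit.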
Some useful troubleshooting tips:
- Double check what jobs will be submitted by running the `metaGEM.sh` script, as it will always dry-run jobs before asking you if they look good for submission.
- Alternatively, check this manually by running `snakemake all -n` in your `metaGEM` folder.
- Sanity check by using `touch` to create dummy output files, then dry-run to see if you tricked Snakemake into thinking that the files have already been generated. Remember to delete the dummy files afterwards!
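If you prefer to script the dummy-file sanity check, the sketch below shows one way to do it in Python. The helper names `touch_dummies` and `remove_dummies` and the sample names are hypothetical; the point is to keep track of which placeholders were created so that only those get deleted afterwards and real assemblies are never touched.

```python
from pathlib import Path
import tempfile


def touch_dummies(samples, root="assemblies", name="contigs.fasta.gz"):
    """Create empty placeholder outputs; return only the files this call created."""
    created = []
    for sample in samples:
        out = Path(root) / sample / name
        if not out.exists():  # leave existing (real) results alone
            out.parent.mkdir(parents=True, exist_ok=True)
            out.touch()
            created.append(out)
    return created


def remove_dummies(created):
    """Delete the placeholder files again (their now-empty folders are left behind)."""
    for path in created:
        path.unlink()


# Demo in a throwaway directory with one real result and one missing sample.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "assemblies"
    real = root / "sampleA" / "contigs.fasta.gz"
    real.parent.mkdir(parents=True)
    real.write_bytes(b"real assembly")

    created = touch_dummies(["sampleA", "sampleB"], root=root)
    n_created = len(created)

    # At this point you would dry-run (e.g. `snakemake all -n`) and inspect the job list.

    remove_dummies(created)
    real_survived = real.read_bytes() == b"real assembly"
    dummy_removed = not created[0].exists()
```

Only the placeholder for `sampleB` is created and later removed; the existing file for `sampleA` is left as it was.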
Best,
Francisco
from metagem.
Hi Francisco,
Thank you for your answer. Copying the `assemblies/` subfolders did not work. metaGEM recognized the samples that were assembled on the same cluster, but not the others that were assembled on the other cluster and copied over. Maybe I should also copy files for the intermediate results folder?
Also, it seems that when metaGEM then starts a task for a sample that was previously run on a different machine, the result folder that is already present (the one I copied over) for that sample gets deleted.
Best,
Sam
from metagem.
Hey Sam,
Did `metaGEM` try to submit quality filtering jobs for the samples whose assemblies got deleted? Did you have all the `qfiltered/` result files (including the ones for the samples that were assembled on your local machine) on your new cluster? Similarly, does your `dataset/` folder contain all your samples? You need to have these files present, otherwise Snakemake will try to re-create them before running your target rule.
from metagem.
Hi Francisco,
No, `metaGEM` did not try to submit QC jobs for those samples. All the samples were `qfiltered` on both machines, and the `dataset` folder contained all the samples. As a workaround, I temporarily moved the samples that were already assembled out of the `dataset` folder and restarted.
from metagem.
I see, sorry to hear you were having trouble with this, but glad that you figured out the workaround!
from metagem.