Comments (9)

franciscozorrilla commented on September 3, 2024

Hi Karine,

You are getting that error because your machine does not recognize the Slurm command sbatch. The metaGEM.sh parser is designed mainly for interfacing with a high-performance computing cluster, since its main task is to submit jobs to the scheduler. The organizeData and downloadToy rules worked for you because they run locally (i.e. not as a submitted job; on a cluster they would run on the login node). The interface was designed with cluster usage in mind, given the large size of metagenomic datasets and the computational resources needed to assemble short reads from them. In theory you should be able to run metaGEM on your local machine by modifying the rule all input on line 22 of the Snakefile, but this is unlikely to be practical beyond small subsets or toy examples. Do you have access to your institution's high-performance computing cluster? I would definitely recommend running metaGEM on a cluster for a real metagenomic dataset.
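As a rough illustration of why the error appears, the following Python sketch checks whether a job submission command such as sbatch is on the PATH. The function name find_submit_command and the candidate list are hypothetical and are not part of metaGEM; this only mirrors the kind of lookup the shell performs before reporting that a command is not found.

```python
import shutil

def find_submit_command(candidates=("sbatch", "qsub")):
    """Return the first candidate command found on PATH, or None.

    Hypothetical helper for illustration; metaGEM.sh does not do this.
    """
    for cmd in candidates:
        # shutil.which returns the full path of the executable, or None
        if shutil.which(cmd) is not None:
            return cmd
    return None

if __name__ == "__main__":
    found = find_submit_command()
    if found is None:
        print("No scheduler command found; jobs cannot be submitted from this machine.")
    else:
        print(f"Found scheduler command: {found}")
```

On a typical laptop or desktop without Slurm installed, this would report that no scheduler command was found, which is the situation described above.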

For the sake of explanation, if you wanted to run quality filtering jobs on your local computer, you would replace line 22 of your Snakefile with the output of the qfilter rule in the Snakefile (line 150), i.e.:
expand(f'{config["path"]["root"]}/{config["folder"]["qfiltered"]}/{{IDs}}/{{IDs}}_R1.fastq.gz', IDs=IDs)
Then you can run snakemake all -n for a dry run to check that the wildcards are being expanded correctly based on your sample IDs. Editing the rule all input on line 22 is normally part of the job of the metaGEM.sh parser, but I essentially hardcoded which tasks are run locally and which are submitted to the cluster. If it is helpful for you, I can add an option to the metaGEM.sh parser to force it to run certain tasks locally.
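To make the wildcard expansion concrete, here is a minimal sketch in plain Python that mimics what the expand(...) call above produces, without requiring Snakemake. The config values, the sample IDs, and the helper expand_ids are made-up placeholders for illustration; the real values come from your config.yaml and your dataset.

```python
# Hypothetical config values and sample IDs, for illustration only.
config = {
    "path": {"root": "/home/user/metaGEM"},
    "folder": {"qfiltered": "qfiltered"},
}
IDs = ["sample1", "sample2"]

# The f-string fills in the config values immediately; the doubled braces
# {{IDs}} survive as a literal {IDs} placeholder for the later expansion.
pattern = f'{config["path"]["root"]}/{config["folder"]["qfiltered"]}/{{IDs}}/{{IDs}}_R1.fastq.gz'

def expand_ids(pattern, ids):
    """Mimic Snakemake's expand() for a single wildcard named IDs."""
    return [pattern.format(IDs=sample) for sample in ids]

targets = expand_ids(pattern, IDs)
for target in targets:
    print(target)
```

Each printed path is one file that rule all would request, so a dry run with snakemake all -n should list one qfilter job per sample ID.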

Thanks for your interest in metaGEM and please let me know if you have any follow up questions.
Best wishes,
Francisco

from metagem.

karinedurand commented on September 3, 2024

I thought it could work on a local computer because the automatic installation instructions indicated that it could be installed on a local machine.
So thanks for the flag!
Best
Karine

Lucas-Maciel commented on September 3, 2024

Hi @franciscozorrilla,

I work with an HPC that doesn't use Slurm to run our jobs, so this flag would be useful for HPCs as well.

franciscozorrilla commented on September 3, 2024

Hi @karinedurand, yes, you can install it locally for running small tests or a small subset of samples, but I want to highlight that it will not be practical for you to process entire real metagenomic datasets (on the order of terabytes of data) on your local machine. I would strongly recommend that you obtain access to a computer cluster for analysis of metagenomic data with metaGEM.

Hi @Lucas-Maciel, what workload manager does your cluster use? We can probably easily modify the metaGEM.sh parser to have a flag (e.g. --LSF) for submitting jobs in clusters that have your workload manager. This is probably a better solution for you since you wouldn't want to be running all your jobs on the login node of the cluster.

Best,
Francisco

Lucas-Maciel commented on September 3, 2024

@franciscozorrilla our HPC is using SGE.

franciscozorrilla commented on September 3, 2024

@Lucas-Maciel, OK, thanks for the info. I confess that I have never used SGE/OGE, and I do not have access to a cluster with that workload manager, so I may need your help to double-check that everything runs smoothly. Based on some documentation that I am reading, it seems like it should be quite trivial to add support for SGE/OGE. If I understand correctly, qsub is the SGE/OGE equivalent of sbatch? You could try replacing every occurrence of sbatch with qsub in the metaGEM.sh file (e.g. sed -i 's/sbatch/qsub/g' metaGEM.sh on the command line, or using the find-and-replace function of your favorite text editor). If you could try that for me and let me know whether everything works as expected, then I can modify the metaGEM.sh parser to read in a workload scheduler flag. I will also add the option to run tasks locally/on the login node.
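For anyone who prefers to preview the substitution before touching the file, here is a minimal Python sketch of the same find-and-replace. The function swap_submit_command is hypothetical, and unlike the sed one-liner it only matches sbatch as a whole word. It changes the command name only; resource options such as memory and time are likely to be expressed differently by each scheduler, so this alone may not be sufficient.

```python
import re

def swap_submit_command(text, old="sbatch", new="qsub"):
    """Replace whole-word occurrences of `old` with `new` in a script's text.

    Hypothetical helper for illustration; it does not translate
    scheduler-specific options.
    """
    return re.sub(rf"\b{re.escape(old)}\b", new, text)

if __name__ == "__main__":
    # Made-up example line, not taken from metaGEM.sh.
    example = "sbatch --mem 4G job.sh\n"
    print(swap_submit_command(example), end="")
```

To apply it to a real script, one could read metaGEM.sh into a string, keep a backup copy of the original, and write the result of swap_submit_command back out.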

Lucas-Maciel commented on September 3, 2024

@franciscozorrilla I replaced it with sed, but it will still need some modifications to the configuration files (how to set memory, time, how many jobs at a time...). I'll try to do that myself.

qsub: Negative or zero step in range is not allowed
Error submitting jobscript (exit code 7):

franciscozorrilla commented on September 3, 2024

Hi @karinedurand, I have modified the metaGEM.sh script in this commit to take the -l or --local flag for running jobs on a local machine. Simply replace your metaGEM.sh script with the newer version on the master branch and give it a try. You don't have to specify the number of jobs, number of cores, memory, or hours, since these parameters are only used by the cluster workload scheduler for submitting jobs. Your jobs will run for as long as they need to (likely a long time in the case of assemblies, for example), they will run in series (i.e. only one at a time), and they will use as much memory as is available on your machine. The number of cores for the different tasks should still be specified in your config.yaml file. For example, you can try running bash metaGEM.sh -t fastp -l. You can check the file nohup.out to see if your tasks are running successfully. I ran some small tests and it seems to be working as expected, but let me know if you run into problems.

@Lucas-Maciel let me know if/when you are able to modify the cluster_config.json/metaGEM.sh files to play nicely with the SGE scheduler and I will gladly incorporate your changes. I opened a new issue (#18) specific to this.

Best wishes,
Francisco

franciscozorrilla commented on September 3, 2024

Closing this for now due to inactivity.
Please re-open if any issues arise.
