runjob

Summary

runjob is a program for managing a group of related jobs running on a compute cluster (localhost, Sun Grid Engine, BatchCompute, or Slurm). It provides a convenient way to specify dependencies between jobs and the resource requirements of each job (e.g. memory, CPU cores). It monitors job status so you can tell when the whole group is done, and it uses little CPU or memory on the login node.

Software Requirements

Python >= 3.5

Installation

The development version (recommended) can be installed with

pip install git+https://github.com/yodeng/runjob.git

The stable release (possibly not the latest) can be installed from

PyPI:

pip install runjob -U

conda:

conda install -c bioconda runjob

User Guide

The full manual can be found here.

Usage

Quick help for each command is available via --help:

runjob/runflow:
$ runjob --help
Usage: runjob [-h] [-v] [-j [<jobfile>]] [-n <int>] [-s <int>] [-e <int>]
              [-w <workdir>] [-d] [-l <file>] [-r <int>] [-f] [-R <int>] [-C]
              [-M {local,localhost,sge,slurm,batchcompute}] [--ini <configfile>]
              [--dag] [--dag-extend] [--strict] [--quiet] [--max-check <float>]
              [--max-submit <float>] [--max-queue-time <float/str>]
              [--max-run-time <float/str>] [--max-wait-time <float/str>]
              [--max-timeout-retry <int>]
              [--local | --localhost | --sge | --slurm | --batchcompute]
              [-i [<str> ...]] [-L <logdir>]

runjob is a tool for managing parallel tasks from a specific job file running
in localhost, sge, slurm, batchcompute.

Optional Arguments:
  -h, --help            show this help message and exit
  --local               submit your jobs to local, same as '--mode local'.
  --localhost           submit your jobs to localhost, same as '--mode localhost'.
  --sge                 submit your jobs to sge, same as '--mode sge'.
  --slurm               submit your jobs to slurm, same as '--mode slurm'.
  --batchcompute        submit your jobs to batchcompute, same as '--mode
                        batchcompute'.
  -i, --injname [<str> ...]
                        job names you need to run. (default: all job names of the
                        jobfile)
  -L, --logdir <logdir>
                        the output log dir. (default: join(workdir, "logs"))

Base Arguments:
  -v, --version         show program's version number and exit
  -j, --jobfile [<jobfile>]
                        input jobfile, if empty, stdin is used. (required)
  -n, --num <int>       the max job number runing at the same time. (default: all
                        of the jobfile, max 1000)
  -s, --start <int>     which line number(1-base) be used for the first job.
                        (default: 1)
  -e, --end <int>       which line number (include) be used for the last job.
                        (default: last line of the jobfile)
  -w, --workdir <workdir>
                        work directory. (default: /home/dengyong/soft/git/runjob)
  -d, --debug           log debug info.
  -l, --log <file>      append log info to file. (default: stdout)
  -r, --retry <int>     retry N times of the error job, 0 or minus means do not
                        re-submit. (default: 0)
  -f, --force           force to submit jobs even already successed.
  -R, --retry-sec <int>
                        retry the error job after N seconds. (default: 2)
  -C, --config          show configurations and exit.
  -M, --mode {local,localhost,sge,slurm,batchcompute}
                        the mode to submit your jobs, if no sge installed, always
                        localhost. (default: sge)
  --ini <configfile>    input configfile for configurations search.
  --dag                 do not execute anything and print the directed acyclic
                        graph of jobs in the dot language.
  --dag-extend          do not execute anything and print the extend directed
                        acyclic graph of jobs in the dot language.
  --strict              use strict to run, means if any errors, clean all jobs and
                        exit.
  --quiet               suppress all output and logging.
  --max-check <float>   maximal number of job status checks per second, fractions
                        allowed. (default: 5)
  --max-submit <float>  maximal number of jobs submited per second, fractions
                        allowed. (default: 20)

Time Control Arguments:
  --max-queue-time <float/str>
                        maximal time (d/h/m/s) between submit and running per job.
                        (default: no-limiting)
  --max-run-time <float/str>
                        maximal time (d/h/m/s) start from running per job.
                        (default: no-limiting)
  --max-wait-time <float/str>
                        maximal time (d/h/m/s) start from submit per job. (default:
                        no-limiting)
  --max-timeout-retry <int>
                        retry N times for the timeout error job, 0 or minus means
                        do not re-submit. (default: 0)
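
The -r/--retry and -R/--retry-sec options in the help text above describe resubmitting a failed job N times with a pause between attempts. The Python sketch below is only one reading of that help text, not runjob's actual implementation: the function name run_with_retry and its return value are hypothetical, and it assumes a job counts as failed when its command exits with a non-zero status.

```python
import subprocess
import sys
import time


def run_with_retry(cmd, retry=0, retry_sec=2):
    """Run cmd; on a non-zero exit, resubmit up to `retry` more times.

    Hypothetical helper for illustration. Returns (succeeded, attempts).
    A retry value of 0 or below means the job is never resubmitted.
    """
    attempts = 0
    while True:
        attempts += 1
        if subprocess.call(cmd) == 0:
            return True, attempts
        if attempts > max(retry, 0):
            return False, attempts
        # Wait before resubmitting, mirroring the idea behind --retry-sec.
        time.sleep(retry_sec)


# Two tiny stand-in jobs: one that always fails, one that always succeeds.
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
passing = [sys.executable, "-c", "pass"]

print(run_with_retry(failing, retry=2, retry_sec=0))
print(run_with_retry(passing, retry=2, retry_sec=0))
```

Under this reading, a job with retry=2 runs at most three times in total: the first submission plus two resubmissions.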
runsge/runshell/runbatch:
$ runsge --help
Usage: runsge [-h] [-v] [-j [<jobfile>]] [-n <int>] [-s <int>] [-e <int>]
              [-w <workdir>] [-d] [-l <file>] [-r <int>] [-f] [-R <int>] [-C]
              [-M {local,localhost,sge,slurm,batchcompute}] [--ini <configfile>]
              [--dag] [--dag-extend] [--strict] [--quiet] [--max-check <float>]
              [--max-submit <float>] [--max-queue-time <float/str>]
              [--max-run-time <float/str>] [--max-wait-time <float/str>]
              [--max-timeout-retry <int>]
              [--local | --localhost | --sge | --slurm | --batchcompute]
              [-N <jobname>] [-L <logdir>] [-g <int>] [--init <cmd>]
              [--call-back <cmd>] [-q [<queue> ...]] [-m <int>] [-c <int>]
              [--out-maping <dir>] [--access-key-id <str>]
              [--access-key-secret <str>]
              [--region {beijing,hangzhou,huhehaote,shanghai,zhangjiakou,chengdu,hongkong,qingdao,shenzhen}]

runsge is a tool for managing parallel tasks from a specific shell file
runing in localhost, sge, slurm, batchcompute.

Optional Arguments:
  -h, --help            show this help message and exit
  --local               submit your jobs to local, same as '--mode local'.
  --localhost           submit your jobs to localhost, same as '--mode localhost'.
  --sge                 submit your jobs to sge, same as '--mode sge'.
  --slurm               submit your jobs to slurm, same as '--mode slurm'.
  --batchcompute        submit your jobs to batchcompute, same as '--mode
                        batchcompute'.
  -N, --jobname <jobname>
                        job name. (default: basename of the jobfile)
  -L, --logdir <logdir>
                        the output log dir. (default:
                        "/home/dengyong/soft/git/runjob/runsge_*_log_dir")
  -g, --groups <int>    N lines to consume a new job group. (default: 1)
  --init <cmd>          command before all jobs, will be running in localhost.
  --call-back <cmd>     command after all jobs finished, will be running in
                        localhost.

Base Arguments:
  -v, --version         show program's version number and exit
  -j, --jobfile [<jobfile>]
                        input jobfile, if empty, stdin is used. (required)
  -n, --num <int>       the max job number runing at the same time. (default: all
                        of the jobfile, max 1000)
  -s, --start <int>     which line number(1-base) be used for the first job.
                        (default: 1)
  -e, --end <int>       which line number (include) be used for the last job.
                        (default: last line of the jobfile)
  -w, --workdir <workdir>
                        work directory. (default: /home/dengyong/soft/git/runjob)
  -d, --debug           log debug info.
  -l, --log <file>      append log info to file. (default: stdout)
  -r, --retry <int>     retry N times of the error job, 0 or minus means do not
                        re-submit. (default: 0)
  -f, --force           force to submit jobs even already successed.
  -R, --retry-sec <int>
                        retry the error job after N seconds. (default: 2)
  -C, --config          show configurations and exit.
  -M, --mode {local,localhost,sge,slurm,batchcompute}
                        the mode to submit your jobs, if no sge installed, always
                        localhost. (default: sge)
  --ini <configfile>    input configfile for configurations search.
  --dag                 do not execute anything and print the directed acyclic
                        graph of jobs in the dot language.
  --dag-extend          do not execute anything and print the extend directed
                        acyclic graph of jobs in the dot language.
  --strict              use strict to run, means if any errors, clean all jobs and
                        exit.
  --quiet               suppress all output and logging.
  --max-check <float>   maximal number of job status checks per second, fractions
                        allowed. (default: 5)
  --max-submit <float>  maximal number of jobs submited per second, fractions
                        allowed. (default: 20)

Time Control Arguments:
  --max-queue-time <float/str>
                        maximal time (d/h/m/s) between submit and running per job.
                        (default: no-limiting)
  --max-run-time <float/str>
                        maximal time (d/h/m/s) start from running per job.
                        (default: no-limiting)
  --max-wait-time <float/str>
                        maximal time (d/h/m/s) start from submit per job. (default:
                        no-limiting)
  --max-timeout-retry <int>
                        retry N times for the timeout error job, 0 or minus means
                        do not re-submit. (default: 0)

Sge/Slurm Arguments:
  -q, --queue [<queue> ...]
                        the queue/partition your job running, multi queue can be
                        sepreated by whitespace. (default: all accessed queue)
  -m, --memory <int>    the memory used per command (GB). (default: 1)
  -c, --cpu <int>       the cpu numbers you job used. (default: 1)

Batchcompute Arguments:
  --out-maping <dir>    the oss output directory if your mode is "batchcompute",
                        all output file will be mapping to you OSS://BUCKET-NAME.
                        if not set, any output will be reserved.
  --access-key-id <str>
                        AccessKeyID while access oss.
  --access-key-secret <str>
                        AccessKeySecret while access oss.
  --region {beijing,hangzhou,huhehaote,shanghai,zhangjiakou,chengdu,hongkong,qingdao,shenzhen}
                        batch compute region. (default: beijing)
qs/qcs:
$ qs --help 
For summary all jobs
Usage: qs [jobfile|logdir|logfile]
       qcs --help
       qslurm
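
For runsge, the -s/--start, -e/--end and -g/--groups options in the help text above select a range of lines from the shell file and bundle every N lines into one job. The Python sketch below illustrates that selection under one interpretation of the help text (1-based start, inclusive end, consecutive lines grouped). The function select_jobs is hypothetical and is not part of runjob.

```python
def select_jobs(lines, start=1, end=None, groups=1):
    """Pick lines start..end (1-based, inclusive) and bundle every
    `groups` consecutive lines into one job.

    Hypothetical helper for illustration only.
    """
    if end is None:
        end = len(lines)  # default: last line of the file
    chosen = lines[start - 1:end]
    return [chosen[i:i + groups] for i in range(0, len(chosen), groups)]


# A stand-in shell file with seven one-line commands.
shell_lines = ["echo step{}".format(i) for i in range(1, 8)]

jobs = select_jobs(shell_lines, start=2, end=6, groups=2)
for number, job in enumerate(jobs, 1):
    print(number, job)
```

With start=2, end=6 and groups=2, lines 2 to 6 are selected and split into three jobs; the last job holds the single leftover line.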

License

runjob is distributed under the MIT license.

Contact

Please send comments, suggestions, bug reports and bug fixes to [email protected].

Example

$ cat test.flow

logs: ./logs
envs:
    samples: A B C

qc:
    echo "qc $samples"
bwa:
    echo "bwa $samples"
    depends: qc.samples   ## each sample-bwa depends each sample-qc
    # depends: qc   ## each sample-bwa depends all sample-qc
sort:
    echo "sort $samples"
    depends: bwa.samples
index:
    echo "index $samples"
    depends: sort.samples
calling:
    echo "calling allsample"
    depends: index
stats:
    echo "stats allsample"
    depends: calling

Running runflow -j test.flow --dag | dot -Tsvg > test.svg writes the job graph to test.svg.
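
To make the two kinds of depends in test.flow concrete, the Python sketch below expands the example by hand. It assumes, following the comments in the flow file, that "depends: qc.samples" links each sample's job to the same sample's upstream job, while a bare "depends: index" waits for all upstream jobs. The job names such as bwa.B, the RULES table and both functions are hypothetical illustrations and do not reproduce runjob's own parser or its exact dot output.

```python
from collections import OrderedDict

SAMPLES = ["A", "B", "C"]

# (rule, runs once per sample?, upstream rule, dependency is per-sample?)
RULES = [
    ("qc", True, None, False),
    ("bwa", True, "qc", True),        # depends: qc.samples
    ("sort", True, "bwa", True),      # depends: bwa.samples
    ("index", True, "sort", True),    # depends: sort.samples
    ("calling", False, "index", False),  # depends: index (all samples)
    ("stats", False, "calling", False),  # depends: calling
]


def expand(rules, samples):
    """Return {job: [jobs it waits for]} for the expanded workflow."""
    names = {}  # rule -> list of expanded job names
    graph = OrderedDict()
    for rule, per_sample, dep, dep_per_sample in rules:
        if per_sample:
            jobs = ["{}.{}".format(rule, s) for s in samples]
        else:
            jobs = [rule]
        names[rule] = jobs
        for i, job in enumerate(jobs):
            if dep is None:
                graph[job] = []
            elif per_sample and dep_per_sample:
                # Same-sample link, e.g. bwa.B waits only for qc.B.
                graph[job] = [names[dep][i]]
            else:
                # Wait for every expanded job of the upstream rule.
                graph[job] = list(names[dep])
    return graph


def to_dot(graph):
    """Render the dependency graph in the dot language."""
    lines = ["digraph jobs {"]
    for job, deps in graph.items():
        lines.append('    "{}";'.format(job))
        for dep in deps:
            lines.append('    "{}" -> "{}";'.format(dep, job))
    lines.append("}")
    return "\n".join(lines)


graph = expand(RULES, SAMPLES)
print(to_dot(graph))
```

In this expansion the three samples form three independent chains from qc to index, which then converge on the single calling job.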

Todo

More features will be added in the future.
