miniwdl

Workflow Description Language static analysis toolkit for Python 3.6+

miniwdl is a library for parsing WDL documents into a type-checked abstract syntax tree (AST), providing a foundation for new runtime systems, developer tooling, and language experimentation. It also includes command-line tools supporting the WDL development cycle, including a "linter" to statically analyze WDL documents for errors and oversights, and a Cromwell wrapper to make it more convenient to test a workflow locally.

This project is in alpha development; interfaces are liable to change somewhat. See the Releases for change logs. The Project board reflects the near-term roadmap.

Installation

pip: pip3 install miniwdl

conda: configure conda-forge and conda install miniwdl

Command-line tools

miniwdl check

miniwdl check /path/to/workflow.wdl loads the WDL document and shows a brief outline with any lint warnings. To search a directory for imported documents, add --path /path/to/tasks/ (this option may be given more than once). Example with HumanCellAtlas/skylab:

$ git clone https://github.com/HumanCellAtlas/skylab.git
$ miniwdl check --path skylab/library/tasks/ \
    skylab/pipelines/smartseq2_single_sample/SmartSeq2SingleSample.wdl 

SmartSeq2SingleSample.wdl
    workflow SmartSeq2SingleCell
        (Ln 14, Col 8) UnusedDeclaration, nothing references File gtf_file
        call HISAT2.HISAT2PairedEnd
        call Picard.CollectMultipleMetrics
        call Picard.CollectRnaMetrics
        call Picard.CollectDuplicationMetrics
        call HISAT2.HISAT2RSEM as HISAT2Transcriptome
        call RSEM.RSEMExpression
        call GroupQCs.GroupQCOutputs
        call ZarrUtils.SmartSeq2ZarrConversion
    GroupQCs : GroupMetricsOutputs.wdl
        task GroupQCOutputs
            (Ln 10, Col 10) StringCoercion, String mem = :Int:
            (Ln 11, Col 10) StringCoercion, String cpu = :Int:
            (Ln 12, Col 10) StringCoercion, String disk_space = :Int:
    HISAT2 : HISAT2.wdl
        task HISAT2PairedEnd
        task HISAT2RSEM
        task HISAT2InspectIndex (not called)
        task HISAT2SingleEnd (not called)
    Picard : Picard.wdl
        task CollectDuplicationMetrics
        task CollectMultipleMetrics
        task CollectRnaMetrics
    RSEM : RSEM.wdl
        task RSEMExpression
    ZarrUtils : ZarrUtils.wdl
        task SmartSeq2ZarrConversion
            (Ln 36, Col 6) CommandShellCheck, SC2006 Use $(..) instead of legacy `..`.
            (Ln 39, Col 9) CommandShellCheck, SC2006 Use $(..) instead of legacy `..`.
            (Ln 39, Col 15) CommandShellCheck, SC2086 Double quote to prevent globbing and word splitting.
            (Ln 40, Col 10) CommandShellCheck, SC2086 Double quote to prevent globbing and word splitting.
            (Ln 40, Col 21) CommandShellCheck, SC2086 Double quote to prevent globbing and word splitting.
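
If you want to consume these warnings from a script (for example, to fail a CI job on certain rules), the outline above can be scanned with a regular expression. The following is a minimal sketch that assumes the "(Ln N, Col N) Rule, message" layout shown in this example; that layout is not a documented interface and may change while the project is in alpha. The names LintWarning and parse_warnings are hypothetical helpers, not part of miniwdl.

```python
import re
from typing import List, NamedTuple


class LintWarning(NamedTuple):
    """One warning line from the outline (hypothetical helper type)."""
    line: int
    column: int
    rule: str
    message: str


# Matches lines such as:
#   (Ln 14, Col 8) UnusedDeclaration, nothing references File gtf_file
_WARNING_RE = re.compile(r"^\s*\(Ln (\d+), Col (\d+)\) (\w+), (.*)$")


def parse_warnings(report: str) -> List[LintWarning]:
    """Extract warnings from text in the layout shown above; other lines are skipped."""
    warnings = []
    for text in report.splitlines():
        match = _WARNING_RE.match(text)
        if match:
            ln, col, rule, message = match.groups()
            warnings.append(LintWarning(int(ln), int(col), rule, message))
    return warnings


# A few lines copied from the example output above.
sample = """\
SmartSeq2SingleSample.wdl
    workflow SmartSeq2SingleCell
        (Ln 14, Col 8) UnusedDeclaration, nothing references File gtf_file
        call HISAT2.HISAT2PairedEnd
    ZarrUtils : ZarrUtils.wdl
        task SmartSeq2ZarrConversion
            (Ln 39, Col 15) CommandShellCheck, SC2086 Double quote to prevent globbing and word splitting.
"""

for warning in parse_warnings(sample):
    print(warning.line, warning.column, warning.rule)
```

Lines that do not match the pattern (document names, calls, tasks) are ignored, so the same function can be fed the complete output of a run.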

In addition to its suite of WDL-specific warnings, miniwdl check uses ShellCheck, if available, to detect possible issues in each task command script. You may need to install ShellCheck separately, as it's not included with miniwdl.
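
To illustrate the "if available" behavior, here is a rough sketch of how a tool might call ShellCheck only when it is found on the PATH and otherwise skip the step. This is not miniwdl's actual implementation; the function name shellcheck_findings is hypothetical, and treating every command script as bash is an assumption made for the example.

```python
import json
import shutil
import subprocess
from typing import Any, Dict, List


def shellcheck_findings(script: str) -> List[Dict[str, Any]]:
    """Return ShellCheck findings for a script, or [] if ShellCheck is not installed."""
    exe = shutil.which("shellcheck")
    if exe is None:
        return []  # ShellCheck is optional: skip quietly when it is absent
    proc = subprocess.run(
        # "-" reads the script from standard input; bash dialect is an assumption
        [exe, "--format=json", "--shell=bash", "-"],
        input=script,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # Python 3.6-compatible spelling of text=True
    )
    # A non-zero exit status just means issues were found, so it is not checked here.
    try:
        findings = json.loads(proc.stdout)
    except ValueError:
        return []  # unexpected output: treat it as "no findings"
    return findings if isinstance(findings, list) else []


for finding in shellcheck_findings("echo $1\n"):
    print(finding.get("line"), finding.get("column"),
          "SC{}".format(finding.get("code")), finding.get("message"))
```

On a machine without ShellCheck the loop prints nothing, which mirrors how the check simply omits CommandShellCheck warnings in that case.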

If you haven't installed the PyPI package to get the miniwdl entry point, you can equivalently run PYTHONPATH=$PYTHONPATH:/path/to/miniwdl python3 -m WDL check ... instead.

miniwdl cromwell

This tool provides a nicer command-line interface for running a workflow locally using Cromwell. Example:

$ cat << 'EOF' > hello.wdl
version 1.0
task hello {
    input {
        Array[String]+ who
        Int x = 0
    }
    command <<<
        awk '{print "Hello", $0}' "~{write_lines(who)}"
    >>>
    output {
        Array[String]+ messages = read_lines(stdout())
        Int meaning_of_life = x+1
    }
}
EOF
$ miniwdl cromwell hello.wdl
missing required inputs for hello: who
required inputs:
  Array[String]+ who
optional inputs:
  Int x
outputs:
  Array[String]+ messages
  Int meaning_of_life
$ miniwdl cromwell hello.wdl who=Alyssa "who=Ben Bitdiddle" x=41
{
  "outputs": {
    "hello.messages": [
      "Hello Alyssa",
      "Hello Ben Bitdiddle"
    ],
    "hello.meaning_of_life": 42
  },
  "id": "b75f3449-344f-45ec-86b2-c004a3adc289",
  "dir": "/home/user/20190203_215657_hello"
}

By first analyzing the WDL code, this tool translates the freeform command-line arguments into appropriately typed JSON inputs for Cromwell. It downloads the Cromwell JAR file automatically to a temporary location; a compatible Java runtime (JRE) must be available to run it. The outputs and logs are written to a new date/time-named subdirectory of the current working directory (overridable; see --help).
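
The translation from name=value arguments to typed JSON can be pictured with a small sketch. It is a simplified illustration, not the tool's real code: the input types are hard-coded in a dictionary here rather than read from the parsed WDL, only a few simple types are handled, and the functions coerce and build_inputs are hypothetical names. Prefixing each key with "hello." mirrors the keys in the example output above and is an assumption about the input side.

```python
import json
from typing import Any, Dict, List


def coerce(wdl_type: str, text: str) -> Any:
    """Convert one command-line string to a Python value for a simple WDL type."""
    if wdl_type == "Int":
        return int(text)
    if wdl_type == "Float":
        return float(text)
    if wdl_type == "Boolean":
        if text not in ("true", "false"):
            raise ValueError("expected true or false, got " + repr(text))
        return text == "true"
    return text  # String, File and anything else: passed through unchanged


def build_inputs(name: str, types: Dict[str, str], args: List[str]) -> Dict[str, Any]:
    """Turn ["who=Alyssa", "x=41"] into a dictionary keyed by "name.input"."""
    inputs = {}  # type: Dict[str, Any]
    for arg in args:
        key, sep, text = arg.partition("=")
        if not sep or key not in types:
            raise ValueError("unrecognized input: " + arg)
        wdl_type = types[key]
        full_key = name + "." + key
        if wdl_type.startswith("Array["):
            # Repeating an array input appends one more item to the list.
            item_type = wdl_type[len("Array["):wdl_type.rindex("]")]
            inputs.setdefault(full_key, []).append(coerce(item_type, text))
        else:
            inputs[full_key] = coerce(wdl_type, text)
    return inputs


# Input types of the hello task above, written out by hand for this sketch.
types = {"who": "Array[String]+", "x": "Int"}
inputs = build_inputs("hello", types, ["who=Alyssa", "who=Ben Bitdiddle", "x=41"])
print(json.dumps(inputs, indent=2))
```

Knowing the declared types is what lets x=41 become the JSON number 41 while who=Alyssa stays a string inside an array.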

The tool supports shell tab-completion for the workflow's available input names. To use this, enable argcomplete global completion by invoking activate-global-python-argcomplete and starting a new shell session. Then begin typing a command line such as miniwdl cromwell hello.wdl and press Tab twice.

WDL Python library

The WDL package provides programmatic access to the WDL parser and AST. The following example prints all declarations in a workflow, descending into scatter and if stanzas as needed.

$ python3 -c "
import WDL

doc = WDL.load('skylab/pipelines/optimus/Optimus.wdl',
               path=['skylab/library/tasks/'])

def show(elements):
  for elt in elements:
    if isinstance(elt, WDL.Decl):
      print(str(elt.type) + ' ' + elt.name)
    elif isinstance(elt, (WDL.Scatter, WDL.Conditional)):
      show(elt.elements)
show(doc.workflow.elements)
"

String version
Array[File] r1_fastq
Array[File] r2_fastq
Array[File] i1_fastq
String sample_id
File tar_star_reference
File annotations_gtf
File ref_genome_fasta
File whitelist
String fastq_suffix
Array[Int] indices
Array[File] non_optional_i1_fastq
File barcoded_bam

API documentation

Online Python developer documentation for the WDL package is hosted on Read the Docs.

(Read the Docs currently builds from the mlin/miniwdl fork of this repository.)

Locally, make doc triggers Sphinx to generate the docs under docs/_build/html/. Or, after building the docker image, copy them out with docker run --rm -v ~/Desktop:/io miniwdl cp -r /miniwdl/docs/_build/html /io/miniwdl_docs.

Contributing

Feedback and contributions are welcome on this repository. Please:

  1. Add appropriate tests to the automatic suite
  2. Use make pretty to reformat the code with black
  3. Ensure compatibility with this project's MIT license
  4. Send pull requests from a dedicated branch without unrelated edits

The Project board is our up-to-date tracker.

To set up your local development environment:

  1. git clone --recursive this repository
  2. Install dependencies as illustrated in the Dockerfile (OS packages + PyPI packages listed in requirements.txt and requirements.dev.txt)
  3. Ensure the invoking user can access the local Docker daemon

The Makefile has a few typical scripted flows:

  • make pretty reformats the code with black
  • make check validates the code with Pylint and Pyre
  • make or make test runs the full test suite with code coverage report (takes several minutes)
  • make qtest runs most of the tests more quickly (by omitting some slower cases, and not tracking coverage)

Security

Please disclose security issues responsibly by contacting [email protected].

Contributors

cmarkello, jtratner, mckinsel, mlin
