Awesome Pipeline

A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin

Pipeline frameworks & libraries

  • ActionChain - A workflow system for simple linear success/failure workflows.
  • Airflow - Python-based workflow system created by Airbnb.
  • Anduril - Component-based workflow framework for scientific data analysis.
  • Antha - High-level language for biology.
  • Bds - Scripting language for data pipelines.
  • Bpipe - Tool for running and managing bioinformatics pipelines.
  • Cluster Flow - Command-line tool which uses common cluster managers to run bioinformatics pipelines.
  • Compss - Programming model for distributed infrastructures.
  • Conan2 - Light-weight workflow management application.
  • Cosmos - Python library for massively parallel workflows.
  • Cuneiform - Advanced functional workflow language and framework, implemented in Erlang.
  • Doit - Task management & automation tool.
  • Dagobah - Simple DAG-based job scheduler in Python.
  • Drake - Robust DSL akin to Make, implemented in Clojure.
  • Flex - Language agnostic framework for building flexible data science pipelines (Python/Shell/Gnuplot).
  • Flowr - Robust and efficient workflows using a simple language agnostic approach (R package).
  • Joblib - Set of tools to provide lightweight pipelining in Python.
  • Ketrew - Embedded DSL in the OCaml language alongside a client-server management application.
  • Kronos - Workflow assembler for cancer genome analytics and informatics.
  • Loom - Tool for running bioinformatics workflows locally or in the cloud.
  • Longbow - Job proxying tool for biomolecular simulations.
  • Luigi - Python module that helps you build complex pipelines of batch jobs.
  • Makeflow - Workflow engine for executing large complex workflows on clusters.
  • Mario - Scala library for defining data pipelines.
  • Mistral - Python-based workflow engine by the OpenStack project.
  • Moa - Lightweight workflows in bioinformatics.
  • Nextflow - Flow-based computational toolkit for reproducible and scalable bioinformatics pipelines.
  • NiPype - Workflows and interfaces for neuroimaging packages.
  • OpenGE - Accelerated framework for manipulating and interpreting high-throughput sequencing data.
  • PipEngine - Ruby-based launcher for complex biological pipelines.
  • Pinball - Python-based workflow engine by Pinterest.
  • PyFlow - Lightweight parallel task engine.
  • Pwrake - Parallel workflow extension for Rake.
  • Qsubsec - Simple tokenised template system for SGE.
  • Rabix - Python-based workflow toolkit based on the Common Workflow Language and Docker.
  • Remake - Make-like declarative workflows in R.
  • Rmake - Wrapper for the creation of Makefiles, enabling massive parallelization.
  • Rubra - Pipeline system for bioinformatics workflows.
  • Ruffus - Computation Pipeline library for Python.
  • Ruigi - Pipeline tool for R, inspired by Luigi.
  • Sake - Self-documenting build automation tool.
  • Scoop - Scalable Concurrent Operations in Python.
  • Snakemake - Tool for running and managing bioinformatics pipelines.
  • Spiff - Based on the Workflow Patterns initiative and implemented in Python.
  • Stpipe - File processing pipelines as a Python library.
  • Suro - Java-based distributed pipeline from Netflix.
  • Swift - Fast easy parallel scripting - on multicores, clusters, clouds and supercomputers.
  • Toil - Distributed pipeline workflow manager (mostly for genomics).
  • Yap - Extensible parallel framework, written in Python using OpenMPI libraries.
  • WorldMake - Easy Collaborative Reproducible Computing.

Workflow platforms

  • ActivePapers - Computational science made reproducible and publishable.
  • Apache Airavata - Framework for executing and managing computational workflows on distributed computing resources.
  • Arvados - A container based workflow platform.
  • Biokepler - Bioinformatics Scientific Workflow for Distributed Analysis of Large-Scale Biological Data.
  • Chipster - Open source platform for data analysis.
  • Galaxy - Web-based platform for biomedical research.
  • Kepler - Scientific workflow application from the University of California.
  • OpenMOLE - Workflow Management System for exploration of models and parameter optimization using distributed computing (cluster, grid, cloud).
  • Pegasus - Workflow Management System.
  • Yabi - Online research environment for grid, HPC and cloud computing.
  • Taverna - Domain independent workflow system.
  • VisTrails - Scientific workflow and provenance management system.
  • Wings - Semantic workflow system utilizing Pegasus as execution system.

Workflow languages

Workflow standardization initiatives

Literate programming (aka interactive notebooks)

  • Beaker - Notebook-style development environment.
  • IPython - A rich architecture for interactive computing.
  • Jupyter - Language-agnostic notebook literate programming environment.
  • Pathomx - Interactive data workflows built on Python.
  • Wakari - Web-based Python Data Analysis.
  • Zeppelin - Web-based notebook that enables interactive data analytics.

Build automation tools

  • Bazel - Build software just as engineers do at Google.
  • DoIt - Highly generalized task-management and automation in Python.
  • Gradle - Unified cross-platform builds.
  • Scons - Python library focused on C/C++ builds.
  • Shake - Define robust build systems akin to GNU Make using Haskell.
  • Make - The GNU Make build system.

Other projects

  • HPC Grid Runner
  • noWorkflow - Supporting infrastructure to run scientific experiments without a scientific workflow management system, and still get things like provenance.
