vcf-upload-cwl-pipeline's Introduction

VCF Upload cwl pipeline

Overview

The VCF upload CWL pipeline automates VEP and Java annotation and uploads the annotated dataset to an Anfisa Pod deployed on an IBM OpenShift cluster.

To upload the annotated archive to the IBM bucket and the dataset to the Anfisa Pod, the OpenShift CLI (oc) and the AWS CLI (aws) must be installed on the annotation server.
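A quick way to confirm these prerequisites before starting a run is to check that each tool is on the PATH. The following is a minimal sketch; check_tools is a hypothetical helper name and is not part of the pipeline.

```shell
#!/bin/sh
# check_tools (hypothetical helper): report every named command that is
# not found on PATH. Returns 0 when all are found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use on the annotation server:
#   check_tools oc aws cwl-runner || exit 1
```

The helper only tests that the commands exist; it does not verify that oc or aws are logged in or configured.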

Run cwl pipeline

First, activate the Python virtual environment:

source /data/astorage/venv/bin/activate

Navigate to the folder with the cwl pipeline. There are two options to run the cwl pipeline:

  1. Run the CWL pipeline for a dataset archive available at a public URL:
cwl-runner forome_vcf_upload_uri.cwl inp-job.yml

Specify case_uri and comment out lines 5-7 (the archive block) in the inp-job.yml input file. Example:

case_name: pgp3140_wgs_rtg1997

case_uri: https://forome-dataset-public.s3.us-south.cloud-object-storage.appdomain.cloud/pgp3140_wgs_rtg1997.tar.gz

#archive:
#  class: File
#  path: pgp3140_wgs_hlpanel.tar.gz
  2. Run the CWL pipeline for a dataset archive located on the server. In this case, place the archive in the same folder as the CWL pipeline:
cwl-runner forome_vcf_upload_archive.cwl inp-job.yml

Specify archive and comment out line 3 (case_uri) in the inp-job.yml input file. Example:

case_name: pgp3140_wgs_rtg1997

#case_uri: https://forome-dataset-public.s3.us-south.cloud-object-storage.appdomain.cloud/pgp3140_wgs_rtg1997.tar.gz

archive:
  class: File
  path: pgp3140_wgs_hlpanel.tar.gz
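
Since the two run options differ only in which key is left uncommented, the choice of workflow file can be derived from the input file itself. The sketch below assumes the keys start at the beginning of the line, as in the examples above, and that exactly one of case_uri or archive is active; pick_workflow is a hypothetical helper, not something shipped with the pipeline.

```shell
#!/bin/sh
# pick_workflow (hypothetical helper): print the workflow file matching
# the uncommented input key in the given job file.
pick_workflow() {
    job_file=$1
    has_uri=0
    has_archive=0
    # Commented keys start with '#', so they do not match these patterns.
    grep -q '^case_uri:' "$job_file" && has_uri=1
    grep -q '^archive:' "$job_file" && has_archive=1

    if [ "$has_uri" -eq 1 ] && [ "$has_archive" -eq 0 ]; then
        echo forome_vcf_upload_uri.cwl
    elif [ "$has_uri" -eq 0 ] && [ "$has_archive" -eq 1 ]; then
        echo forome_vcf_upload_archive.cwl
    else
        echo "set exactly one of case_uri or archive in $job_file" >&2
        return 1
    fi
}

# Example use:
#   cwl-runner "$(pick_workflow inp-job.yml)" inp-job.yml
```

Failing when both keys or neither key is active is a design choice of this sketch: it surfaces a misedited input file before the run starts.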

Note: If the dataset contains filenames with illegal characters, run cwl-runner with the --relax-path-checks argument. Example:

cwl-runner --relax-path-checks <cwl workflow> <input file>

CWL parameters (inp-job.yml)

| Parameter | Description | Required |
|---|---|---|
| case_uri | URL of the dataset archive | Yes* |
| case_assembly | Genome assembly, 37 or 38; check the inventory file (.cfg) to find it | Yes |
| port | 3337 for assembly 37, 5306 for assembly 38 | Yes |
| case_name | Name of the case | Yes |
| archive | Name of the dataset archive. The archive must be placed in the same folder as the CWL pipeline | Yes* |
| ocProject | OpenShift project (namespace) name | Yes |
| ocToken | Service account token used to connect to the OpenShift cluster | Yes |
| ocServer | OpenShift server | Yes |
| ocPod | Anfisa backend Pod name | Yes |
| bucket_name | Name of the bucket in IBM Cloud | Yes |
| access_key | Storage access key ID | Yes |
| secret_key | Storage secret key | Yes |
| region_name | Storage region | Yes |
| output_format | Used to build and store credentials for the IBM bucket (should be json) | Yes |
| login | Login name on the server where the CWL pipeline runs | Yes |
| password | Password for the login | Yes |
| user_id | User ID (to check it, run: id -u <username>) | Yes |

* Specify only one of these two parameters, according to the chosen run option.
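
Because port depends on case_assembly, a mismatch between the two is an easy mistake to make when editing inp-job.yml. The mapping from the parameter list can be sketched as a small lookup; expected_port is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# expected_port (hypothetical helper): print the port that corresponds
# to a given assembly, as listed in the parameter description.
expected_port() {
    case "$1" in
        37) echo 3337 ;;
        38) echo 5306 ;;
        *)  echo "unknown assembly: $1" >&2; return 1 ;;
    esac
}

# Example use:
#   expected_port 38
```

The value for user_id can be obtained in the same session with id -u <username>, as noted in the parameter list.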

Contributors

mbychkovskiy
