mg_annotation's People

Contributors

aclum, chienchi, hubin-keio, kaijli, kltm, mflynn-lanl, scanon

mg_annotation's Issues

Update to new version of img-nr database

Deliverable this task is associated with

1

RACI

Responsible: @Michal-Babins
Accountable: @aclum
Consulted: @hubin-keio
Informed:

Describe the task?

  • Update the version of the img-nr database for the annotation pipeline. Marcel will provide the new version, which should be copied to /global/cfs/cdirs/m3408/refdata/img/IMG-NR/YYYYMMDD
  • Update functional-annotation.wdl to use the new version. The input variables are ko_ec_img_nr_db, ko_ec_md5_mapping, and ko_ec_taxon_to_phylo_mapping
  • Update to lastal 1456

Criteria for completion

A test annotation workflow run completes successfully, and the imgap_version file reports an img-nr database version whose year is 2023.

Completion Date (Goal)

July 7

Target Sprint Start & End Dates

Start: July 3
End: July 14

Tag Blocker/Contingent upon issues

Dependent on new database version release from @mhuntemann

example inputs.json is not valid

The example inputs.json in both the master and werkflow branches is not valid.
The first issue is an extra closing bracket, so the file is not valid JSON.
The second issue is that 'annotation.imgap_input_fasta' is not a valid input for annotation.wdl; based on the WDL, it should be 'annotation.input_file'. Either the inputs.json or the WDL needs to be updated so the two agree.
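
Both problems could be caught with a small check before a run. The following is a minimal sketch, not code from the repository: the helper name check_inputs and the idea of passing in a set of expected keys are assumptions for illustration. Only the two input names come from this issue.

```python
import json


def check_inputs(text, expected_keys):
    """Return a list of problems found in the text of an inputs.json file."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        # An extra closing bracket is reported here as a parse error.
        return [f"invalid JSON: {err.msg} (line {err.lineno}, column {err.colno})"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    problems = []
    # Keys the WDL does not declare (hypothetical expected_keys set).
    for key in sorted(set(data) - set(expected_keys)):
        problems.append(f"unexpected input: {key}")
    # Keys the WDL declares but the file does not provide.
    for key in sorted(set(expected_keys) - set(data)):
        problems.append(f"missing input: {key}")
    return problems


if __name__ == "__main__":
    expected = {"annotation.input_file"}
    print(check_inputs('{"annotation.imgap_input_fasta": "contigs.fna"}}', expected))
    print(check_inputs('{"annotation.imgap_input_fasta": "contigs.fna"}', expected))
```

The sketch treats every expected key as required, which may be too strict for a workflow with optional inputs.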

add scaffold lineage file to wdl outputs

Next week Marcel should provide an updated version that outputs a lineage file. The WDL workflow should be updated to expose this file as an output, and the runtime automation should create a data object for it. A file enum, Lineage sdb, has already been created.

Blocked on updates from Marcel and runtime API being updated to schema version 7.7

add support for Contig Mapping File

There are a few things we need to confirm are working properly so we can support NMDC workflows that start with an input assembly file. That is, workflows that start with a contigs file, instead of reads.
In this case we need to

  • execute pre_qc_execute from a customized inputs.json generated by workflow automation
  • make sure that the mapping file is declared in both the WDL task outputs and the workflow outputs block
  • make sure that the renamed contig file output is the input to the other WDL structural_annotation tasks.

see related task on the automation side microbiomedata/nmdc_automation#35

cc @mhuntemann @scanon

issue with crt.wdl

The merge to WDL 1.0 duplicated some output declarations in the run task of crt.wdl.

See the miniwdl linting output:
miniwdl check ./annotation_full.wdl
(./annotation_full.wdl Ln 3 Col 1) Failed to import ./structural-annotation.wdl
(./structural-annotation.wdl Ln 5 Col 1) Failed to import crt.wdl
(crt.wdl Ln 48 Col 5) Multiple declarations of crisprs
File crisprs = "~{prefix}_crt.crisprs"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(crt.wdl Ln 49 Col 5) Multiple declarations of gff
File gff = "~{prefix}_crt.gff"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(crt.wdl Ln 50 Col 5) Multiple declarations of crt_out
File crt_out = "~{prefix}_crt.out"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
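
miniwdl check remains the authoritative test, but the same kind of duplication can be spotted with a quick scan of an output block. The following is a rough sketch under stated assumptions: the function name duplicate_declarations is hypothetical, the regular expression only recognises simple File, String, Int, Float, and Boolean declarations, and it is not a WDL parser.

```python
import re
from collections import Counter

# Matches simple declarations such as: File gff = "~{prefix}_crt.gff"
# Compound types (Array[...], Map[...]) are deliberately not handled.
DECL = re.compile(r'^\s*(?:File|String|Int|Float|Boolean)\??\s+(\w+)\s*=')


def duplicate_declarations(block):
    """Return the sorted names declared more than once in a block of WDL text."""
    names = []
    for line in block.splitlines():
        match = DECL.match(line)
        if match:
            names.append(match.group(1))
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    # Illustrative block only; it mimics the duplication reported above.
    block = '''
        File crisprs = "~{prefix}_crt.crisprs"
        File gff = "~{prefix}_crt.gff"
        File crt_out = "~{prefix}_crt.out"
        File crisprs = "~{prefix}_crt.crisprs"
        File gff = "~{prefix}_crt.gff"
        File crt_out = "~{prefix}_crt.out"
    '''
    print(duplicate_declarations(block))
```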

img annotation 5.1.14 updates

Make the following updates to the Dockerfile, build a new Docker image, and update the WDLs to use the new image:

  • update to Marcel's patched version of prodigal
  • update to CRT-CLI_v1.8.4
  • pull the IMG wrapper scripts from the public repo instead of the NERSC portal

update product mapping files

We missed updating to the new product mapping tables when we updated to img annotation version 5.2.
Update the workflow to use /global/cfs/cdirs/m3408/refdata/img/Product_Name_Mappings/20230814

This is what Marcel is now using.

cc @hubin-keio

copy output files

The edge-nmdc website needs to display the output files for the annotation workflow. Currently, the output files for the mg-annotation workflow remain within the cromwell-execution directory after each task is completed. A task is needed to copy files from the cromwell-execution directory to a directory in edge_app/nmdc-edge/projects. This includes adding outdir as an optional input to the wdl and inputs.json files
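
The copy step itself could look roughly like the following. This is only a sketch of the logic, not the requested WDL task: the function name copy_outputs, the suffix list, and the flat layout of the destination directory are all assumptions for illustration.

```python
import shutil
from pathlib import Path


def copy_outputs(execution_dir, outdir, suffixes=(".gff", ".faa", ".tsv")):
    """Copy files with the given suffixes from execution_dir into outdir.

    execution_dir is searched recursively. The suffix list is a placeholder,
    not the real set of annotation outputs.
    """
    out = Path(outdir)
    out.mkdir(parents=True, exist_ok=True)
    copied = []
    # sorted() materialises the listing before anything is copied.
    for path in sorted(Path(execution_dir).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            # Files are flattened into outdir; same-named files overwrite each other.
            shutil.copy2(path, out / path.name)
            copied.append(path.name)
    return copied
```

Because the destination is flat, two tasks that emit a file with the same name would collide, so a real task would need either unique names or per-task subdirectories.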
