Happie

Happie is a wrapper for several independent programs that identify different kinds of mobile elements. It is designed for HPC environments and uses either Docker or Singularity to containerize the individual programs.

Happie allows you to identify and extract all the "mobile" regions of a genome assembly -- those identified as prophage, plasmid, or genomic island sequence. This can be used to assess the characteristics of the mobilome as opposed to the whole genome.

In the next few months, we will upload a few case studies showing happie's utility.

click here for a quickstart guide!

Install

Happie can be installed manually from GitHub as follows:

git clone https://github.com/nickp60/happie
cd happie
conda create -n happie  biopython pyyaml
source activate happie
python setup.py install
happie_install_or_update

The last command downloads the Docker images for ProphET, prokka, mlplasmids, dimob, mobsuite, PlasFlow, and the other tools happie uses. If you are running Singularity, a folder called .happie will be created in your home directory.

Running

(Optional) get some test data:

bash ./get_test_data.sh

And try running it!

happie --contigs ./test_data/ecoli104.fna  --output tmpresults

Or, if you are using singularity:

happie --contigs ./test_data/ecoli104.fna  --output tmpresults --virtualization singularity
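
To run several assemblies, the same invocation can be scripted. The following is a minimal sketch, not part of happie itself: the helper names build_happie_command and run_batch and the one-output-folder-per-assembly layout are assumptions, and only the --contigs, --output, and --virtualization flags shown above are used.

```python
import subprocess
from pathlib import Path


def build_happie_command(contigs, output, virtualization=None):
    """Build the argument list for one happie run (hypothetical helper)."""
    cmd = ["happie", "--contigs", str(contigs), "--output", str(output)]
    if virtualization is not None:
        # For example "singularity", as in the command above.
        cmd += ["--virtualization", virtualization]
    return cmd


def run_batch(assembly_dir, results_dir, virtualization=None, dry_run=True):
    """Build one happie command per .fna file; run them unless dry_run is set."""
    commands = []
    for fasta in sorted(Path(assembly_dir).glob("*.fna")):
        # Assumed layout: one output folder per assembly, named after the file.
        out = Path(results_dir) / fasta.stem
        cmd = build_happie_command(fasta, out, virtualization)
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands
```

With dry_run=True (the default here) the function only returns the commands, which makes it easy to inspect them before submitting anything to a cluster.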

You should get several output files, laid out like this (in this example the output directory is tmp__small):

tmp__small/
├── contig_names_key
├── happie_args.yaml
├── intermediate_files
│   ├── ProphET
│   ├── QC
│   ├── cgview.tab
│   ├── dimob
│   ├── mlplasmids
│   ├── mobile_abricate
│   ├── mobile_annofilt
│   ├── mobile_prokka
│   ├── wgs_abricate
│   ├── wgs_annofilt
│   └── wgs_prokka
├── results
│   ├── mobile_abricate.tab
│   ├── mobile_genome_coords
│   ├── mobile_test.fasta
│   ├── mobile_test.gbk
│   ├── mobile_test.gff
│   ├── wgs_abricate.tab
│   ├── wgs_test.gbk
│   └── wgs_test.gff
└── sublogs
    ├── QC.log
    ├── mobile_PropheET.log
    ├── mobile_abricate_resfinder_log.txt
    ├── mobile_abricate_vfdb_log.txt
    ├── mobile_annofilt.log
    ├── mobile_mlplasmids.log
    ├── mobile_prokka.log
    ├── wgs_abricate_resfinder_log.txt
    ├── wgs_abricate_vfdb_log.txt
    ├── wgs_annofilt.log
    └── wgs_prokka.log

Stages

Happie does the following things:

0) QC.

Happie removes short contigs and renames the remaining contigs (to avoid issues with too-long names and non-standard characters). The contig_names_key file links the original names to the clean names.
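
The idea behind this stage can be sketched in a few lines of Python. This is an illustration only, not happie's actual code: the length cutoff, the contig_N naming scheme, and the function names are assumptions.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def qc_contigs(records, min_length=1000, prefix="contig"):
    """Drop contigs shorter than min_length and give the rest short, safe names.

    min_length and the naming scheme are illustrative assumptions,
    not happie's real defaults.
    """
    kept, names_key = [], {}
    for original_name, seq in records:
        if len(seq) < min_length:
            continue
        clean_name = f"{prefix}_{len(kept) + 1}"
        names_key[clean_name] = original_name  # clean name -> original name
        kept.append((clean_name, seq))
    return kept, names_key
```

Calling qc_contigs(read_fasta(open_file), min_length=...) returns the filtered records plus a dictionary that plays the role of the names key.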

1) Reannotate FASTA with Prokka.

This ensures that the annotations are all in the same format, which is useful for the pipelines that reference the annotations (like ProphET).

2) Identify Mobile Elements

Plasmids

If your organism is one that mlplasmids supports, mlplasmids is your best bet. Otherwise, go with PlasFlow. If you are feeling adventurous, try mob-suite, but use it with caution.

Genomic Islands

Originally I was working on incorporating CAFE, but I later found out that it is not designed to handle certain organisms, so we went with IslandPath-DIMOB.

Prophages

See the short version of our head-to-head comparison here: Testing 3 Prophage Finders.

3) Extract Mobile Elements
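
Conceptually, this stage cuts the predicted regions out of the assembly. The sketch below shows one way to do that, under stated assumptions: regions are (contig, start, end) tuples with 1-based inclusive coordinates, and overlapping or adjacent calls from different tools are merged first. Neither the coordinate convention nor the merging rule is taken from happie's code, so check the real coordinates file before reusing this.

```python
def merge_regions(regions):
    """Merge overlapping or directly adjacent (contig, start, end) regions."""
    merged = []
    for contig, start, end in sorted(regions):
        if merged and merged[-1][0] == contig and start <= merged[-1][2] + 1:
            prev = merged[-1]
            merged[-1] = (contig, prev[1], max(prev[2], end))
        else:
            merged.append((contig, start, end))
    return merged


def extract_regions(sequences, regions):
    """Return {"contig:start-end": subsequence} for each region.

    sequences: dict mapping contig name -> sequence string
    regions: iterable of (contig, start, end), 1-based and inclusive (assumed)
    """
    extracted = {}
    for contig, start, end in regions:
        seq = sequences[contig]
        if not 1 <= start <= end <= len(seq):
            raise ValueError(f"region {contig}:{start}-{end} is outside the contig")
        # Convert 1-based inclusive coordinates to a Python slice.
        extracted[f"{contig}:{start}-{end}"] = seq[start - 1:end]
    return extracted
```

Merging first avoids writing the same stretch of sequence twice when, for example, a prophage call and a genomic island call overlap.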

4) Assess features of Mobile Elements vs Chromosome

Abricate

VFDB

resfinder
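
Once both abricate tables exist, a first comparison of the mobilome against the whole genome can be as simple as comparing gene names. The sketch below assumes the .tab files use abricate's usual tab-separated layout with a header line containing a GENE column; the function names and the in-memory example tables are made up for illustration.

```python
import csv
import io


def genes_in_abricate_table(handle):
    """Collect the GENE column from an abricate-style tab-separated table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["GENE"] for row in reader if row.get("GENE")}


def compare_gene_sets(mobile_genes, wgs_genes):
    """Separate genes hit on mobile elements from genes hit only elsewhere."""
    return {
        "mobile": sorted(mobile_genes),
        "non_mobile_only": sorted(wgs_genes - mobile_genes),
    }


# Tiny made-up tables standing in for results/wgs_abricate.tab
# and results/mobile_abricate.tab.
wgs_table = io.StringIO(
    "#FILE\tSEQUENCE\tSTART\tEND\tGENE\n"
    "wgs.fna\tc1\t1\t100\tblaTEM-1\n"
    "wgs.fna\tc2\t5\t90\ttet(A)\n"
)
mobile_table = io.StringIO(
    "#FILE\tSEQUENCE\tSTART\tEND\tGENE\n"
    "mobile.fna\tc1\t1\t100\tblaTEM-1\n"
)
summary = compare_gene_sets(
    genes_in_abricate_table(mobile_table),
    genes_in_abricate_table(wgs_table),
)
```

Comparing by gene name alone hides the case where the same gene occurs both on a mobile element and elsewhere in the genome; a coordinate-aware comparison would be needed to separate those copies.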

AntiSmash

FAQs

  • Q: It seems like a lot of computational time could be saved by simply referencing the coordinates of the mobile regions, rather than extracting them and running analyses on the subset. A: You're right! But lots of GWAS tools and others rely on fasta input, so it ends up being easier for downstream work. Room for improvement!
  • Q: My hard drive is full! Why would you do this to me? A: Most of these tools rely on their own databases, which are all included in the Docker images. There's no way around it -- that's a loooooot of data.

Note

This module was renamed from "mobilephone"; the old name was just too hard to google.

Citing

After each run, happie creates a "citing.txt" file with links to all the software happie uses. Some of those tools rely on additional references, so make sure to give credit to everyone involved!
