
dartseqmet's Introduction

License: GPL v3

Computational protocol for the analysis of DArTseqMet data.

This repository contains the source code needed to analyze DArTseqMet data. DArTseqMet is a restriction-enzyme-based genome reduction technique that identifies DNA methylation in a sample on a genome-wide scale.

The method shown here is described in the paper "A cost-effective approach to DNA methylation detection by Methyl Sensitive DArT sequencing". Later, the same approach was used to investigate DNA methylation in clones of Eucalyptus grandis grown in contrasting environments, as described in the paper "Patterns of DNA methylation changes in elite Eucalyptus clones across contrasting environments".

Installation

This computational protocol is designed to be executed using the Snakemake workflow management system.

Stepwise installation

A step-by-step installation of the major software components is given below.

The recommended method for installing Snakemake is by using Conda/Mamba, as shown below:

conda install -n base -c conda-forge mamba

Conda/mamba allows you to create different environments containing files, packages, and their dependencies that will not interact with other environments. Therefore, creating a new environment to contain the dependencies to execute this workflow is advantageous. For more information about conda and conda environments, please visit: https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html

Here, we create an environment named DArTseqMet and install Snakemake in it. Because the workflow uses bowtie2, which depends on Python 2.7, we also create a second environment for Snakemake to use. This avoids conflicts between software that relies on different versions of Python.

conda activate base
mamba create -c conda-forge -c bioconda -n DArTseqMet snakemake r-base -y

# Here, we create a new environment for bowtie2, then export it as a YAML file that can be referenced by the workflow rules that depend on it.
mamba create -c conda-forge -c bioconda -n bowtie2_env python=2.7 bowtie2

conda activate bowtie2_env
conda env export > bowtie2.yaml
conda deactivate
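
Before moving on, it can help to confirm that the exported file actually lists bowtie2. The sketch below assumes the usual `conda env export` layout, in which each dependency appears on its own line as `- name=version=build`. The helper name `env_file_lists` is hypothetical and not part of the workflow.

```shell
# Return success if the conda environment file $1 lists package $2
# as a dependency line of the form "  - name=version...".
env_file_lists() {
    grep -Eq "^[[:space:]]*-[[:space:]]*$2=" "$1"
}

# Only check when the export step above has produced the file.
if [ -f bowtie2.yaml ]; then
    env_file_lists bowtie2.yaml bowtie2 \
        || echo "bowtie2 is not listed in bowtie2.yaml" >&2
fi
```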

Next, we activate the DArTseqMet environment and install the other software required by the workflow.

conda activate DArTseqMet

## Installing Trimmomatic from the bioconda channel
conda install -c bioconda trimmomatic
## Installing samtools from the bioconda channel
conda install -c bioconda samtools
## Installing bedtools from the bioconda channel
conda install -c bioconda bedtools
## Installing subread from the bioconda channel
mamba install -c bioconda subread
## Installing fastqc from the bioconda channel
conda install -c bioconda fastqc
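
A quick way to confirm these installations is to check that each tool can be found on the PATH while the DArTseqMet environment is active. This is a minimal sketch: `missing_tools` is a hypothetical helper, and the executable names (in particular `featureCounts` for the subread package) are assumptions about what each package provides.

```shell
# Print the names of the given commands that are not found on PATH
# (space-separated); return non-zero if at least one is missing.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space before printing.
    printf '%s\n' "${missing# }"
    [ -z "$missing" ]
}

# Executable names assumed for the packages installed above.
missing_tools snakemake trimmomatic samtools bedtools featureCounts fastqc \
    || echo "Some tools are missing; re-check the installation steps." >&2
```

An empty line of output means every tool was found.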

We also need to install some R packages.

mamba install -c conda-forge r-docopt r-tidyverse r-data.table r-gdata r-gridextra r-essentials

Unfortunately, mamba does not work when installing Bioconductor packages. Therefore, we install the necessary Bioconductor packages directly in R using the command line below.

R -e "install.packages('BiocManager', repos='http://cran.us.r-project.org'); library('BiocManager'); BiocManager::install('DESeq2'); BiocManager::install('Biostrings'); BiocManager::install('edgeR'); BiocManager::install('VennDiagram')"

Executing the analysis

Adjusting the config.yaml file

Users need to edit the config.yaml file to specify sample names and other parameters required for the analyses. Note that some input files must be in a specific format. File formats and other restrictions are listed in config.yaml.

Executing the workflow

After adjusting the config.yaml file, the workflow is executed with a single command.

snakemake -p -c 7 --use-conda all

Note that some parameters are required:

  • "-c" controls the number of cores to use. This value must match the one specified in the configuration file (config.yaml).
  • "--use-conda" allows the workflow to use conda to build the environment for bowtie2.
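
Before starting a long run, it is often worth previewing the planned jobs with Snakemake's dry-run flag (`-n`). The sketch below wraps both steps in a hypothetical helper, `run_workflow`, which is not part of this repository. It assumes the core count passed in matches the value set in config.yaml.

```shell
# Run a dry run first; start the real run only if the dry run succeeds.
# $1: number of cores (assumed to match the value in config.yaml)
run_workflow() {
    cores="$1"
    # -n (--dry-run) prints the planned jobs without executing them.
    snakemake -n -p -c "$cores" --use-conda all || return 1
    # Same options without -n: execute the workflow.
    snakemake -p -c "$cores" --use-conda all
}
```

With the DArTseqMet environment active, `run_workflow 7` mirrors the command shown above, preceded by a preview of the jobs.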

License

Distributed under the GNU General Public License v3.0. See LICENSE for more information.

Contributors

wendelljpereira, gpappasunb
