jamp's Introduction

JAMP introduction

Just Another Metabarcoding Pipeline - Twitter: @VascoElbrecht

JAMP is a modular metabarcoding pipeline that integrates functions from VSEARCH, CUTADAPT and other programs. The pipeline runs as an R package and automatically generates the needed folders and summary statistics for each processing step, allowing you to troubleshoot and adjust settings as needed. Checking the data and statistics after each processing step is a key element of JAMP and is encouraged, as it also gives you a better understanding of your data and the bioinformatic process.

End of 2021 update: Currently updating the documentation and older functions for a more streamlined experience 😄 .

For a short tutorial on extracting haplotypes / ESVs from metabarcoding datasets, take a look at the denoising quick guide.

Installing JAMP

Please keep in mind that JAMP needs VSEARCH and Cutadapt installed to work properly. Mac or Linux based systems are therefore recommended. Windows is not officially supported, but you can install e.g. an Ubuntu shell on your Windows system!

To install JAMP locally:

# Install the dependencies needed for JAMP
install.packages(c("bold", "XML", "seqinr", "devtools", "fastqcr"), dependencies=TRUE)
# Load devtools and install package directly from GitHub
library("devtools")
install_github("VascoElbrecht/PrimerMiner", subdir="PrimerMiner")
install_github("tobiasgf/lulu")
install_github("VascoElbrecht/JAMP", subdir="JAMP")

You can also download the latest release of JAMP, extract it, and install it within R using install.packages("JAMP", repos = NULL, type="source")

Example of a system-wide installation on an Ubuntu or Debian server:

wget https://github.com/VascoElbrecht/JAMP/archive/v0.53.tar.gz
tar -xzf v0.53.tar.gz
cd JAMP-0.53
sudo R CMD INSTALL JAMP

Licence

JAMP is for non-profit and academic use only. If you wish to use any aspects of JAMP commercially, please kindly request permission from Vasco Elbrecht first. Thank you!

jamp's People

Contributors

ondrov, tristanlefebure, vascoelbrecht

jamp's Issues

Subsampling only works if the file contains enough reads

usearch -fastx_subsample "../D_Minmax/_data/GMP-04606)CCDB-S5-0053)CBGMB-00003_cut_minmax.fasta" -fastaout "_data/GMP-04606)CCDB-S5-0053)CBGMB-00003_cut_minmax_N50000.fasta" -sample_size 50000 -sizein -sizeout

Solution: count the reads first, then copy the file over unchanged if it has fewer reads than the requested subsample size.
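
A minimal sketch of that check in shell, assuming every read is one FASTA record whose header line starts with ">" and that size= abundance annotations can be ignored when counting. The function names count_fasta_reads and subsample_or_copy are hypothetical and not part of JAMP or usearch.

```shell
# Hypothetical helper: count FASTA records by counting header lines.
# Assumption: one read per ">" header; size= annotations are ignored.
count_fasta_reads() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function from failing on empty files
    grep -c '^>' "$1" || true
}

# Copy the input unchanged when it has fewer reads than the sample size.
# Otherwise print "subsample" so the caller knows to run the subsampler.
subsample_or_copy() {
    in_file=$1
    out_file=$2
    sample_size=$3
    n_reads=$(count_fasta_reads "$in_file")
    if [ "$n_reads" -lt "$sample_size" ]; then
        cp "$in_file" "$out_file"
        echo "copied"
    else
        echo "subsample"
    fi
}
```

When the function prints "subsample", the usearch -fastx_subsample call shown in this issue can be run as before. For dereplicated files read with -sizein, the header count underestimates the true read number, so the threshold would need to sum the size= values instead.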

JAMP won't work with up to date cutadapt

Hi Vasco,
JAMP does not seem to work with the latest version of cutadapt.
After downgrading to 2.5 everything works as usual.
Maybe fix this for performance?
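
Until the incompatibility is fixed, a wrapper script could warn when the installed cutadapt is newer than 2.5, the only version this report confirms as working. A rough sketch, assuming sort -V (version sort, as provided by GNU coreutils) is available; the function names version_newer_than and check_cutadapt_version are hypothetical and not part of JAMP.

```shell
# Hypothetical helper: succeeds when version $1 is strictly newer than $2.
# Relies on "sort -V", which orders version strings such as 2.5 < 2.10 < 4.9.
version_newer_than() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Print a warning for versions newer than the one reported to work (2.5).
check_cutadapt_version() {
    if version_newer_than "$1" "2.5"; then
        echo "warning: cutadapt $1 is newer than 2.5, the version reported to work"
    else
        echo "ok"
    fi
}

# Example call (needs cutadapt on the PATH):
# check_cutadapt_version "$(cutadapt --version)"
```

The report does not say which release first broke JAMP, so the threshold of 2.5 is only a conservative guess based on the version that was downgraded to.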

Plotting in Cutadapt function breaks the code even if LDist is set to False

Hi Vasco,
In some rare cases the Cutadapt function runs into an infinite xlim error, even though the data handling completed successfully. I worked around it by commenting out the plotting part in the source code and writing a placeholder to the log.
Suggested solution: if LDist is set to False, skip all plots, not just the length distribution plots. Or, even better, add another argument to separate the two plotting options.
Best, Dominik

installation

I am trying to install JAMP in a virtual linux machine and I get this error:

  • installing to library ‘/usr/local/lib/R/site-library’
    ERROR: dependency ‘seqinr’ is not available for package ‘JAMP’
  • removing ‘/usr/local/lib/R/site-library/JAMP’

How can I fix it?

include rarefaction function!

It's better to rarefy all samples to the same sequencing depth right before clustering!

Just a small detail, but let's try to get this perfect! = )

haplotyping issue with OTU subsetting

Making folders is wrong (?) for some reason

Filtering 5031 reads with min max 217 bp: keep 4270 (84.87%)
Filtering 5635 reads with min max 217 bp: keep 3197 (56.73%)

152 dereplicated files where merged into file:
"_data/3_OTU_clustering/A_all_files_united.fasta"
Total number of sequences (not dereplicated): 534201

United sequences are dereplicated + size filtered into a total of 95693 unique sequences.
File prepared for OTU clustering: B_all_derep.fasta

Clustering reads from
"B_all_derep.fasta" 
otu_radius_pct = 3
strand = plus
Chimeras discarded: 206
OTUs written: 1149 -> file "C_OTUs.fasta"

read renamed! ame as in "B_all_derep.fasta"
Reads remapped!
Subsetting OTUs with 0.1 % anundance; Keeping 13 OTUs
Error in file(file, ifelse(append, "a", "w")) : 
  cannot open the connection
In addition: Warning messages:
1: In dir.create(temp_foldername) :
  cannot create dir '_data/6_haplotypes/_data/5_mapp/K01_116_BR2B_BF24_merged_cut_trunc_minmax_RC_maxee.txt_9.txt', reason 'No such file or directory'
2: In file(file, ifelse(append, "a", "w")) :
  cannot open file '_data/6_haplotypes/_data/5_mapp/K01_116_BR2B_BF24_merged_cut_trunc_minmax_RC_maxee.txt_9.txt/OTU_13/OTU_13_tab.csv': No such file or directory
> 

cluster ESV results!

Cluster ESVs to reduce table size and get better results than OTU clustering alone = )
