Code for evaluating mokapot

This repository contains the code for reproducing the results from "mokapot: Fast and flexible semi-supervised learning for peptide detection."

Reproducing the manuscript

The provided code can fully reproduce the figures and analyses presented in the manuscript, provided that the necessary software is installed and the data are present. Note that some analyses (such as the benchmarking experiments) will give different results depending on the hardware they are run on.

Requirements

Operating System: Our code was written for CentOS 7 Linux machines, but should be compatible with other Linux distributions as well. The code is unlikely to work on Windows and may need slight changes for macOS.

Hardware: To most accurately reproduce our results, use a 12-core machine with a minimum of 32 GB of memory.
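
As a quick way to compare a machine against this recommendation, the following sketch reports the core count and total memory. It assumes a Linux system with `nproc` and `/proc/meminfo` available, and it is not part of the repository. Note that a machine sold as 32 GB typically reports slightly less than 32 GiB here.

```shell
#!/usr/bin/env bash
# Report CPU cores and total memory (Linux only; reads /proc/meminfo).
cores=$(nproc)
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_gb=$(( mem_kb / 1024 / 1024 ))   # integer GiB, rounded down

echo "CPU cores: ${cores}"
echo "Memory:    ${mem_gb} GiB"
```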

Installed Software: The analysis scripts are written for Python 3.7+. Many of the software tools are installed automatically when executing the analysis; however, some need to be installed beforehand, including the tools verified below: Python, conda, Crux, and MSFragger (which is run with Java).

You can then check that these have been successfully installed and configured with the following commands (the example output is from my machine, but yours may be slightly different).

Verify Python is installed and configured:

$ python3 --version
Python 3.8.6

Verify conda is installed and configured:

$ conda --version
conda 4.9.2

Verify Crux is installed and configured:

$ crux version
INFO: Beginning version.
====================
Crux version 3.2-9d35092f
====================
Proteowizard version 3.0.20213
====================
Percolator version 3.05.nightly-1-e16f49a-dirty, Build Date Jul 30 2020 22:14:27
Copyright (c) 2006-9 University of Washington. All rights reserved.
Written by Lukas Käll ([email protected]) in the
Department of Genome Sciences at the University of Washington.
====================
Comet version 2019.0X rev. X
====================
Boost version 1_67
====================
INFO: Elapsed time: 0.000323 s
INFO: Finished crux version.
INFO: Return Code:0

Verify that MSFragger is installed and configured (your path may be different):

$ java -jar ~/bin/MSFragger-3.1.1/MSFragger-3.1.1.jar --version
MSFragger version MSFragger-3.1.1
Batmass-IO version 1.19.5
timsdata library version timsdata-2-7-0
(c) University of Michigan
RawFileReader reading tool. Copyright (c) 2016 by Thermo Fisher Scientific, Inc. All rights reserved.
System OS: Linux, Architecture: amd64
Java Info: 14.0.2, OpenJDK 64-Bit Server VM, Red Hat, Inc.

Finally, you'll need to specify the path to the MSFragger jar file:

export MSFRAGGER=~/bin/MSFragger-3.1.1/MSFragger-3.1.1.jar
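
The individual checks above can also be run in one pass. The sketch below is not part of the repository. `missing_tools` is a hypothetical helper name, and the tool list is an assumption based on the commands used in this README.

```shell
#!/usr/bin/env bash
# Print each of the given commands that cannot be found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tool list assumed from the verification steps in this README.
missing=$(missing_tools python3 conda crux java make)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
fi

# MSFRAGGER should point to an existing jar file.
if [ ! -f "${MSFRAGGER:-}" ]; then
    echo "MSFRAGGER is unset or does not point to a file"
fi
```

If the script prints nothing, every tool was found and the MSFRAGGER variable points to an existing file. It does not check versions, so the individual commands above are still useful.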

Running the Analyses

Running the analyses is easy, but may take days. First, use GNU make to install the prerequisite packages into a new conda environment:

$ make install && conda activate mokapot

Then the analyses can be run simply with:

$ make
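
Because the run can take days, it may help to start it in the background so that it survives a dropped terminal session and keeps a log. The following is only a sketch of one way to do that with `nohup`. The helper name `run_in_background` and the log file name `make.log` are hypothetical and are not part of the repository.

```shell
#!/usr/bin/env bash
# Start a command in the background, immune to hangups, with stdout and
# stderr captured in a log file. Prints the PID of the background job.
run_in_background() {
    local log=$1
    shift
    nohup "$@" > "$log" 2>&1 &
    echo $!
}

# Hypothetical usage from the repository root, with the conda
# environment already activated:
#   pid=$(run_in_background make.log make)
#   tail -f make.log
```

A terminal multiplexer such as tmux or screen would serve the same purpose.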

Results

Once complete, all of the figures will be present in the figures directory.

Questions?

If you have problems or questions, feel free to ask Will Fondrie ([email protected]).
