
#GPTMD

Software to augment a UniProt XML database with PTMs discovered using Morpheus

##General Overview

G-PTMD is a tool that expands the scope of peptide identification to include specific post-translational modifications (PTMs). Currently, identifying peptides with post-translational modifications relies either on variably toggling modifications on every residue that can be modified, or on the specific modification being documented in a database. The first method is computationally expensive and wasteful, while databases are often incomplete. G-PTMD addresses the weaknesses of both methods to improve protein identification results.

The purpose of G-PTMD is to build a new proteome reference .xml database by annotating an existing reference database using a set of peptide spectral matches (PSMs). The PSMs are obtained by running a Morpheus search on a .raw or .mzml file of the experimentally obtained mass spectrometry data. The program identifies the PSMs whose mass shift is indicative of certain post-translational modifications. For example, if the peptide “PEPTIDE” was identified in the PSMs file with a mass shift of 79.966 Da, the corresponding protein entry in the reference .xml database would be given a new feature consisting of a phosphorylated threonine at the appropriate position, because 79.966 Da is the change in mass of a peptide when one residue is phosphorylated.
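
The matching step in the example above can be sketched as follows. This is a minimal illustration, not the G-PTMD source: the table `PTM_SHIFTS`, the function `candidate_ptms`, and the 0.01 Da tolerance are all assumptions made for this example, and the real list of PTMs comes from the PTM database file supplied by the user.

```python
# Hypothetical lookup table: PTM name -> (monoisotopic mass shift in Da, residues it can modify).
# Illustrative only; G-PTMD reads its PTM list from a user-supplied file.
PTM_SHIFTS = {
    "Phosphorylation": (79.966, "STY"),
    "Acetylation": (42.011, "K"),
}


def candidate_ptms(peptide, mass_shift, tolerance=0.01):
    """Return (ptm_name, position, residue) for every residue that could carry the shift.

    Positions are 1-based within the peptide. The tolerance is an assumed value.
    """
    hits = []
    for name, (shift, residues) in sorted(PTM_SHIFTS.items()):
        if abs(mass_shift - shift) <= tolerance:
            for index, residue in enumerate(peptide):
                if residue in residues:
                    hits.append((name, index + 1, residue))
    return hits


# The only phosphorylatable residue in "PEPTIDE" is the threonine at position 4.
matches = candidate_ptms("PEPTIDE", 79.966)
print(matches)
```

A peptide with several S, T, or Y residues would yield one candidate per residue; how such ambiguity is resolved is not covered by this sketch.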

Now, when the original .raw or .mzml file is run against this new database in an open search, the search software will be able to recognize and identify the phosphorylation on the threonine if it is present in the sample. Phosphorylation is just one example, and this technique extends to other post-translational modifications. The user can add modifications to, or remove them from, the list of post-translational modifications as needed. This effectively allows for variable post-translational modifications at targeted positions in the protein, leading to better search results on the second pass without incurring the huge data costs associated with blindly adding variable modifications.
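
To make the annotation step concrete, here is a rough sketch of adding a modified-residue feature to a protein entry. It assumes a heavily simplified, namespace-free entry with a made-up accession (`P00000`) and sequence; real UniProt XML uses an XML namespace and many more elements. The helper names `protein_position` and `add_modified_residue` are hypothetical, and the standard-library `xml.etree.ElementTree` is used here in place of lxml so the example runs without extra packages.

```python
import xml.etree.ElementTree as ET


def protein_position(protein_sequence, peptide, peptide_position):
    """Map a 1-based position within the peptide to a 1-based position in the protein."""
    start = protein_sequence.find(peptide)  # 0-based index of the first occurrence, -1 if absent
    if start < 0:
        raise ValueError("peptide not found in protein sequence")
    return start + peptide_position


def add_modified_residue(entry, position, description):
    """Insert a 'modified residue' feature just before the entry's <sequence> element."""
    feature = ET.Element("feature", {"type": "modified residue", "description": description})
    location = ET.SubElement(feature, "location")
    ET.SubElement(location, "position", {"position": str(position)})
    children = list(entry)
    sequence = entry.find("sequence")
    index = children.index(sequence) if sequence is not None else len(children)
    entry.insert(index, feature)
    return feature


# Simplified, hypothetical entry (no namespace, invented accession and sequence).
entry = ET.fromstring(
    "<entry><accession>P00000</accession><sequence>MKPEPTIDEK</sequence></entry>"
)
site = protein_position(entry.findtext("sequence"), "PEPTIDE", 4)
add_modified_residue(entry, site, "Phosphothreonine")
print(ET.tostring(entry).decode())
```

The sketch maps the peptide to its first occurrence in the protein only; a peptide that occurs more than once would need additional handling.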

##General Requirements

The following files must be present in the folder with the executable. If not, they are automatically downloaded (to update a file to a newer version, delete it, and the application will download a new version). This is the only network usage by the application.

##Operating System Requirements and Usage

###Windows Perl Version

####System

####Usage

###Windows C# Version

####System

###Linux Python Version

####System

  • 8 GB of RAM is recommended
  • Python v2.7.10 (64 bit). See https://www.python.org/downloads/ for installation instructions. This includes the "pip" package manager.
  • This program uses lxml, a package for interpreting XML databases. lxml can be installed using the command: pip install lxml. See http://lxml.de for additional installation instructions.
  • If you encounter errors installing lxml, we recommend trying an alternate package manager, such as Canopy.

####Usage

Options:

-h, --help			show this help message and exit

-x REFERENCE_XML, --reference_xml=REFERENCE_XML
					The reference UniProt-XML file.  New PTM features are
					appended to this database to generate the output
					UniProt-XML protein database.
					
-t PTM_DATABASE, --ptm_database=PTM_DATABASE
					Slightly modified database of UniProt PTMs.  This file
					determines which types of PTMs are included.
					
-s PSMS, --psms=PSMS  Peptide spectral matches tab-separated.  This file is
					from first-pass open search and contains the mass
					shifts that correspond to PTMs.
					
-o OUTPUT, --output=OUTPUT
					Output file path.  Outputs a UniProt-XML file.

Example Command Line:

\Dir>python gptmd.py -x ../uniprot.xml -t ../sub_ptmlist_regular.txt -s ../PSMs.tsv -o ../test_output.xml
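
For readers who want to see how the options listed above fit together, the following sketch declares the same four options with the standard-library `optparse` module and parses the arguments from the example command line. It is an illustration reconstructed from the help text, not the actual contents of gptmd.py; the function name `build_parser` and the shortened help strings are assumptions.

```python
from optparse import OptionParser


def build_parser():
    """Declare the four options shown in the help text (hypothetical reconstruction)."""
    parser = OptionParser()
    parser.add_option("-x", "--reference_xml",
                      help="The reference UniProt-XML file.")
    parser.add_option("-t", "--ptm_database",
                      help="Slightly modified database of UniProt PTMs.")
    parser.add_option("-s", "--psms",
                      help="Peptide spectral matches, tab-separated.")
    parser.add_option("-o", "--output",
                      help="Output file path. Outputs a UniProt-XML file.")
    return parser


# Parse the arguments from the example command line instead of sys.argv.
options, leftover = build_parser().parse_args([
    "-x", "../uniprot.xml",
    "-t", "../sub_ptmlist_regular.txt",
    "-s", "../PSMs.tsv",
    "-o", "../test_output.xml",
])
print(options.reference_xml, options.ptm_database, options.psms, options.output)
```

With `optparse`, the attribute name and the metavariable shown in the help output (for example `REFERENCE_XML`) are derived from the long option name, which is consistent with the help text above.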

###Relevant Manuscripts

##License

The software is currently released under the GNU GPLv3.

Copyright 2016 Lloyd M. Smith Group.

