mergem's Introduction

mergem

mergem is a python package and command-line tool for merging, comparing, and translating genome-scale metabolic models.


Installation

Use pip to install the latest release:

pip install mergem

Usage

For detailed usage instructions, please refer to the documentation.

Command-line usage

Command-line options can be viewed using the "--help" flag, as shown below:

> mergem --help
Usage: mergem [OPTIONS] [INPUT_FILENAMES]...

mergem takes genome-scale metabolic models as input, merges them into a single model and saves the merged model as .xml. Users can optionally select the objective, provide an output filename for the merged model, and translate the models to a different namespace.

Lobo Lab (https://lobolab.umbc.edu)

Options:
-obj TEXT  Set objective: 'merge' all objectives (default) or 1, 2, 3... (objective from one of the input models)  
-o TEXT    Save model as (filename with format .xml, .sbml, etc.)  
-v         Print merging statistics
-up        Update ID mapping table
-s         Save ID mapping table as CSV
-e         Uses exact stoichiometry when merging reactions
-p         Consider protonation when merging reactions
-a         Extend annotations with mergem database of metabolites and reactions
-t TEXT    Translate all metabolite and reaction IDs to a target namespace (chebi, metacyc, kegg, reactome, metanetx, hmdb, biocyc, bigg, seed, sabiork, or rhea)
--version  Show the version and exit.
--help     Show this message and exit.

To merge two models and take the objective of the merged model from the first model, pass the input filenames as positional arguments:

mergem model1.xml model2.xml -obj 1

To print merging statistics, append the "-v" flag:

mergem model1.xml model2.xml -obj 1 -v

Python usage

To use mergem within a python script, simply import the package with:

import mergem

For merging or processing one, two, or more models, provide a list of models to the merge function:

results = mergem.merge(input_models, set_objective='merge', exact_sto=False, use_prot=False, extend_annot=False, trans_to_db=None)
merged_model = results['merged_model']
jacc_matrix = results['jacc_matrix']
num_met_merged = results['num_met_merged']
num_reac_merged = results['num_reac_merged']
met_sources = results['met_sources']
reac_sources = results['reac_sources']

  • input_models: a list of one or more COBRApy model objects or strings specifying file names.

  • set_objective: specifies whether the objective functions are merged ('merge') or copied from a single model (by giving the index of that model: '1', '2', '3', etc.).

  • exact_sto: use exact stoichiometry when merging reactions.

  • use_prot: consider hydrogen and proton metabolites when merging reactions.

  • extend_annot: add additional metabolite and reaction annotations from mergem dictionaries.

  • trans_to_db: translate metabolite and reaction IDs to a target database (chebi, metacyc, kegg, reactome, metanetx, hmdb, biocyc, bigg, seed, sabiork, or rhea).

  • results: a dictionary with all the results, including:

      • merged_model: the merged model.

      • jacc_matrix: metabolite and reaction Jaccard distances.

      • num_met_merged: number of metabolites merged.

      • num_reac_merged: number of reactions merged.

      • met_sources: dictionary mapping each metabolite ID in the merged model to the corresponding metabolite IDs from each of the input models.

      • reac_sources: dictionary mapping each reaction ID in the merged model to the corresponding reaction IDs from each of the input models.

The merge function returns a dictionary of results including the merged model, the metabolite and reaction Jaccard distance matrix between models, and the metabolite and reaction model sources.
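
As a rough illustration of what a Jaccard distance between two models measures, the sketch below applies the standard definition (one minus the size of the intersection divided by the size of the union) to two sets of IDs. This README does not show how mergem builds jacc_matrix internally, so the helper jaccard_distance and the example IDs are assumptions made purely for illustration.

```python
def jaccard_distance(ids_a, ids_b):
    """Return 1 - |A and B| / |A or B| for two collections of IDs."""
    a, b = set(ids_a), set(ids_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets are identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical metabolite IDs for two small models (not taken from mergem).
model1_mets = ["glc__D_c", "atp_c", "adp_c", "h2o_c"]
model2_mets = ["glc__D_c", "atp_c", "pi_c"]

# 2 shared IDs out of 5 distinct IDs overall, so the distance is 1 - 2/5.
distance = jaccard_distance(model1_mets, model2_mets)
print(round(distance, 3))
```

A distance of 0 means the two ID sets are identical and a distance of 1 means they share nothing.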

The following functions can also be imported from mergem:

from mergem import translate, load_model, save_model, map_localization, map_metabolite_univ_id, map_reaction_univ_id, get_metabolite_properties, get_reaction_properties, update_id_mapper
  • translate(input_model, trans_to_db) translates a model to another target database specified in trans_to_db.
  • load_model(filename) loads a model from the given filename/path.
  • save_model(cobra_model, file_name) takes a cobra model as input and exports it as file file_name.
  • map_localization(id_or_model_localization) converts localization suffixes into common notation.
  • map_metabolite_univ_id(met_id) maps a metabolite ID to its metabolite universal ID.
  • map_reaction_univ_id(reac_id) maps a reaction ID to its reaction universal ID.
  • get_metabolite_properties(met_univ_id) retrieves the properties of a metabolite using its universal ID.
  • get_reaction_properties(reac_univ_id) retrieves the properties of a reaction using its universal ID.
  • update_id_mapper(delete_database_files) updates and builds the mergem database. It downloads the latest source database files, merges the identifiers based on common properties, and saves the mapping tables and information internally. This process can take several hours. The parameter specifies whether the downloaded intermediate database files are deleted after the update (this saves disk space, but the next update will take longer; the default is True).
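
The functions above can be combined to translate a single model without merging. The following is a minimal sketch, assuming that translate returns the translated model object (the README does not state the return value explicitly). The helper translate_file and the file names are hypothetical; only load_model, translate, and save_model come from mergem, called with the signatures listed above.

```python
# Namespaces listed in this README for trans_to_db and the -t option.
SUPPORTED_NAMESPACES = {
    "chebi", "metacyc", "kegg", "reactome", "metanetx", "hmdb",
    "biocyc", "bigg", "seed", "sabiork", "rhea",
}


def translate_file(in_path, out_path, target_db):
    """Load a model, translate its IDs to target_db, and save the result.

    translate_file is a hypothetical helper, not part of mergem.
    """
    if target_db not in SUPPORTED_NAMESPACES:
        raise ValueError(f"unsupported namespace: {target_db!r}")

    # Imported lazily so the namespace check runs even without mergem installed.
    from mergem import load_model, save_model, translate

    model = load_model(in_path)
    # Assumption: translate returns the translated model.
    translated = translate(model, target_db)
    save_model(translated, out_path)
    return translated
```

For example, translate_file("model1.xml", "model1_bigg.xml", "bigg") would write a BiGG-namespace copy of model1.xml, provided mergem is installed and the input file exists.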

Citation

Please cite mergem using:

mergem: merging, comparing, and translating genome-scale metabolic models using universal identifiers
A. Hari, A. Zarrabi, D. Lobo
NAR Genomics and Bioinformatics, 6(1), lqae010, 2024


Acknowledgements

This package was developed at The Lobo Lab, University of Maryland, Baltimore County.


License

This package is under GNU GENERAL PUBLIC LICENSE. The package is free for use without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software. Permission is granted to anyone to use this software for any purpose, subject to the following restrictions:

  1. The origin of this software and database must not be misrepresented; you must not claim that you wrote the original software.
  2. If you use this software and/or database in a work (any production in the scientific, literary, and artistic domain), an acknowledgment and citation (see publication above) in the work is required.
  3. This notice may not be removed or altered from any distribution.

mergem's People

Contributors

archh1, arveenz, dnlobo

mergem's Issues

cobra version

Hi there and thanks for the great resource!

I believe you need to change the cobra version in your requirements.txt.

You are using the cobra.core.Gene.GPR class and I think it was first introduced in cobra==0.24.0.

Here is the initial commit of the class.

mergem translate tool does not transfer new ID's to model

Hello! I am trying to translate a ModelSEED model into a model that uses BiGG IDs. I have tried both the command line and a Python script.

import mergem
from mergem import translate, load_model, save_model, map_localization, map_metabolite_univ_id, map_reaction_univ_id, get_metabolite_properties, get_reaction_properties, update_id_mapper
   
translated_model = mergem.translate(model, 'bigg')

and:

import mergem
from mergem import translate, load_model, save_model, map_localization, map_metabolite_univ_id, map_reaction_univ_id, get_metabolite_properties, get_reaction_properties, update_id_mapper
    
translated_model = mergem.translate(model, trans_to_db='bigg')

Neither version returns a model object with translated IDs.

On the command line, it looks like the translations are taking place, but the final model that is saved as "translated" does not contain BiGG IDs.

(cobrapy_env) nfarooqi@M4819 gsm % mergem lhelveticus.sbml -t bigg
mergem, v0.26.2
Model does not contain SBML fbc package information.
SBML package 'layout' not supported by cobrapy, information is not parsed
SBML package 'render' not supported by cobrapy, information is not parsed
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00443_c0 "ABEE_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd02920_c0 "2_Amino_4_hydroxy_6_hydroxymethyl_7_8_dihydropteridinediphosphate_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00012_c0 "PPi_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00067_c0 "H_plus__c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00683_c0 "Dihydropteroate_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00114_c0 "IMP_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00103_c0 "PRPP_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00226_c0 "HYXN_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00001_c0 "H2O_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd01759_c0 "N_Acyl_L_aspartate_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00041_c0 "L_Aspartate_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00049_c0 "Carboxylic_acid_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00002_c0 "ATP_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00008_c0 "ADP_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00009_c0 "Phosphate_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00947_c0 "LacCer_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00108_c0 "Galactose_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00878_c0 "Glucocerebroside_c0">
Use of the species charge attribute is discouraged, use fbc:charge instead: <Species M_cpd00022_c0 "Acetyl_CoA_c0">
cont....

Any ideas on how to resolve this? Or perhaps I'm using the tool incorrectly?

IndexError: list index out of range

Hi, I'm trying to merge about 200 models into one

First I tried with only 2 and it raised this error:

merge_results = mergem.merge(["test/Abiotrophia_defectiva_ATCC_49176.xml","test/Acidaminococcus_fermentans_DSM_20731.xml"], 1)


IndexError Traceback (most recent call last)
/tmp/ipykernel_20689/2630019269.py in
----> 1 merge_results = mergem.merge(["test/Abiotrophia_defectiva_ATCC_49176.xml", "test/Acidaminococcus_fermentans_DSM_20731.xml"], 1)

~/anaconda3/lib/python3.9/site-packages/mergem/__mergeModels.py in merge(input_models, set_objective)
54 merged_model_metabolites.append(metabolite)
55 old_met_id = metabolite.id
---> 56 new_met_id = map_metabolite_to_mergem_id(metabolite)
57
58 if (new_met_id is None) or (new_met_id in met_sources_dict):

~/anaconda3/lib/python3.9/site-packages/mergem/__mergeModels.py in map_metabolite_to_mergem_id(metabolite)
234 else:
235 split = met_id.rsplit("_", 1)
--> 236 met_compartment = __modelHandling.map_localization(split[1])
237 if met_compartment == '':
238 met_compartment = split[1]

IndexError: list index out of range

==========================================
A search suggested that this error is caused by the models themselves. Is there any way to fix this for all models?
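
One reading of the traceback above: met_id.rsplit("_", 1) returns a single-element list when a metabolite ID contains no underscore, so split[1] raises the IndexError. The sketch below is not mergem code. It reproduces that splitting step with a guard and lists the IDs that lack a compartment suffix. The helper names and example IDs are hypothetical.

```python
def split_compartment(met_id):
    """Split 'name_compartment' into (name, compartment).

    The compartment is None when the ID contains no underscore.
    """
    parts = met_id.rsplit("_", 1)
    if len(parts) < 2:
        return met_id, None
    return parts[0], parts[1]


def ids_without_suffix(met_ids):
    """Return the IDs for which no compartment suffix could be split off."""
    return [m for m in met_ids if split_compartment(m)[1] is None]


# Hypothetical IDs: the last one has no underscore and would trigger the error.
example_ids = ["glc__D_c", "atp_c", "h2o"]
print(ids_without_suffix(example_ids))
```

With a COBRApy model, the ID list could be gathered as [m.id for m in model.metabolites] before calling ids_without_suffix.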

Enable only mapping of IDs

Hey, this looks great!
The mapping you do here seems very thorough, but would it be possible to map the namespace of one model to a different namespace without actually merging models?
