
sysbiochalmers / raven

The RAVEN Toolbox for genome scale model reconstruction, curation and analysis.

Home Page: http://sysbiochalmers.github.io/RAVEN/

License: Other

MATLAB 80.00% Perl 7.06% Shell 8.83% Roff 3.27% PHP 0.34% Smarty 0.45% CSS 0.05%
genome-scale-models constraint-based-modeling metabolic-reconstruction gap-filling human-metabolism metabolic-models metabolic-engineering systems-biology sysbio

raven's Introduction


The RAVEN (Reconstruction, Analysis and Visualization of Metabolic Networks) Toolbox 2 is a software suite for MATLAB that allows semi-automated reconstruction of genome-scale models (GEMs). It makes use of published models and/or the KEGG and MetaCyc databases, coupled with extensive gap-filling and quality control features. The software suite also contains methods for visualizing simulation results and omics data, as well as a range of methods for performing simulations and analyzing the results. The software is a useful tool for system-wide data analysis in a metabolic context and for streamlined reconstruction of metabolic networks based on protein homology.

Documentation

The information about downloading, installing and developing RAVEN is included in the Wiki. The source code documentation is also available online.

Cite Us

If you use RAVEN 2 in your scientific work, please cite:

Wang H, Marcišauskas S, Sánchez BJ, Domenzain I, Hermansson D, Agren R, Nielsen J, Kerkhoven EJ. (2018) RAVEN 2.0: A versatile toolbox for metabolic network reconstruction and a case study on Streptomyces coelicolor. PLoS Comput Biol 14(10): e1006541. doi:10.1371/journal.pcbi.1006541.

Starting with RAVEN v2.3.1, all releases are also archived in Zenodo, so that you can cite the specific version of RAVEN that you used in your study.

If you use ftINIT in your scientific work, please cite:

Gustafsson J, Anton M, Roshanzamir F, Jörnsten R, Kerkhoven EJ, Robinson JL, Nielsen J. (2023) Generation and analysis of context-specific genome-scale metabolic models derived from single-cell RNA-Seq data. Proc Natl Acad Sci 120(6): e2217868120. doi:10.1073/pnas.2217868120

For crediting supporting work, please cite doi:10.1002/msb.145122 (tInit); doi:10.1371/journal.pcbi.1000859 (randomsampling). For crediting RAVEN 1, cite doi:10.1371/journal.pcbi.1002980. For more details, see wiki#cite-us.

Contact Us

For support, technical issues, bug reports etc., please join the chat at https://gitter.im/SysBioChalmers/RAVEN. For other issues, please contact Eduard Kerkhoven.

More from SysBio Chalmers

For more systems biology related software and recently published genome-scale models from the Systems and Synthetic Biology group at Chalmers University of Technology, please visit the GitHub page. For more information and publications by the Systems and Synthetic Biology group, please visit SysBio.

raven's People

Contributors

ae-tafur, benjasanchez, ckitti, danieljcook, edkerk, gitter-badger, haowang-bioinfo, ivandomenzain, johan-gson, jonathanrob, mihai-sysbio, rasmusagren, simas232, sysmedicine, tpfau, varemo, zhxiaokang

raven's Issues

Implement open-source solver software

The RAVEN toolbox should be compatible with at least one open-source solver. This would enable automatic installation of the toolbox.

format of model.rxnConfidenceScores

Cobra specifies that model.rxnConfidenceScores should be a numeric array. This makes sense.

For Raven, ravenCobraWrapper converts this to a cell array (of doubles), while checkModelStruct suggests that model.rxnConfidenceScores should be an array of strings.

Is there an explicit reason why Raven specifies that rxnConfidenceScores should be a cell array? If so, should it then be numeric or strings? If no explicit reason, then follow Cobra specification.
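If the COBRA convention were followed, existing cell-array scores would need converting to a numeric vector. The following is only a minimal sketch of such a conversion, assuming the scores may be stored as doubles, numeric strings or empty entries; the sample values are hypothetical and this is not RAVEN's own code.

```matlab
% Hypothetical input: confidence scores stored as a cell array that mixes
% doubles, numeric strings and empty entries.
scores = {2; '3'; []; 0; ''};

numScores = nan(numel(scores), 1);   % NaN marks a missing score
for i = 1:numel(scores)
    s = scores{i};
    if ischar(s)
        s = str2double(s);           % non-numeric or empty text becomes NaN
    end
    if isnumeric(s) && isscalar(s)
        numScores(i) = s;
    end
end
disp(numScores')
```

Using NaN for missing scores is a design choice of this sketch; a model format might prefer 0 or another sentinel.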

feat: I/O of SBO terms from SBML file

Description of the issue:

  • RAVEN cannot import or export SBO terms that are specified in SBML files.

Reproducing this issue:

SBO terms are included in the SBML file as follows (similar for metabolites and reactions):

<species boundaryCondition="false" constant="false" ...
   hasOnlySubstanceUnits="false" id="M_s_0001" name="(1-3)-beta-D-glucan" ...
   metaid="M_s_0001" sboTerm="SBO:0000247" compartment="ce" ...
   fbc:charge="0" fbc:chemicalFormula="C6H10O5">

When loaded in Matlab using TranslateSBML, SBO terms are easily accessible at (similar for metabolites and reactions):

modelSBML.species(:).sboTerm

Parsing this in exportModel and importModel should be relatively straightforward, and the terms should be stored in the metMiriams and rxnMiriams structures.

Note that ravenCobraWrapper does already have the option to parse SBO terms from the MIRIAMS structures to their COBRA counterpart fields (metSBOTerms).
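As a rough illustration of the import direction, the sketch below copies SBO terms from a species structure into MIRIAM-style structures. It assumes that the parsed structure exposes sboTerm as an integer with -1 meaning "not set", and that the MIRIAM name is 'sbo'; both are assumptions, and the species array here is a hand-made stand-in rather than real TranslateSBML output.

```matlab
% Hand-made stand-in for modelSBML.species (assumption: sboTerm is an
% integer, and -1 means that no SBO term was given).
species = struct('id', {'M_s_0001', 'M_s_0002'}, 'sboTerm', {247, -1});

metMiriams = cell(numel(species), 1);
for i = 1:numel(species)
    term = species(i).sboTerm;
    if isnumeric(term) && isscalar(term) && term >= 0
        % 'sbo' as MIRIAM name is an assumption of this sketch
        metMiriams{i} = struct('name',  {{'sbo'}}, ...
                               'value', {{sprintf('SBO:%07d', term)}});
    end
end
disp(metMiriams{1}.value{1})
```

A real implementation would append to any MIRIAM entries that already exist for the metabolite instead of overwriting them.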

System information

  1. RAVEN version: all
  2. Operating system: all

ravenToCobraModel.m

Hi there,

Could it be that line 29,
fields = rmfield(rModel,fieldnames(setdiff(fieldnames(rModel), f.optionalEquiv)));
should instead be
fields = rmfield(rModel,(setdiff(fieldnames(rModel), f.optionalEquiv)));
?
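The difference between the two variants can be checked in isolation: setdiff on two cell arrays of names already returns a cell array, which rmfield accepts directly, so no further fieldnames call is needed. A minimal sketch with a hypothetical model structure, where optionalEquiv stands in for f.optionalEquiv:

```matlab
% Hypothetical miniature model; optionalEquiv stands in for f.optionalEquiv.
rModel = struct('rxns', {{'r1'}}, 'rxnMiriams', {{[]}}, 'annotation', 'x');
optionalEquiv = {'annotation'};

% setdiff returns the names of the fields that are NOT in optionalEquiv
toRemove = setdiff(fieldnames(rModel), optionalEquiv);

% rmfield accepts a cell array of field names
fields = rmfield(rModel, toRemove);
disp(fieldnames(fields))
```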

Flux direction of mean flux, in random sampling

Running the randomSampling.m script on a model with all reaction lower bounds set to 0 and all reactions non-reversible still yielded negative mean fluxes. Commenting out ("muting") lines 59-62 in the script seems to solve the problem completely.

Excel I/O problems using MATLAB 2017b and later

Edit: a workaround for this problem is provided, see: #55 (comment)

Attempting to import an Excel model (.xlsx) using the command "importExcelModel" results in the following error:

Error using importExcelModel (line 140)
Java exception occurred:
java.lang.NoClassDefFoundError: org/apache/commons/collections4/ListValuedMap

at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:181)

at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:140)

Caused by: java.lang.ClassNotFoundException:
org.apache.commons.collections4.ListValuedMap

at java.net.URLClassLoader.findClass(URLClassLoader.java:381)

at java.lang.ClassLoader.loadClass(ClassLoader.java:424)

at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)

at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

... 2 more

There is no problem importing the same model saved as a .xls spreadsheet.

getModelFromMetaCyc makes grRules with inconsistent data format

Fixes:

  • grRules as cell array of strings (see this comment)
  • consistent use of ()-brackets in grRules (see 2 comments below)

After running getModelFromMetaCyc, the resulting model.grRules uses inconsistent data formats.

grRules should be "Column Cell Array of Strings", but the current resulting cell array is a combination of strings and cells:

>> metaCycModel = getModelFromMetaCyc();
>> metaCycModel.grRules(15:17)

ans =

  3×1 cell array

    '(MONOMER-15937)'
    {1×1 cell}
    ''

This should all be formatted as string (metaCycModel.grRules(15) is correct). Incorrect formatting makes other functions fail when using the metaCycModel structure.
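Until the function itself is fixed, nested cells of this kind can be flattened after the fact. A minimal sketch, assuming each nested cell wraps a single rule string; the content of the nested cell below is hypothetical, since the issue does not show it:

```matlab
% Second entry is wrapped in an extra cell, as in the output above; the
% rule inside it is made up for illustration.
grRules = {'(MONOMER-15937)'; {'(G1 or G2)'}; ''};

for i = 1:numel(grRules)
    % unwrap until a plain character vector remains
    while iscell(grRules{i})
        if isempty(grRules{i})
            grRules{i} = '';
        else
            grRules{i} = grRules{i}{1};
        end
    end
end
disp(iscellstr(grRules))
```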

Annotation problems + allow multiple subSystems

RAVEN has two problems related to the annotation:

  1. When importing yeast consensus model, the annotation is lost for reactions.
  2. RAVEN doesn't function when specifying multiple subSystems for each reaction as cell array. See: SysBioChalmers/yeast-GEM@03153ce.
  • importModel needs to be updated
  • exportModel needs to be updated
  • ravenCobraWrapper needs to be updated [@edkerk: no need, subSystems is identical in both model structures]
  • decide how subSystems are represented in Excel, as they need to be concatenated into one cell per reaction. Separated by ';'
  • exportToExcelFormat should be updated
  • exportToTabDelimited should be updated
  • confirm that functions modifying reactions can deal with cell arrays of subSystems, instead of strings
  • for the future update of KEGG mat files, we need to make sure that multiple subsystems are correctly imported. Many KEGG reactions belong to several subsystems and until now only the first mentioned subsystem was kept
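The ';'-separated Excel representation mentioned in the list above could be produced and read back roughly as follows. This is a sketch only: the subsystem names are hypothetical, and it assumes that no subsystem name itself contains a ';'.

```matlab
% Hypothetical per-reaction subSystems, stored as nested cell arrays.
subSystems = {{'Glycolysis'; 'Gluconeogenesis'}; {'TCA cycle'}; {}};

% Export direction: one ';'-separated string per reaction
asText = cell(numel(subSystems), 1);
for i = 1:numel(subSystems)
    if isempty(subSystems{i})
        asText{i} = '';
    else
        asText{i} = strjoin(reshape(subSystems{i}, 1, []), ';');
    end
end

% Import direction: split the strings back into column cell arrays
restored = cell(size(asText));
for i = 1:numel(asText)
    if isempty(asText{i})
        restored{i} = {};
    else
        restored{i} = strsplit(asText{i}, ';')';
    end
end
disp(asText)
```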

not all KEGG metabolites have their compound id as metMiriam

When running getModelFromKEGG(), not all metabolites have their KEGG compound ID as annotation in the metMiriams field. As example:

>> keggModel = getModelFromKEGG();
>> keggModel.metNames{393}

ans =

    '2,5-Dioxopentanoate'

>> keggModel.mets{393}

ans =

    'C00433'

>> keggModel.metMiriams{393}

ans = 

  struct with fields:

     name: {'chebi'}
    value: {'CHEBI:17415'}

Even though it has a valid compound ID as metabolite ID, only CHEBI is included.

This is not the case for all metabolites:

>> keggModel.metNames{394}

ans =

    'Oxidized rubredoxin'

>> keggModel.mets{394}

ans =

    'C00435'

>> keggModel.metMiriams{394}

ans = 

  struct with fields:

     name: {'kegg.compound'}
    value: {'C00435'}
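As a possible workaround on the user side, the missing annotation can be added whenever the metabolite ID looks like a KEGG compound ID. The sketch below rebuilds the two entries shown above by hand; the pattern 'C' followed by five digits is an assumption about what counts as a valid compound ID.

```matlab
% Hand-built copies of the two entries shown above.
mets = {'C00433'; 'C00435'};
metMiriams = {struct('name', {{'chebi'}},         'value', {{'CHEBI:17415'}}); ...
              struct('name', {{'kegg.compound'}}, 'value', {{'C00435'}})};

for i = 1:numel(mets)
    % assumption: KEGG compound IDs are 'C' followed by five digits
    if isempty(regexp(mets{i}, '^C\d{5}$', 'once'))
        continue
    end
    m = metMiriams{i};
    if isempty(m)
        m = struct('name', {{}}, 'value', {{}});
    end
    if ~any(strcmp(m.name, 'kegg.compound'))
        m.name{end+1, 1}  = 'kegg.compound';
        m.value{end+1, 1} = mets{i};
    end
    metMiriams{i} = m;
end
disp(metMiriams{1}.name')
```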

Make RAVEN into a Matlab toolbox

A main drawback with Matlab compared to, say, Python or R is that there is no convenient way to package and install code. Matlab still depends on manually handling a list of search paths, which feels ancient compared to the advanced package managers such as pip or conda used by its competitors.

MathWorks has recently taken a small step in that direction, and as of R2016a there are APIs for packaging and installing toolboxes (http://se.mathworks.com/help/matlab/ref/matlab.addons.toolbox.installtoolbox.html). If you were able to write matlab.addons.toolbox.installToolbox('https://se.mathworks.com/matlabcentral/fileexchange/raven=0.8') just as you write install.packages() in R, that would be really nice. It would also make your work more portable, in that you could move between local/cluster/cloud and still be sure that the environment is the same.

I just put this here because I stumbled upon it, maybe someone wants to give it a shot at one point..

doc: update contribution guidelines

Description of the issue:

Leveraging on the work done for yeast-GEM, the contribution guidelines should be updated, to clarify the following:

  • when major, minor and patch versions are defined
  • how to exactly make a new release
  • ...

Question: should guidelines be stored in the repo root, in the .github folder or in the repo's Wiki?

help: duplicate InChi strings

Description of the issue:

Hi, when I exported the draft model that was reconstructed by merging the models from KEGG and MetaCyc, I got the warnings shown after the code below.

Are there any possible ways to fix this? Thanks!

Reproducing this issue:

%% 1. get model from KEGG
dataDir = 'prok90_kegg87';
fastaFile = '../Dataset/protein_seqeunces.fasta';
keggmodel = getKEGGModelForOrganism('halo',fastaFile,dataDir);
save('../Results/halo_keggmodel.mat','keggmodel');

%% 2. get model from Metacyc
metacycmodel=getMetaCycModelForOrganism('halo',fastaFile);
save('../Results/halo_metacycmodel.mat','metacycmodel');

%% 3. merge two draft models
model = combineMetaCycKEGGModels(metacycmodel, keggmodel);

%% 4. add g0000 for spontaneous reactions
model.genes(length(model.genes)+1) = {'g0000'};
for i = 1:length(model.rxns)
    if i<= length(model.grRules) && ~isempty(char(model.grRules(i)))
        new_grRules(i,1) = model.grRules(i);
    else
        new_grRules(i,1) = {'g0000'};
    end
end
model.grRules = new_grRules;

%% save model 
save('../Results/HaloCombinedDraftModel','model');
exportToExcelFormat(model,'../Results/HaloCombinedDraftModel.xlsx');
exportModel(model,'../Results/HaloCombinedDraftModel.xml');


WARNING: The following InChI strings are associated to more than one unique metabolite name:
	1S/C12H24N2O3/c13-9-5-1-3-7-11(15)14-10-6-2-4-8-12(16)17/h1-10,13H2,(H,14,15)(H,16,17)
	1S/C37H59N7O20/c1-13(30(51)44-20(35(58)59)9-10-23(48)43-19(8-6-7-18(38)34(56)57)32(53)40-14(2)33(54)55)39-31(52)15(3)61-29-25(42-17(5)47)36-60-12-22(63-36)28(29)64-37-24(41-16(4)46)27(50)26(49)21(11-45)62-37/h13-15,18-22,24-29,36-37,45,49-50H,6-12,38H2,1-5H3,(H,39,52)(H,40,53)(H,41,46)(H,42,47)(H,43,48)(H,44,51)(H,54,55)(H,56,57)(H,58,59)/p-2/t13-,14+,15+,18+,19-,20+,21+,22?,24+,25+,26+,27+,28+,29+,36?,37-/m0/s1
	1S/C5H11NO/c1-3-5(2)4-6-7/h4-5,7H,3H2,1-2H3/b6-4-/t5-/m0/s1
	1S/C5H9NO3/c7-3-1-4(5(8)9)6-2-3/h3-4,6-7H,1-2H2,(H,8,9)/t3-,4+/m1/s1
	1S/C6H10O6/c7-1-2(8)4(10)6(12)5(11)3(1)9/h1-5,7-11H/t1-,2-,3+,4+,5-
	1S/C6H12N2O3/c1-4(9)8-5(2-3-7)6(10)11/h5H,2-3,7H2,1H3,(H,8,9)(H,10,11)/t5-/m0/s1
	1S/C9H13NO4/c10-5(9(12)13)3-4-1-2-6(11)8-7(4)14-8/h4-5,7-8H,1-3,10H2,(H,12,13)/t4-,5-,7+,8-/m0/s1
	1S/C9H15NO4/c10-5(9(12)13)3-4-1-2-6(11)8-7(4)14-8/h4-8,11H,1-3,10H2,(H,12,13)/t4-,5-,6+,7+,8-/m0/s1
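To see which metabolite names share an InChI string, the model can be inspected before exporting. A minimal sketch with hypothetical names and InChI strings; in a real model these would presumably come from the model.metNames and model.inchis fields, which is an assumption here.

```matlab
% Hypothetical metabolite names and InChI strings (same length).
metNames = {'A'; 'B'; 'C'; 'A'};
inchis   = {'1S/x'; '1S/x'; ''; '1S/y'};

hasInchi = ~cellfun(@isempty, inchis);
names    = metNames(hasInchi);
[uniqueInchis, ~, idx] = unique(inchis(hasInchi));

dupInchis = {};
for k = 1:numel(uniqueInchis)
    sharedBy = unique(names(idx == k));
    if numel(sharedBy) > 1
        dupInchis{end+1, 1} = uniqueInchis{k}; %#ok<SAGROW>
        fprintf('%s is shared by: %s\n', uniqueInchis{k}, strjoin(sharedBy', ', '));
    end
end
```

Whether such metabolites should be merged or kept apart (for example, when two databases use different names for the same compound) has to be decided per case.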

System information

  1. RAVEN version: 2.0.4
  2. Operating system (Windows/Mac/Linux; include version): MacOS Mojave Version 10.14.1

A Mac bug in exportForGit.m

Description of the issue:

  • The function exportForGit.m cannot find the RAVEN/COBRA version on a Mac system
  • ls(toolboxPath) on a Mac does not return hidden files, so the while loop ends with toolboxPath set to '/'

Reproducing this issue:

    while ~ismember({'.git'},ls(toolboxPath))
        slashPos    = getSlashPos(toolboxPath);
        toolboxPath = toolboxPath(1:slashPos(end-1));
    end
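One way to avoid depending on the output of ls is to test for the .git folder directly and to stop once the filesystem root is reached. The following is a sketch under the assumption that exist() reports hidden folders on all platforms; it starts from the current directory rather than from the toolbox path used in exportForGit.m.

```matlab
% Walk upwards from a starting folder until a .git folder is found or the
% filesystem root is reached (pwd is used here only as an example start).
currentPath = pwd;
foundGit = false;
while true
    if exist(fullfile(currentPath, '.git'), 'dir') == 7   % 7 means folder
        foundGit = true;
        break
    end
    parentPath = fileparts(currentPath);
    if isempty(parentPath) || strcmp(parentPath, currentPath)
        break   % reached the root without finding .git
    end
    currentPath = parentPath;
end
fprintf('Found .git: %d (stopped at %s)\n', foundGit, currentPath);
```

The explicit root check guarantees termination even when the folder is not inside a Git repository.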

System information

  1. RAVEN version: devel branch
  2. Operating system: MacOS 10.13.5

Availability of HMMs for KEGG

Hello,

I tried to download the HMMs, but the message 'biomet-toolbox.org is coming soon' showed up. Is there another way to download these HMMs?

Thanks in advance,
Junhui

Output list type field as array of string in Yaml

The JSON/YAML formats appear to be very useful for GEM curation, exchange and other purposes, one of which is visualization. YAML is now used as input for the development of the Metabolic Atlas website.

However, the current writeYaml function treats the 'list' type fields (e.g. eccodes, subSystems) inconsistently. It outputs the value as a string when there is only one element, but as an array of strings when there are multiple elements. This inconsistency causes problems when loading the file with a public YAML parser (such as in Python), because the information cannot be correctly extracted if some entries have one value and others have multiple. This issue has now been addressed in this branch: 81be856.
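The consistent behaviour can be illustrated by always emitting a YAML sequence, regardless of the number of elements. This sketch does not reproduce the layout of the actual writeYaml output; the indentation, quoting and EC codes are assumptions chosen for illustration.

```matlab
% Hypothetical EC codes for two reactions: one with a single code, one
% with two codes.
eccodes = {{'1.1.1.1'}, {'1.1.1.1', '2.7.1.1'}};

yamlText = '';
for i = 1:numel(eccodes)
    yamlText = [yamlText sprintf('- eccodes:\n')]; %#ok<AGROW>
    for j = 1:numel(eccodes{i})
        % every value becomes a list item, even if it is the only one
        yamlText = [yamlText sprintf('    - "%s"\n', eccodes{i}{j})]; %#ok<AGROW>
    end
end
fprintf('%s', yamlText);
```

With this layout a parser returns a list for every entry, so downstream code does not need to distinguish strings from lists.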

help: getKEGGModelForOrganism gives error that genes.pep cannot be found

Hello,
I am trying to reconstruct a bacterial model using the function 'getKEGGModelForOrganism' with organism id and protein fasta sequence as input. However, I am receiving the below error:

"The file 'genes.pep' cannot be located at / and should be downloaded from the KEGG FTP"

Kindly help in resolving this error.

edit: solution provided in #110 (comment)

The problem with exportToExcelFormat()

Hi Daniel,

I have been using exportToExcelFormat() to get an Excel file output from "iAdipocytes1809.xml". The function gives the error below:

exportToExcelFormat(model,'iAdipocytes1809.xls')

Error: File: exportToExcelFormat.m Line: 37 Column: 43
Unbalanced or unexpected parenthesis or bracket.

I checked the function documentation, and it seems that a path to the output directory must be provided. I did so, but got the same error.

I then used the exportToTabDelimited() function instead, and it worked fine. However, although the output files have the same structure as the models in Excel format, they also contain a lot of HTML and SBML tags in the reaction sheets, which makes them difficult to follow.

I would appreciate it if you could fix the export functions so that the model is converted to the proper Excel format.

Best,

Amir

Access kegg information from kegg API

Currently, there is no direct access to updated databases for use in downloadKEGG, getGenesFromKEGG, etc. This may be resolved by updating the functions to use the KEGG REST API. Also, state clearly that the current functions use an older version of the databases, and add the older versions back to RAVEN.
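As a rough idea of what REST-based access could look like, the sketch below parses a tab-delimited listing of the kind the KEGG REST "list" operation returns. To keep it runnable offline, the response text is a hand-written sample; the commented webread call and the exact response layout are assumptions that should be checked against the KEGG API documentation.

```matlab
% In a live setting the text might be fetched with, for example:
%   txt = webread('https://rest.kegg.jp/list/compound');
% Here a hand-written sample in the assumed "ID<TAB>name" layout is used.
txt = sprintf('C00433\t2,5-Dioxopentanoate\nC00435\tOxidized rubredoxin\n');

rows  = strsplit(strtrim(txt), char(10));   % one row per line
ids   = cell(numel(rows), 1);
names = cell(numel(rows), 1);
for i = 1:numel(rows)
    parts    = strsplit(rows{i}, char(9));  % split on the tab character
    ids{i}   = parts{1};
    names{i} = parts{2};
end
disp([ids names])
```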

style: rename function getINITModel

Description of the issue:

  • There is a RAVEN function getINITModel that reconstructs human GEMs based on omics data and/or predefined metabolic tasks. However, the function name appears to be inaccurate and misleading, given that the function actually implements the tINIT algorithm (described in MSB 2014;10:721), whereas INIT is a different, related algorithm (described in PLoS Comput Biol. 2012;8:e1002518).

Expected feature/value/output:

  • To reflect the actual algorithm and avoid ambiguity for users and the community, this function name should be adjusted. The citation in the function description also needs to be changed from PLoS Comput Biol. 2012;8:e1002518 to MSB 2014;10:721
  • This function is NOT called by any other RAVEN functions. A straightforward solution is changing the name to gettINITModel
  • Developers and users (@danieljcook @JonathanRob ) are welcome to join the discussion, which may come up with an even better solution/name

I hereby confirm that I have:

  • Followed the guidelines to install RAVEN.
  • Checked that a similar issue does not exist

Excel models for tutorial have incorrect grRules

Description of the issue:

Reported by @biowilliam on Gitter.

  • The Excel models in the tutorial folder have grRules like (g14 or g16 or g5) while the genes should be in the style YAL001W, matching expression.txt.

Reproducing this issue:

Run tutorial2_solutions.m, line 99. repMets is empty because the genes in the model and in expression.txt are in different formats.

System information

  1. RAVEN version 2.0.0, or devel
  2. Windows 10, Matlab 2018a

bug: fillGaps errors when using KEGG as template model

Description of the issue:

KEGG and MetaCyc models were successfully merged using combineMetaCycKEGGModels().
Thereafter I tried to fill gaps using the full KEGG model, but received the following error.

Matrix index is out of range for deletion.

Error in removeMets (line 101)
reducedModel.metCharges(indexesToDelete)=[];

Error in simplifyModel (line 154)
reducedModel=removeMets(reducedModel,notInUse);

Error in fillGaps (line 181)
allModels=simplifyModel(allModels,false,false,false,true,false,false,false,[],true);

Reproducing this issue:

model_MetaCyc = getMetaCycModelForOrganism('fly','GCF_000001215.4_protein.faa',true,false,false)

model_KEGG=getKEGGModelForOrganism('dme','GCF_000001215.4_protein.faa','euk100_kegg82','output',false,false,false,10^-15);
[newModel, removedRxns]=removeBadRxns(model_KEGG,1,{'H+'},true);

CombinedDraftModel=combineMetaCycKEGGModels(model_MetaCyc, newModel);

keggModel=getModelFromKEGG([],false,false,false);

[newConnected, cannotConnect, addedRxns, balancedModel, exitFlag]=fillGaps(CombinedDraftModel,keggModel,true,false,false,[],params);
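An "index out of range for deletion" error on a field such as metCharges usually points to a field whose length no longer matches mets or rxns. A quick consistency check before calling fillGaps can help to locate the culprit. The sketch below uses a hypothetical miniature model, and the list of fields to check is an assumption, not an exhaustive list of RAVEN fields.

```matlab
% Hypothetical miniature model in which metCharges is one entry short.
model.rxns       = {'r1'; 'r2'; 'r3'};
model.grRules    = {'g1'; ''; 'g2'};
model.mets       = {'m1'; 'm2'; 'm3'};
model.metCharges = [0; -1];

% Each row: field to check, field that defines the expected length
checks = {'grRules', 'rxns'; 'metCharges', 'mets'};

badFields = {};
for i = 1:size(checks, 1)
    field = checks{i, 1};
    ref   = checks{i, 2};
    if isfield(model, field) && numel(model.(field)) ~= numel(model.(ref))
        badFields{end+1, 1} = field; %#ok<SAGROW>
        fprintf('%s has %d entries, expected %d\n', field, ...
            numel(model.(field)), numel(model.(ref)));
    end
end
```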

export model in GitHub-ready formats

Todo:

  • SBML format
  • MATLAB format
  • YAML format
  • COBRA-text format
  • combine all functionality in one function (exportForGit.m)
  • include versioning of toolboxes and packages
  • test functionality of exportForGit on existing GitHub-based models
  • document functionality in exportForGit function

Develop a function that automatically exports the model in formats that can directly be used to commit to GitHub following the SysBio-defined standards.

Exported formats are:

  • .xml SBML L3V1 FBCv2
  • .mat MATLAB structure
  • .txt COBRA-style text format
  • .yml YAML file

Furthermore, include code from saveYeastModel for versioning etc.

fix: deletion of genes in "multiple rules"

Description of the issue:
After a short discussion in issue #78 of the yeast repo, I have been looking into how RAVEN handles gene deletions for complexes/isozymes/etc. The results are... well, not good:

  • When trying to use removeGenes.m to delete genes, the only ones that can be successfully deleted are genes in "single rules", i.e. only one gene in the grRules field. Any gene that is involved in a "multiple rule", e.g. gene1 and gene2, gene1 or gene2, etc. is not deleted by this function.
  • Furthermore, the output reducedModel loses about half the genes (probably the ones in those multiple rules, but I did not check).
  • Finally, findGeneDeletions.m does not call removeGenes.m (one would expect so to avoid redundant code), and has the major disadvantage that, as stated in the instructions of that function, it disregards complexes (gene1 and gene2) and assumes that any "and" is actually an "or". The latter is because it performs deletions by reading the rxnGeneMat field, which cannot distinguish between complexes and isozymes (it only has 1s and 0s).

The reader by now will have gathered that all of this is a huge issue when it comes to predicting KOs, and it should be fixed immediately.

Case study:
Loading the yeast model and testing removeGenes.m with a single rule (r_0003 <- YAL060W) and 2 multiple rules (r_0005 <- YGR032W or YLR342W and r_0013 <- YEL038W and YMR009W).

  • Expected output:
    Each time we use the function we should be able to remove the gene we want. Also, we should observe the following behaviors:

    • Delete YAL060W -> delete rxn r_0003
    • Delete isozyme YGR032W -> don't delete rxn r_0005 (but change rule to only YLR342W)
    • Delete isozymes YGR032W & YLR342W -> delete rxn r_0005
    • Delete subunit YEL038W -> delete rxn r_0013
  • Obtained output in devel:
    The only deletion that actually works is the gene in the single rule (YAL060W), all other changes are skipped. The flag removeRxnsWithComplexes was tested both =true and =false, with no differences in the output.

  • Reproducing these results:
    All this analysis can be seen in the private repo gem-scripts or can be downloaded by anyone here.

Proposed solution:
With @IVANDOMENZAIN we came up with a fix to all of this: Decide if the reaction should be removed or not based on grRules and not rxnGeneMat. We have written a quite concise function for that, that works as long as the grRules are properly formatted. In order to check that, we also have another function. However before we continue the work and make a PR that fixes all of this issue, we would like impressions and/or recommendations from the community :) Maybe my case study is incomplete, or the functions actually work sometimes with different input?
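The idea of deciding from grRules can be sketched as follows. This is not the function mentioned above, only an illustration of the principle: deleted genes are replaced by 0, remaining genes by 1, and the rule is evaluated as a logical expression. It assumes well-formed rules with lowercase 'and'/'or' and gene IDs that start with a letter and contain only word characters.

```matlab
% Rules taken from the case study above (r_0003, r_0005, r_0013).
grRules = {'YAL060W'; 'YGR032W or YLR342W'; 'YEL038W and YMR009W'};
deleted = {'YGR032W', 'YEL038W'};

keepRxn = true(numel(grRules), 1);
for i = 1:numel(grRules)
    rule = grRules{i};
    if isempty(rule)
        continue   % reactions without a rule are kept
    end
    for g = 1:numel(deleted)
        pattern = ['\<' regexptranslate('escape', deleted{g}) '\>'];
        rule = regexprep(rule, pattern, '0');       % deleted gene -> false
    end
    rule = regexprep(rule, '\<and\>', '&');
    rule = regexprep(rule, '\<or\>', '|');
    rule = regexprep(rule, '[A-Za-z_]\w*', '1');    % remaining genes -> true
    keepRxn(i) = logical(eval(rule));
end
disp(keepRxn')
```

Here the isozyme rule survives because YLR342W is still present, while the complex is lost once the subunit YEL038W is deleted. Using eval is acceptable for a sketch but would need input validation in production code.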

Please give here any feedback on this major issue, to solve it ASAP

setRavenSolver not working for Windows

It is not possible to append the choice of RAVEN solver to the MATLAB startup in Windows. This is due to the following code:

    up=pathdef;
    up=regexp(up,':','split');
    up=up{1};

Running pathdef on my Matlab in Windows gives something that starts like

D:\Box Sync\Documents\MATLAB;D:\Box Sync\Documents\GitHub\cobratoolbox

So setRavenSolver then ends up trying to make a file at location

D\startup.m

And this is of course an invalid location. I'm not sure where startup.m is supposed to be placed, so I can't suggest how to fix this for Windows.
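The underlying cause is that ':' is the path separator on Unix-like systems only; on Windows it is ';', and ':' appears inside drive letters. MATLAB's pathsep function returns the separator for the current platform. A minimal sketch using the path from this report; splitting on pathsep is a suggestion, not the fix that was actually adopted.

```matlab
% The Windows path string from this report, split on the Windows separator
winPath  = 'D:\Box Sync\Documents\MATLAB;D:\Box Sync\Documents\GitHub\cobratoolbox';
winParts = strsplit(winPath, ';');
disp(winParts{1})

% Platform-independent variant: pathsep is ';' on Windows and ':' elsewhere
entries    = strsplit(path, pathsep);
firstEntry = entries{1};
disp(fullfile(firstEntry, 'startup.m'))
```

MATLAB's userpath function returns the default user folder, which might be a more suitable location for startup.m than the first entry of the search path.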

Small comment, the error message you get from solveLP when no solver is set looks like this

Raven solver not defined or unknown. Try using setRavenSolver("solver").

Probably best to swap " with '.

bug: addRavenToUserPath nargin check

Description of the issue:

  • bug in addRavenToUserPath when it's called without any input arguments

Reproducing this issue:

addRavenToUserPath()

System information

  1. RAVEN 2.0.0

Issues error with getBlast.m

Hi,
I have a problem running 'getBlast.m'.
The command line I am running is:

blastStructure = getBlast('new_model', 'new_model.faa', {'old_model'}, {'old_model.faa'});

At line 128, when it is importing the results of the blast organism-to-model all the fields of structure array 'A' have only one row/element and I get the following error:

BLASTing "old_model" against "new_model"..
BLASTing "new_model" against "old_model"..
Index in position 2 exceeds array bounds (must not exceed 1).

Error in getBlast (line 131)
tempStruct.toGenes=A.textdata(:,2);

I think that the script fails to correctly import the temporary text file, possibly because some gene names in the model FASTA file are numbers rather than character strings. Here are the first lines of that BLAST output file:

Sm_00029441-RA,303034,1.00e-115,47.927,386,361,65.28
Sm_00029441-RA,300894,6.23e-20,24.825,286,92.4,42.66
Sm_00029441-RA,300885,7.26e-20,26.449,276,92.4,43.84
Sm_00029441-RA,306441,1.02e-19,30.556,288,90.9,46.18
Sm_00029441-RA,310017,1.68e-19,25.077,323,88.6,40.87
Sm_00029441-RA,Phatr3_J13617,3.18e-18,26.712,292,86.3,47.60
Sm_00029441-RA,300506,1.22e-17,26.596,282,85.9,42.20
Sm_00029441-RA,305513,2.77e-17,23.575,386,84.7,39.64
Sm_00029441-RA,309874,3.86e-17,27.517,298,84.0,41.61
Sm_00029441-RA,Phatr3_EG02341,1.78e-16,28.383,303,81.6,42.24
Sm_00029441-RA,Phatr3_EG02340,1.78e-16,28.383,303,81.6,42.24
Sm_00029441-RA,311654,2.88e-16,25.309,324,79.3,39.81

Could this be the reason? and how can I fix this issue? I cannot change the gene names in the fasta file because they won't match with the gene names of the model.
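If mixed numeric and text gene names are indeed what confuses importdata, reading the file with an explicit column format avoids the guesswork, because the first two columns are then always read as text. A sketch with two of the lines shown above, written to a temporary file; the format string assumes seven comma-separated columns as in that excerpt.

```matlab
% Write two of the lines shown above to a temporary file
tmpFile = [tempname '.txt'];
fid = fopen(tmpFile, 'w');
fprintf(fid, 'Sm_00029441-RA,303034,1.00e-115,47.927,386,361,65.28\n');
fprintf(fid, 'Sm_00029441-RA,Phatr3_J13617,3.18e-18,26.712,292,86.3,47.60\n');
fclose(fid);

% Read columns 1-2 as text and columns 3-7 as numbers
fid = fopen(tmpFile, 'r');
C = textscan(fid, '%s %s %f %f %f %f %f', 'Delimiter', ',');
fclose(fid);
delete(tmpFile);

fromGenes = C{1};
toGenes   = C{2};   % numeric-looking IDs stay character vectors
disp(toGenes')
```

With this approach the gene names in the FASTA file do not need to be changed.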
Many thanks!!

checkInstallation solver choice

It now sets the solver to gurobi even if both gurobi and mosek failed, which seems misleading. It also doesn't check whether 'cobra' is an option; this should probably be added as a third option. Finally, if libSBML fails then the solver tests also fail. Instead, when libSBML fails, the script should load a .mat model and test the solvers on that.
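The selection logic described here amounts to picking the first solver that passed its test, and picking none if all of them failed. A minimal sketch of that logic; the test outcomes are hypothetical placeholders, and how checkInstallation actually tests each solver is not shown.

```matlab
% Candidate solvers in order of preference, with hypothetical test results.
solvers = {'gurobi', 'mosek', 'cobra'};
works   = [false, false, true];

firstWorking = find(works, 1);
if isempty(firstWorking)
    chosenSolver = 'none';   % do not pretend that a failing solver is set
else
    chosenSolver = solvers{firstWorking};
end
fprintf('Selected solver: %s\n', chosenSolver);
```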

xlsx problem on Linux using checkTasks

Hi,
When running checkTasks on Linux using e.g. the files provided by Blais et al. ( http://www.nature.com/articles/ncomms14250 ), I run into the problem that the file cannot be read. Would it be possible to add an option to use tab-separated value files instead of xls here?
I assume this is not an issue if there is a working Excel installation on a Windows machine, but even on Windows this makes the toolbox reliant on Excel being present.

bug: import/export lack support for models without genes

Error with importModel function:

I am trying to use the importModel function,

I keep getting the error below:

Index exceeds matrix dimensions.  
Error in importModel (line 850) if strcmpi(genes{1}(1:2),'G_')

What can I do?

Reproducing this issue:

drosomodel=importModel('1752-0509-3-91-S3.xml')
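The error message suggests that genes{1}(1:2) is evaluated even when the model has no genes or when a gene ID is shorter than two characters. A guarded comparison avoids both cases. This is a sketch assuming that the intent of that line is to detect a 'G_' prefix; it is not the fix that was actually applied.

```matlab
% Hypothetical gene list of a model without genes
genes = {};

% strncmpi returns false for strings shorter than two characters, and the
% isempty check short-circuits before genes{1} is accessed.
hasPrefix = ~isempty(genes) && strncmpi(genes{1}, 'G_', 2);
disp(hasPrefix)

% The same test on a model that does have prefixed genes
genes2 = {'G_YAL060W'; 'G_YGR032W'};
hasPrefix2 = ~isempty(genes2) && strncmpi(genes2{1}, 'G_', 2);
disp(hasPrefix2)
```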

System information

  1. RAVEN version 2.0
  2. Operating system (Windows 10)

Static addition of POI to java class path

Coming from #55, I wanted to test this on my machine and noticed that RAVEN adds a static Java class path file.
Personally, I think it would be better not to do this statically (the user might move the RAVEN directory, or might want to use a different POI library, etc.). If you want this to happen by default, you could create a startup.m in the RAVEN main directory that adds the libraries dynamically at startup if the toolbox is on the path, or call it in checkInstallation.
This would also avoid the forced matlab restart during setup.
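A dynamic variant could look roughly like the sketch below, which adds every .jar file in a folder to the dynamic Java class path unless it is already there. The folder location is a hypothetical placeholder, not RAVEN's actual layout; if the folder does not exist, the loop simply does nothing.

```matlab
% Hypothetical location of the POI .jar files
poiDir = fullfile(pwd, 'software', 'apache-poi');
jars   = dir(fullfile(poiDir, '*.jar'));

if usejava('jvm')
    currentJars = javaclasspath('-dynamic');
    for i = 1:numel(jars)
        jarPath = fullfile(poiDir, jars(i).name);
        if ~any(strcmp(currentJars, jarPath))
            javaaddpath(jarPath);   % takes effect without restarting MATLAB
        end
    end
end
fprintf('%d jar file(s) found in %s\n', numel(jars), poiDir);
```

Note that a later changelog entry in this document mentions that static paths were chosen for robustness, so the trade-off between the two approaches would need to be weighed.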

canProduce and gapReport fail on combineMetaCycKEGGModels

I combined the MetaCyc and KEGG models, then discovered that I cannot apply the canProduce and gapReport functions to the combined model. The following is the error output.

Reproducing this issue:

>> CombinedDraftModel=combineMetaCycKEGGModels(model_MetaCyc, balancedModel);

>> I=canProduce(CombinedDraftModel);
Matrix index is out of range for deletion.

Error in removeReactions (line 68)
            reducedModel.grRules(indexesToDelete,:)=[];

Error in simplifyModel (line 148)
            reducedModel=removeReactions(reducedModel,rxnsToDelete);

Error in haveFlux (line 45)
smallModel=simplifyModel(model,false,false,true,true);

Error in canProduce (line 25)
produced=haveFlux(model,10^-5,rxns);
 
>> [noFluxRxns, noFluxRxnsRelaxed, subGraphs, notProducedMets, minToConnect,...
    neededForProductionMat]=gapReport(CombinedDraftModel);
Gap analysis for COMBINED - Combined model from MetaCyc and KEGG draft models

Matrix index is out of range for deletion.

Error in removeReactions (line 68)
            reducedModel.grRules(indexesToDelete,:)=[];

Error in simplifyModel (line 148)
            reducedModel=removeReactions(reducedModel,rxnsToDelete);

Error in haveFlux (line 45)
smallModel=simplifyModel(model,false,false,true,true);

Error in gapReport (line 66)
I=haveFlux(model);

System information

  1. RAVEN version 2.0
  2. Windows Operating system
Update to 1.10 from 1.08

Hi all,

It seems that you based the "new" RAVEN on version 1.08, but the newest version is actually 1.10. This is because I used some RAVEN functionality while I was at Novo Nordisk and I also fixed bugs if I got mails about them. Something I should probably have mentioned earlier :)

The changelog for 1.08 to 1.10 says:

-Changed in loadSheet to allow for formulas in Excel sheets
-Fixed a bug in makeSomething/consumeSomething
-Moved the code for loading and processing Excel sheets to separate functions (loadSheet and cleanSheet) in order to prevent duplicate code
-The function loadWorkbook was added, mainly for easier loading of the Apache POI library from non-RAVEN functions
-The function addJavaPaths was added to use static Java paths instead of dynamic for better robustness
-References to the Matlab functions getfield and setfield were replaced with dynamic structures
-Some additional checks regarding Mosek licence
-Fixed a bug in simplifyModel when using the group linear option
-Changed in haveFlux so that it can deal with infinite bounds
-Added some more error checks to checkSolution
-An option was added to simplifyModel to identify reversible reactions which could only carry flux in one direction and change them to irreversible
-Added support for newer versions of HMMER to getKEGGModelForOrganism
-Fixed a bug in constructEquations
-Minor bug fixes and formatting in order to reduce the number of Matlab warnings in the editor

The most valuable part here is that handling Excel I/O was largely rewritten and should work better (and on all platforms) now. If we decide to merge RAVEN with my software for fermentor modelling (MEMO) then this update is required.

I don't know how to deal with this. I've merged RAVEN 1.10 with the master branch to the best of my ability in the branch 108_to_110. For now I have not included the calls to optimizeProb() because I want to test that everything seems to work first. It should of course be done though.

What do you think about this? Is there a large difference between the master branch and the devel branch(es)?

Cheers,
Rasmus

@edkerk @simas232 @shaqHosseini @varemo

help: importExcelModel fails importing metMiriams

Hello everyone,
I tried to install RAVEN 2.0.3 in MATLAB to use it for genome-scale metabolic simulations of microbial communities, but the installation check fails and I am not able to run my MATLAB code.
The installation error is as follows:

When I type checkInstallation in the MATLAB command window:

Checking if it is possible to parse a model in Microsoft Excel format... FAILED

and the error I get when trying to run my code is:

Undefined function or variable 'index'.

Error in importExcelModel>parseMiriam (line 853)
            index

Error in importExcelModel (line 707)
    model.metMiriams=parseMiriam(model.metMiriams);

Error in SBMLFromExcel (line 30)
model=importExcelModel(fileName,false,printWarnings);

Error in Main (line 41)
SBMLFromExcel(draftFile,tmp);

Error in run (line 96)
evalin('caller', [script ';']);

I am using MATLAB R2016a. Thank you for your time and attention

help: what organism ID to use for getMetaCycModelForOrganism

Hello

I was able to generate a draft model using the function 'getKEGGModelForOrganism' with the organism ID and protein sequences. However, I could not find the organism ID for my organism in MetaCyc to use with the function 'getMetaCycModelForOrganism'. I would like to refine my model by combining the KEGG and MetaCyc models. Kindly help.

Bapi.

Issue regarding getBlast

When trying to perform a bidirectional BLASTp between a query (.fa file) and a target set (.fasta file) using the getBlast function, I get the following error message:

Error using importdata (line 136)
Unable to open file.

Error in getBlast (line 86)
A=importdata([outFile '_' num2str(i)]);

Is there anyone who has experienced a similar scenario, and are there any ways to fix this?

Not use extractfield in importModel

To reduce reliance on additional toolboxes (Mapping Toolbox), can we not use the extractfield function in importModel? See line 840:

   if isfield(modelSBML,'fbc_geneProduct')
       [rxnGeneMat, genes]=getGeneMat(grRules,extractfield(modelSBML.fbc_geneProduct,'fbc_id'));
   else
       [rxnGeneMat, genes]=getGeneMat(grRules);
   end
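
One possible replacement that needs only base MATLAB is to expand the struct array into a comma-separated list. The sketch below uses a hypothetical stand-in for modelSBML, with made-up gene IDs, and assumes fbc_id holds one character vector per gene product:

```matlab
% Hypothetical stand-in for the parsed SBML structure (1-by-3 struct array).
modelSBML.fbc_geneProduct = struct('fbc_id', {'G_gene1', 'G_gene2', 'G_gene3'});

% Expanding the struct array field into a comma-separated list and wrapping
% it in braces gives a 1-by-N cell array of the fbc_id values, without
% calling extractfield from the Mapping Toolbox.
geneIds = {modelSBML.fbc_geneProduct.fbc_id};
```

The resulting geneIds could then be passed as the second argument to getGeneMat in place of the extractfield call.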

bug: ravenCobraWrapper creates fewer metMiriams/rxnMiriams than mets/rxns

I used ravenCobraWrapper to convert a COBRA model to RAVEN format, and it creates fewer metMiriams/rxnMiriams entries than there are mets/rxns. As shown below, the model has 2224 mets but only 2220 metMiriams, and 3496 rxns but only 2985 rxnMiriams.

model=

struct with fields:

               rxns: {3496×1 cell}
               mets: {2224×1 cell}
                  S: [2224×3496 double]
                 lb: [3496×1 double]
                 ub: [3496×1 double]
                rev: [3496×1 double]
                  c: [3496×1 double]
                  b: [2224×1 double]
              comps: {14×1 cell}
          compNames: {14×1 cell}
           rxnNames: {3496×1 cell}
            grRules: {3496×1 cell}
         rxnGeneMat: [3496×909 double]
         subSystems: {3496×1 cell}
            eccodes: {3496×1 cell}
         rxnMiriams: {2985×1 cell}
           rxnNotes: {3496×1 cell}
rxnConfidenceScores: {3496×1 cell}
              genes: {909×1 cell}
     geneShortNames: {909×1 cell}
           metNames: {2224×1 cell}
           metComps: [2224×1 double]
        metFormulas: {2224×1 cell}
         metMiriams: {2220×1 cell}
         metCharges: [2224×1 double]
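
A mismatch like this can be detected programmatically. The following is a minimal sketch on a toy model with made-up data. It assumes that every field starting with "met" or "rxn" should have one row per metabolite or reaction, which holds for the fields listed above but is not a documented RAVEN rule:

```matlab
% Toy model with a deliberately short annotation field (hypothetical data).
model.mets       = {'m1'; 'm2'; 'm3'};
model.metNames   = {'A'; 'B'; 'C'};
model.metMiriams = cell(2, 1);   % one entry short
model.rxns       = {'r1'; 'r2'};
model.rxnMiriams = cell(2, 1);

% Collect every met*/rxn* field whose row count differs from mets/rxns.
fields = fieldnames(model);
mismatched = {};
for k = 1:numel(fields)
    name = fields{k};
    if strncmp(name, 'met', 3)
        expected = numel(model.mets);
    elseif strncmp(name, 'rxn', 3)
        expected = numel(model.rxns);
    else
        continue   % not a per-metabolite or per-reaction field
    end
    if size(model.(name), 1) ~= expected
        mismatched{end+1} = name; %#ok<AGROW>
        fprintf('%s has %d rows, expected %d\n', name, size(model.(name), 1), expected);
    end
end
```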

Export as YAML

To facilitate development and versioning of GEMs on GitHub, it would be very beneficial if RAVEN could export models in YAML format, as this allows a full diff of the whole XML file, instead of the more limited information that is given in the TXT export.

support for export to cobrapy-compatible yaml format

RAVEN should support writing a cobrapy-compatible yaml format, as this format is very concise.

  • make writeYaml function that writes a text file, and parses through the model structure.
  • remove old Yaml-export functionality
  • modify exportForGit to support new writeYaml function
  • confirm that output from this function is identical to cobrapy-yaml (except for RAVEN-unique fields)

A few considerations:

  • Use the correct format. This example is old, rather use this example of the yeast consensus network.
  • Do not reformat any ids to replace non-standard characters. This is done when writing SBML, as SBML doesn't support certain characters. But the YAML file should represent the model as it is in MATLAB.
  • Include as many annotations as possible:

For each metabolite, include:

  • mets
  • metNames
  • metComps
  • inchis
  • metFormulas
  • metMiriams (any)
  • metCharges
  • unconstrained
  • rxnFrom (?)

For each reaction, include:

  • rxns
  • rxnNames
  • rxnComps
  • metabolites and their stoichiometry
  • grRules
  • subSystems
  • eccodes
  • rxnMiriams (any)
  • rxnNotes
  • rxnConfidenceScores

For each compartment, include:

  • comps
  • compNames
  • compOutside
  • compMiriams

For each gene, include:

  • genes
  • geneComps
  • geneMiriams
  • geneShortNames

But of course only write those fields if they are present in the model.
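
The "only write fields that are present" rule could look roughly like the sketch below, which handles metabolites only. The toy model, the YAML key names (name, formula, charge) and the layout are assumptions for illustration. The sketch does not quote or escape values, and it is not a confirmed match for the cobrapy format:

```matlab
% Toy model (hypothetical data); metCharges is deliberately absent.
model.mets        = {'m1'; 'm2'};
model.metNames    = {'glucose'; 'pyruvate'};
model.metFormulas = {'C6H12O6'; 'C3H4O3'};

% RAVEN field -> YAML key pairs; the key names are illustrative assumptions.
fieldMap = {'metNames', 'name'; 'metFormulas', 'formula'; 'metCharges', 'charge'};

yamlLines = {'metabolites:'};
for i = 1:numel(model.mets)
    yamlLines{end+1} = sprintf('- id: %s', model.mets{i}); %#ok<AGROW>
    for j = 1:size(fieldMap, 1)
        if ~isfield(model, fieldMap{j, 1})
            continue   % skip fields that the model does not have
        end
        value = model.(fieldMap{j, 1});
        if iscell(value)
            yamlLines{end+1} = sprintf('  %s: %s', fieldMap{j, 2}, value{i}); %#ok<AGROW>
        else
            yamlLines{end+1} = sprintf('  %s: %g', fieldMap{j, 2}, value(i)); %#ok<AGROW>
        end
    end
end

% Join into one text block; writing it to disk with fopen/fprintf is omitted.
yamlText = strjoin(yamlLines, sprintf('\n'));
```

Because the IDs are written exactly as stored in the model, this also follows the "do not reformat any ids" consideration above.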

feat: align RAVEN and COBRA model fields

There are many benefits in having RAVEN function as a submodule of the COBRA Toolbox. While RAVEN 2.0 is largely compatible with COBRA through the ravenCobraWrapper function and the use of unique function names, a complete alignment of the model structure between the two toolboxes would be highly preferred.

To initiate this process, here are the definitions of the RAVEN fields.

The main discrepancies are:

  1. RAVEN documents compartments in the metComps field, instead of detailing this in metabolite IDs
  2. RAVEN documents annotations in metMiriams and similar fields, instead of metKEGGID and similar fields

How should these fields be unified? There is interest from COBRA to move towards the use of the metComps field, but what about other discrepancies? What other discrepancies exist?

Empty draft model from getModelFromHomology

I'm having some issues using the function getModelFromHomology. When I pass a template model and a blast structure generated by getBlast as input arguments, the function only returns an empty draft model. I've made sure that the gene names in the blast structure are compatible with the gene names in the template model. Has anyone experienced similar problems with getModelFromHomology?

I am using:

  • MATLAB 2016b

  • Windows 10, version 1703 (OS-build 15063.674)

  • RAVEN Toolbox v. 2.0

error in getModelFromHomology

Hi,
I am trying to build a draft model for an organism of interest with the function getModelFromHomology.
The code I am running so far is:

blastStructure = getBlast('new_genome', 'new_genome.faa', {'old_genome'}, {'old_genome.faa'});
templateModelList{1} = old_model;
SMdraftModel = getModelFromHomology(templateModelList, blastStructure, 'new_genome');

But I am getting the following error:

Error using removeGenes>canRxnCarryFlux (line 104)
Error: Invalid expression. When calling a function or indexing a variable, use parentheses. Otherwise, check for mismatched
delimiters.

Error in removeGenes (line 62)
canCarryFlux(j) = canRxnCarryFlux(reducedModel,grRule,genes{i});

Error in getModelFromHomology (line 312)
models{useOrderIndexes(i)}=removeGenes(models{useOrderIndexes(i)},~a,true,true,false);

I started to look into the script lines to see what could be the issue but without success. Can you help me to figure out a solution?

I am using RAVEN 2.0 and running it on Ubuntu 18.04.

Cheers,
Luca

bug: getBlastFromExcel doesn't parse all relevant fields

I don't have the protein sequences of my organism of interest, only the cDNA sequences. I therefore ran blastn between my organism and human (the template organism, whose cDNA sequences I downloaded from BioMart) and generated an Excel file from the blastn output, formatted as described in the function "getBlastFromExcel", with E-value, alignment length and identity. The output of "getBlastFromExcel" is "blastStructure". In the next step I passed "blastStructure" to the function "getModelFromHomology", but it fails at line 94, which requires "bitscore":

blastStructure(i).bitscore(~indexes)=[];

(The next line will fail as well, since it requires "ppos", which "getBlastFromExcel" does not provide either.)

However, "getBlastFromExcel" does not accept anything other than E-value, alignment length and identity, so its output "blastStructure" has no "bitscore" field. It only handles the following fields:

            blastStructure(end).fromGenes(I)=[];
            blastStructure(end).toGenes(I)=[];
            blastStructure(end).evalue(I)=[];
            blastStructure(end).aligLen(I)=[];
            blastStructure(end).identity(I)=[];
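
A quick way to see which fields are missing before calling getModelFromHomology is sketched below. The blastStructure here is a hypothetical one-hit example built from the field list above. The list of required fields is taken from this report, not from the getModelFromHomology documentation:

```matlab
% Hypothetical structure with only the fields that getBlastFromExcel fills.
blastStructure = struct('fromGenes', {{'a1'}}, 'toGenes', {{'b1'}}, ...
    'evalue', 1e-30, 'aligLen', 250, 'identity', 85);

% Fields that getModelFromHomology filters on, according to this report.
required = {'evalue', 'aligLen', 'identity', 'bitscore', 'ppos'};

% isfield accepts a cell array of names and returns one logical per name.
missing = required(~isfield(blastStructure, required));
if ~isempty(missing)
    fprintf('blastStructure lacks: %s\n', strjoin(missing, ', '));
end
```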

System information

  • RAVEN version: 2.0.4
  • Operating system: macOS High Sierra 10.13.6

help: getKEGGModelForOrganism download pre-trained HMMs

While following the RAVEN tutorial I was unable to download the model from KEGG through 'ftp'. Each time it throws an error like:

"/pub/kegg/ligand/compound/" is nonexistent or not a directory.

I would be very glad if someone could show me a way to work around the issue.

Thanks,
Bapi Mandal.
