Comments (22)

nilshoffmann commented on August 30, 2024

Just for reference: the standard specification contains examples for the possible values in the different sections and elements for mzTab-M 2.0:
https://hupo-psi.github.io/mzTab/2_0-metabolomics-release/mzTab_format_specification_2_0-M_release.html#format-specification

from rmztab-m.

philouail commented on August 30, 2024

Thanks Johannes, just adding some details to this:

  • The validator is not able to parse the different tables from our .mztabm file (even when adding "fake" SML data or a single null entry, as Johannes said). Would you know why that is the case?

  • Regarding Johannes's second question, the same problem applies to id_confidence_measure[1]: since we don't identify anything in xcms, it would need to be null, but the format does not allow that.

nilshoffmann commented on August 30, 2024

@jorainer @philouail Which validator are you using? We could think about relaxing the SML section requirement to enable usage of mzTab-M as an intermediate format.

philouail commented on August 30, 2024

This is the link to the validator: https://apps.lifs-tools.org/mztabvalidator/
Would you know of another one? I'm also worried that something is wrong with our file.

nilshoffmann commented on August 30, 2024

The URL is the right one. Leaving the summary and evidence tables out altogether will fail the parse. Would it be possible for you to share an example file with me so that I can give you direct feedback on it?

nilshoffmann commented on August 30, 2024

@jorainer Concerning semi-quantitative abundances: small_molecule-quantification_unit only applies to the values in the SML table, while small_molecule_feature-quantification_unit applies to the values reported in the SMF table. In the default semantic validation mapping file (see https://github.com/HUPO-PSI/mzTab/blob/master/specification_document-releases/2_0-Metabolomics-Release/mzTab_2_0-M_mapping.xml; it is only applied if you use semantic validation mode), they can have CV terms that are children of any of the following root terms:

<CvTerm termAccession="PRIDE:0000392" useTerm="false" termName="Quantification unit" isRepeatable="false" allowChildren="true" cvIdentifierRef="PRIDE"></CvTerm>
<CvTerm termAccession="UO:0000051" useTerm="false" termName="concentration unit" isRepeatable="false" allowChildren="true" cvIdentifierRef="UO"></CvTerm>
<CvTerm termAccession="MS:1000043" useTerm="false" termName="intensity unit" isRepeatable="false" allowChildren="true" cvIdentifierRef="MS"></CvTerm>
<CvTerm termAccession="UO:0000006" useTerm="false" termName="substance unit" isRepeatable="false" allowChildren="true" cvIdentifierRef="UO"></CvTerm>

Children of PRIDE:0000392
Children of UO:0000051
Children of MS:1000043
Children of UO:0000006

For your particular use case, following how mzML annotates "intensity" values, you should hopefully be able to use any of the children of MS:1000043. If none of them fit, could you check whether the MS CV contains a more suitable term? We can then update the semantic validation mapping file.
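
As a rough illustration of how these rules can be read programmatically, the Python sketch below parses the four CvTerm elements quoted above and lists the root terms whose children are accepted. The wrapping `<rules>` element and the function name `allowed_root_terms` are assumptions made for this example; the real mapping file has its own surrounding structure.

```python
import xml.etree.ElementTree as ET

# The four CvTerm rules quoted above, wrapped in a hypothetical <rules> root
# element so that the fragment parses as a single XML document.
MAPPING_FRAGMENT = """
<rules>
<CvTerm termAccession="PRIDE:0000392" useTerm="false" termName="Quantification unit" isRepeatable="false" allowChildren="true" cvIdentifierRef="PRIDE"></CvTerm>
<CvTerm termAccession="UO:0000051" useTerm="false" termName="concentration unit" isRepeatable="false" allowChildren="true" cvIdentifierRef="UO"></CvTerm>
<CvTerm termAccession="MS:1000043" useTerm="false" termName="intensity unit" isRepeatable="false" allowChildren="true" cvIdentifierRef="MS"></CvTerm>
<CvTerm termAccession="UO:0000006" useTerm="false" termName="substance unit" isRepeatable="false" allowChildren="true" cvIdentifierRef="UO"></CvTerm>
</rules>
"""


def allowed_root_terms(xml_text):
    """Return (accession, name) pairs for rules that accept child terms."""
    root = ET.fromstring(xml_text)
    return [
        (term.get("termAccession"), term.get("termName"))
        for term in root.iter("CvTerm")
        if term.get("allowChildren") == "true"
    ]


ROOTS = allowed_root_terms(MAPPING_FRAGMENT)
for accession, name in ROOTS:
    print(accession, name)
```

Deciding whether a given unit term really is a child of one of these roots still requires a lookup in the corresponding ontology, which this sketch does not attempt.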

philouail commented on August 30, 2024

Here is an example
test.mztab.txt

I had to switch the extension to .txt because GitHub does not support .mztab attachments. If you want the original, I can send it by email.

nilshoffmann commented on August 30, 2024

Thanks for the test file, I will check it and report back what we may do with minimal changes to the file. We may need to publish an amendment to the standard, if we decide to relax some requirements to strong recommendations. I will need to update the parser / validator implementation jmztab-m soon anyway, since the OLS 3 validation endpoint no longer appears to work for me. A development version of the web-based validator is now deployed at https://apps.lifs-tools.org/mztabvalidator-dev

philouail commented on August 30, 2024

Amazing, thanks for the help!

nilshoffmann commented on August 30, 2024

I have updated your file, which now passes validation (in principle):
xcms-test-export.mztab.txt

Validation results (basic and semantic) with the current default mapping file are available here:
https://apps.lifs-tools.org/mztabvalidator/result/a07569b1-3ba5-493b-95d7-bac34ce667b3

The errors shown are all due to the semantic validation mapping file having required terms. We can create a custom xcms semantic validation mapping file to facilitate further adoption.

This is just a first draft, though. For now, I have added one SML entry, without abundances, that does not link to any grouped feature entry. Depending on the workflow, subsequent tools would be able to read the features, run an identification step and record the results in a new mzTab-M file that then contains a proper SML table.
Please note that I added a charge of 1 in the SMF table to all features. This is the (net) charge (positive integer) of the ion / m/z. I am not sure whether your workflow allows determination of the charge of features and adducts at this level of analysis. If not, please let me know; we should be able to adapt how this is handled.

nilshoffmann commented on August 30, 2024

Validation result with the following relaxed semantic validation file yields only info level messages:
https://apps.lifs-tools.org/mztabvalidator/result/bf180e3b-7e67-4fcb-8dc3-a008027d4d10

An adapted semantic validation mapping file for feature-only files based on XCMS is available here:
https://github.com/HUPO-PSI/mzTab/blob/master/examples/2_0-Metabolomics_Release/mzTab_2_0-M_mapping-xcms.xml

jorainer commented on August 30, 2024

I'm a bit late to the party ;)

"Please note that I added a charge of 1 in the SMF table to all features. This is the (net) charge (positive integer) of the ion / m/z."

After xcms preprocessing we actually only have a feature table with (semi-quantitative) abundances, each feature being characterized by an m/z and a retention time value. No additional information (such as charge) is available at this stage (we would assume that most features have charge one, but we don't know). That, along with other information such as adduct or compound annotation, would need to be added by separate software further downstream in the analysis.

I don't know what's better here: changing the definition of mzTab-M or simply putting in (like you did) some best-guess defaults.

For our test file, did I understand correctly that you had to tweak/change the validator to be able to read an mzTab-M file without SML?

nilshoffmann commented on August 30, 2024

Understood. That would mean that we need to change charge to optional (nullable) in the spec and update the schema and validator implementation. I would not recommend putting in best guesses: that might lead to confusion about what is meant, and without a clearly defined way of encoding this kind of information, people and tools will each pick their own way to interpret it.

For the example I provided two days ago, I did not adapt the validator; I only changed your file and provided a different semantic mapping file, all linked further up in this thread. But to be able to validate your file without charges, we will need to alter the spec, the schema and the implementation.
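
To make the nullable-charge idea concrete, here is a minimal Python sketch of writing a feature-only SMF table in which every value that xcms cannot supply is written as null. The column order is an assumption based on the mzTab-M 2.0 specification and should be checked against it; the feature dictionary layout and the function names are hypothetical. As discussed above, a null charge does not pass the current validator.

```python
# Fixed SMF columns (assumed order, verify against the mzTab-M 2.0 spec);
# abundance_assay[n] columns are appended per assay.
SMF_COLUMNS = [
    "SFH", "SMF_ID", "SME_ID_REFS", "SME_ID_REF_ambiguity_code",
    "adduct_ion", "isotopomer", "exp_mass_to_charge", "charge",
    "retention_time_in_seconds", "retention_time_in_seconds_start",
    "retention_time_in_seconds_end",
]


def fmt(value):
    """Write unknown values as the literal string 'null'."""
    return "null" if value is None else str(value)


def smf_table(features, n_assays):
    """Build tab-separated SFH/SMF lines from plain feature dictionaries.

    Each feature needs 'mz', 'rt' and 'abundances'; 'charge', 'rt_start'
    and 'rt_end' are optional and default to null (hypothetical layout).
    """
    header = SMF_COLUMNS + [f"abundance_assay[{i}]" for i in range(1, n_assays + 1)]
    rows = ["\t".join(header)]
    for smf_id, feat in enumerate(features, start=1):
        cells = [
            "SMF", smf_id,
            None, None,          # no evidence references: nothing identified
            None, None,          # adduct and isotopomer are unknown
            feat["mz"],
            feat.get("charge"),  # None becomes null instead of a guessed 1
            feat["rt"], feat.get("rt_start"), feat.get("rt_end"),
        ] + list(feat["abundances"])
        rows.append("\t".join(fmt(cell) for cell in cells))
    return rows


EXAMPLE_ROWS = smf_table(
    [{"mz": 205.097, "rt": 123.4, "abundances": [1000.0, None]}],
    n_assays=2,
)
print("\n".join(EXAMPLE_ROWS))
```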

philouail commented on August 30, 2024

Hi Nils,

Thanks for the feedback and the help.
I completely agree with your perspective. I believe making certain elements nullable would allow a more intermediate-style file format, which fits xcms better. I just wanted to summarize the points that need to be addressed to create this intermediate-style format.

Regarding the mzTab recommendation found here: https://hupo-psi.github.io/mzTab/2_0-metabolomics-release/mzTab_format_specification_2_0-M_release.html#metadata-section and the validator, there are a few points to address:

  • small_molecule-quantification_unit: although it's marked as mandatory, it seems unnecessary for xcms results if the SML table has no entries.
  • small_molecule-identification_reliability: this was added to our file, but it is only mandatory in certain cases and not relevant for xcms results.
  • id_confidence_measure[1-n]: similarly, this is mandatory but not relevant for xcms results.

What do you think? I believe these requirements could probably be relaxed in both the file format definition and the validator.

Furthermore:

  • reliability in SML: it's nullable in the mzTab file format definition, and it would make sense to change it to optional in the validator for xcms.
  • charge in SMF: Making it optional in the validator, as you suggested, is also necessary.
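
A minimal Python sketch of checking a file for the metadata keys listed above: it reads the tab-separated MTD lines and reports whether each key is missing, null, or set. The key list mirrors this discussion rather than the full specification, the function names are hypothetical, and the MS:1000131 term in the example line is only illustrative and should be verified against the MS CV.

```python
# Metadata keys discussed in this thread. Whether each one stays mandatory is
# exactly what is being debated, so treat this list as an assumption.
KEYS_UNDER_DISCUSSION = [
    "small_molecule-quantification_unit",
    "small_molecule_feature-quantification_unit",
    "small_molecule-identification_reliability",
    "id_confidence_measure[1]",
]


def read_metadata(lines):
    """Collect MTD key/value pairs from the lines of an mzTab-M file."""
    metadata = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3 and fields[0] == "MTD":
            metadata[fields[1]] = fields[2]
    return metadata


def report_missing_or_null(metadata, keys=KEYS_UNDER_DISCUSSION):
    """Map each key to 'missing', 'null', or 'set'."""
    status = {}
    for key in keys:
        value = metadata.get(key)
        if value is None:
            status[key] = "missing"
        elif value.strip() == "null":
            status[key] = "null"
        else:
            status[key] = "set"
    return status


# Illustrative lines only; a real file contains many more MTD entries.
EXAMPLE_LINES = [
    "MTD\tmzTab-version\t2.0.0-M\n",
    "MTD\tsmall_molecule_feature-quantification_unit\t"
    "[MS, MS:1000131, number of detector counts, ]\n",
    "MTD\tid_confidence_measure[1]\tnull\n",
]
STATUS = report_missing_or_null(read_metadata(EXAMPLE_LINES))
for key, state in STATUS.items():
    print(f"{key}: {state}")
```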

I will adapt my code for some of the other changes you made that make sense in the context of xcms. But in terms of general structure, was it fine for you? Would you rather we wait until the validator is adapted before we publish this export method for xcms results?

philouail commented on August 30, 2024

Ideally this would be the intermediate file that xcms would provide:
test.mztab.txt

How does this look? (I mainly changed the metadata so that it makes sense for an xcms output.) Do you want us to enforce the input of some other variables?
Also note that we allow optional columns to be passed in the SMF table, as xcms can provide more information than the file format asks for. These would of course follow the required opt_column_name format.
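
As a sketch of how such optional columns could be named, the Python helper below turns an arbitrary column name into an opt_-prefixed header. The "global" identifier and the set of permitted characters are assumptions based on the optional-column rules of the mzTab-M 2.0 specification and should be checked against it; `opt_column_name` is a hypothetical helper, and the xcms column names are only examples.

```python
import re

# Characters assumed to be permitted in optional column names; anything else
# is replaced by an underscore.
_DISALLOWED = re.compile(r"[^A-Za-z0-9_\-\[\]:]")


def opt_column_name(name, identifier="global"):
    """Build an optional column header of the form opt_<identifier>_<name>."""
    cleaned = _DISALLOWED.sub("_", name.strip())
    return f"opt_{identifier}_{cleaned}"


# Example: extra per-feature columns that xcms could export.
EXTRA_COLUMNS = ["mzmin", "mzmax", "rtmin", "rtmax", "npeaks"]
OPT_HEADERS = [opt_column_name(col) for col in EXTRA_COLUMNS]
print(OPT_HEADERS)
```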

sneumann commented on August 30, 2024

Hi @nilshoffmann,
how are other software tools handling this? We have examples from MS-Dial in gcms_tms_height, which also had to shoehorn unidentified features into SML.
I haven't found an mzMine3 example yet.
Yours, Steffen
