
Comments (6)

jorainer commented on August 17, 2024

What you observed is the default behaviour of the compounds function: by default it uses DISTINCT in the SQL call and therefore returns only unique rows. To ensure all values are returned (including duplicates), I would suggest using compounds(cmp_db, c("compound_id", "exactmass")).
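To illustrate why adding compound_id helps, here is a minimal base-R sketch that mimics the DISTINCT behaviour with unique() on a toy data frame. The compound IDs and the third mass are made up for illustration, and this does not call CompoundDb itself; it only shows the row-uniqueness logic.

```r
# Toy stand-in for a compound table (hypothetical IDs; not real CompDb content).
cmps <- data.frame(
  compound_id = c("CMP1", "CMP2", "CMP3"),
  exactmass = c(104.0473, 104.0473, 180.0634),
  stringsAsFactors = FALSE
)

# Selecting only exactmass: duplicated masses collapse into a single row,
# analogous to SELECT DISTINCT exactmass.
only_mass <- unique(cmps[, "exactmass", drop = FALSE])

# Selecting compound_id as well: every row is unique because the ID differs,
# analogous to SELECT DISTINCT compound_id, exactmass.
id_and_mass <- unique(cmps[, c("compound_id", "exactmass")])

nrow(only_mass)
nrow(id_and_mass)
```

Because the identifier column makes each row distinct, no mass is dropped even when two compounds share the same exact mass.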

from compounddb.

jorainer commented on August 17, 2024

@RogerGinBer are you currently working on this? That would be totally fine for me. A contribution in the form of a PR (pull request) would be highly welcome.


RogerGinBer commented on August 17, 2024

Yes, I'm working on it 👍
I believe we should also have a mass2mz method (or perhaps a different function, like formula2mz) that, given a list of formulas and adducts, calculates each formula's neutral mass and then calls mass2mz to generate a formula-by-adduct m/z matrix.
This way, the CompDb method just has to extract the formulas from the object and call formula2mz. Would that make sense?


jorainer commented on August 17, 2024

The CompDb already provides (monoisotopic) masses, so first calculating the masses from the chemical formulas would add a little computational overhead. So I would maybe start with a mass2mz method.

Actually, a formula2mz might be a nice addition for MetaboCoreUtils: it could simply combine the MetaboCoreUtils::calculateMass function (which can also calculate masses from e.g. "[13C3]C3H12O6") with MetaboCoreUtils::mass2mz to calculate the m/z values. If you're interested, you could open a PR with that function in MetaboCoreUtils.
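A rough sketch of what such a helper could look like, assuming MetaboCoreUtils is installed. The name formula2mz_sketch is hypothetical and only chains the two existing functions mentioned above; it is not the implementation that ends up in the package.

```r
library(MetaboCoreUtils)

# Hypothetical helper: formula -> neutral (monoisotopic) mass -> m/z per adduct.
formula2mz_sketch <- function(formula, adduct = "[M+H]+") {
  # Monoisotopic mass for each chemical formula.
  mass <- calculateMass(formula)
  # Matrix of m/z values: one row per mass, one column per adduct.
  mz <- mass2mz(mass, adduct = adduct)
  # Label rows with the input formulas for readability.
  rownames(mz) <- formula
  mz
}

mz <- formula2mz_sketch(c("C6H12O6", "C3H7NO3"),
                        adduct = c("[M+H]+", "[M+Na]+"))
mz
```

For a CompDb that already stores exact masses, calling mass2mz on those masses directly skips the calculateMass step; the formula route is mainly useful where the mass is missing.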


RogerGinBer commented on August 17, 2024

Sure thing! I read in the create-compounddb vignette that the exact_mass column could have NA values, which is why I thought of doing it from the formula. But yes, it makes more sense to use the mass directly when available.

Yes, I'll open an issue at MetaboCoreUtils and start a PR on that


RogerGinBer commented on August 17, 2024

Also, I've found some unexpected behavior with the compounds accessor using the HMDB example CompDb:
compounds(cmp_db, "exactmass") gives only 8 results (it removes one duplicate, 104.0473), while compounds(cmp_db)$exactmass returns all 9.

Is this how it's supposed to work?

