Comments (8)
Well, this is why I'm working with the MassBank record format here. It is rich in metadata and human-readable, but also easy to parse thanks to partially controlled variables. The functions I wrote read from this format into a Spectra object and will also write back to a MassBank record.
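To illustrate why the format is easy to parse, here is a minimal Python sketch that reads a small subset of a MassBank-style record: `TAG: value` lines, a `PK$PEAK` block with indented peak lines, and the `//` end marker. It is not the masstrixR code; the function name `parse_massbank_record` and the sample record (including its accession and peak values) are made up for illustration, and real records have subtags and multi-line fields that this sketch ignores.

```python
def parse_massbank_record(text):
    """Parse a MassBank-style record into (metadata, peaks).

    metadata maps each tag to a list of values, because tags such as
    CH$NAME may repeat. peaks is a list of (mz, intensity) tuples taken
    from the indented lines that follow PK$PEAK.
    """
    metadata = {}
    peaks = []
    in_peaks = False
    for line in text.splitlines():
        if line.strip() == "//":  # end-of-record marker
            break
        if in_peaks and line.startswith(" "):
            fields = line.split()
            if len(fields) >= 2:  # m/z, intensity, (relative intensity)
                peaks.append((float(fields[0]), float(fields[1])))
            continue
        in_peaks = False
        tag, sep, value = line.partition(": ")
        if not sep:
            continue  # not a "TAG: value" line, skip it
        if tag == "PK$PEAK":
            in_peaks = True  # the following indented lines hold the peaks
            continue
        metadata.setdefault(tag, []).append(value.strip())
    return metadata, peaks


# Hypothetical record, invented for this example only.
record = """ACCESSION: XX000001
CH$NAME: Example compound
CH$NAME: Another name
AC$MASS_SPECTROMETRY: MS_TYPE MS2
PK$NUM_PEAK: 2
PK$PEAK: m/z int. rel.int.
  59.013 1200.0 120
  89.024 9990.0 999
//
"""
metadata, peaks = parse_massbank_record(record)
```

Keeping every tag as a list avoids special-casing repeated fields; a fuller reader would split subtags such as `MS_TYPE` into their own keys.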
By the way: I implemented two new spectra comparison methods in masstrixR. One is a standard forward score (aka dot product) that aligns the spectra instead of binning them; the second is a reverse score (reverse dot product), which uses only peaks that are in the library spectrum. If the match is good, both should be quite high. If the forward score is low and the reverse score is high, then either you have a lot of "contaminating" peaks in your query spectrum or it is just the wrong hit.
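The difference between the two scores can be sketched in Python as follows. This is not the masstrixR implementation: the greedy nearest-peak alignment, the absolute m/z tolerance `tol`, the unweighted cosine, and all function names are assumptions chosen to keep the example short.

```python
import math


def align_peaks(query, library, tol=0.01):
    """Pair each library peak with the closest unused query peak within tol.

    Returns (pairs, unmatched): pairs holds (query_intensity,
    library_intensity) per library peak, with 0.0 where no query peak
    matched; unmatched holds intensities of query peaks left without a partner.
    """
    used = set()
    pairs = []
    for lib_mz, lib_int in library:
        best = None
        for i, (q_mz, _) in enumerate(query):
            if i in used:
                continue
            diff = abs(q_mz - lib_mz)
            if diff <= tol and (best is None or diff < best[1]):
                best = (i, diff)
        if best is None:
            pairs.append((0.0, lib_int))
        else:
            used.add(best[0])
            pairs.append((query[best[0]][1], lib_int))
    unmatched = [inten for i, (_, inten) in enumerate(query) if i not in used]
    return pairs, unmatched


def cosine(pairs):
    """Normalized dot product of the paired intensity vectors."""
    dot = sum(q * l for q, l in pairs)
    norm_q = math.sqrt(sum(q * q for q, _ in pairs))
    norm_l = math.sqrt(sum(l * l for _, l in pairs))
    if norm_q == 0 or norm_l == 0:
        return 0.0
    return dot / (norm_q * norm_l)


def forward_score(query, library, tol=0.01):
    """Use all peaks: unmatched query peaks count against the match."""
    pairs, unmatched = align_peaks(query, library, tol)
    return cosine(pairs + [(inten, 0.0) for inten in unmatched])


def reverse_score(query, library, tol=0.01):
    """Use only library peaks: extra query peaks are ignored."""
    pairs, _ = align_peaks(query, library, tol)
    return cosine(pairs)


# Made-up spectra as (m/z, intensity); the query has one extra peak at m/z 120.
library = [(100.0, 50.0), (150.0, 100.0)]
query = [(100.001, 50.0), (120.0, 80.0), (150.002, 100.0)]
forward = forward_score(query, library)
reverse = reverse_score(query, library)
```

With these made-up spectra the reverse score is 1 (up to floating-point rounding) while the forward score is lower, which is the "contaminating peaks" pattern described above.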
from compounddb.
Yes, those don't have spectra.
A little complication from HMDB: HMDB provides one XML file for each spectrum associated with a compound. It can happen that the same spectrum (same values) is associated with different compounds. HMDB then uses the same spectrum_id but provides two (or more) XML files, one for each compound_id.
A complicated solution to handle this would be:
- Insert only unique spectra into the spectrum table.
- Add an additional table providing the mapping between the spectrum and compound tables (to handle the n:m mapping).
Disadvantage: queries are more complicated and possibly slower.
Simple solution:
- Insert each spectrum as it is provided, but assign our own internal, unique ID to each spectrum.
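The two options can be compared with a small Python/SQLite sketch. The table and column names (`compound_spectrum`, `spectrum_simple`, `original_spectrum_id`, `peaks`) and the sample rows are assumptions made for this example, not the actual CompoundDb schema or real HMDB data.

```python
import sqlite3

# Hypothetical input: one row per HMDB XML file as
# (compound_id, spectrum_id, peak data). Values are invented.
files = [
    ("cmp_A", 10, "59.0:100;89.0:999"),
    ("cmp_B", 10, "59.0:100;89.0:999"),  # same spectrum, second compound
    ("cmp_B", 11, "41.0:500"),
]

con = sqlite3.connect(":memory:")

# Option 1: unique spectra plus a mapping table for the n:m relation.
con.executescript("""
CREATE TABLE spectrum (spectrum_id INTEGER PRIMARY KEY, peaks TEXT);
CREATE TABLE compound_spectrum (
    compound_id TEXT,
    spectrum_id INTEGER REFERENCES spectrum(spectrum_id),
    PRIMARY KEY (compound_id, spectrum_id)
);
""")
for compound_id, spectrum_id, peaks in files:
    # A spectrum_id that was already inserted is skipped.
    con.execute("INSERT OR IGNORE INTO spectrum VALUES (?, ?)", (spectrum_id, peaks))
    con.execute("INSERT INTO compound_spectrum VALUES (?, ?)", (compound_id, spectrum_id))

# Option 2: one row per file; the database assigns its own unique ID.
con.execute("""
CREATE TABLE spectrum_simple (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    compound_id TEXT,
    original_spectrum_id INTEGER,
    peaks TEXT
)
""")
con.executemany(
    "INSERT INTO spectrum_simple (compound_id, original_spectrum_id, peaks) "
    "VALUES (?, ?, ?)",
    files,
)

# Spectra for one compound: a join in option 1, a plain filter in option 2.
joined = con.execute(
    "SELECT s.spectrum_id FROM spectrum s "
    "JOIN compound_spectrum cs ON cs.spectrum_id = s.spectrum_id "
    "WHERE cs.compound_id = ? ORDER BY s.spectrum_id",
    ("cmp_B",),
).fetchall()
simple = con.execute(
    "SELECT original_spectrum_id FROM spectrum_simple "
    "WHERE compound_id = ? ORDER BY id",
    ("cmp_B",),
).fetchall()
n_unique = con.execute("SELECT COUNT(*) FROM spectrum").fetchone()[0]
n_simple = con.execute("SELECT COUNT(*) FROM spectrum_simple").fetchone()[0]
```

Both queries return the same spectra for a compound. The simple layout stores the shared spectrum twice but needs no join, which is the trade-off described in the list above.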
I have at least a function that can read from MassBank records. Well, that doesn't help with MoNA, but it does with all other MassBank-related records.
Check my masstrixR package later today. There is a branch called masstrixR_RaMoNA_merge. It is based on our in-house tool MassTRIX [1]. There might also be some other useful functions we can use or reuse.
[1] http://dx.plos.org/10.1371/journal.pone.0039860
Cool! Thanks for your input, @michaelwitting. I had a look at the MoNA SDF file and it should be straightforward to extract all relevant information (compound annotations and spectra) from it. It's just a bummer that every database/resource uses its own identifiers and nomenclature.
When you talk about the MassBank record format, where do you get that data? Is it from https://github.com/MassBank/MassBank-data ? Apparently not from MoNA...
Yes, for example. We also use the MassBank format for our internal database.
Regarding the MoNA JSON: it is very inconsistent. When I read data from their web service, I have difficulty getting at the entries I would like to access. Not every library they have uses exactly the same format. Maybe it is different with the JSON files...
Import of open data from MassBank is discussed in issue #34.