Comments (5)
For ChEBI (2018-12-03):
- No. of compounds: 46765
- No. of compounds without InChI: 7075
- No. of compounds with InChI: 39690
- No. of unique InChIs: 38946
So we don't have an InChI for every compound, and some compounds share the same InChI! Apart from the name and the ID, however, these compounds are identical:
      compound_id  compound_name
8564  CHEBI:17775  7,9-dihydro-1H-purine-2,6,8(3H)-trione
18506 CHEBI:46811  2,6-dihydroxy-7,9-dihydro-8H-purin-8-one
18507 CHEBI:46814  9H-purine-2,6,8-triol
18509 CHEBI:46817  7H-purine-2,6,8-triol
18513 CHEBI:46823  1H-purine-2,6,8-triol
27249 CHEBI:62589  6-hydroxy-1H-purine-2,8(7H,9H)-dione

The remaining columns have the same value in all six rows:
inchi      InChI=1S/C5H4N4O3/c10-3-1-2(7-4(11)6-1)8-5(12)9-3/h(H4,6,7,8,9,10,11,12)
inchi_key  LEHOTFFKMJEONL-UHFFFAOYSA-N
formula    C5H4N4O3
mass       168.028
The question is whether these compounds would have different MS2 spectra. If so, it would not make sense to combine them!
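As a rough illustration of how the counts and the duplicate groups above can be derived, here is a minimal Python sketch. It assumes the compounds are available as (ID, name, InChI) tuples with `None` for a missing InChI; the `records` list and the `summarize` helper are hypothetical and are not part of CompoundDb.

```python
from collections import defaultdict

# InChI shared by the duplicated ChEBI entries shown in the table above.
SHARED_INCHI = (
    "InChI=1S/C5H4N4O3/c10-3-1-2(7-4(11)6-1)8-5(12)9-3"
    "/h(H4,6,7,8,9,10,11,12)"
)

# Hypothetical input: a handful of rows taken from the tables in this thread.
records = [
    ("CHEBI:17775", "7,9-dihydro-1H-purine-2,6,8(3H)-trione", SHARED_INCHI),
    ("CHEBI:46814", "9H-purine-2,6,8-triol", SHARED_INCHI),
    ("CHEBI:46817", "7H-purine-2,6,8-triol", SHARED_INCHI),
    ("CHEBI:10036", "wax ester", None),
    ("CHEBI:10545", "electron", None),
]


def summarize(records):
    """Count missing and unique InChIs and collect IDs that share an InChI."""
    ids_by_inchi = defaultdict(list)
    missing = 0
    for compound_id, _name, inchi in records:
        if inchi is None:
            missing += 1
        else:
            ids_by_inchi[inchi].append(compound_id)
    # Keep only the InChIs that are used by more than one compound ID.
    duplicated = {k: v for k, v in ids_by_inchi.items() if len(v) > 1}
    return {
        "total": len(records),
        "without_inchi": missing,
        "with_inchi": len(records) - missing,
        "unique_inchis": len(ids_by_inchi),
        "duplicated": duplicated,
    }


summary = summarize(records)
```

The `duplicated` entry maps each shared InChI to the IDs that use it, which is the grouping one would need before deciding whether to merge those entries.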
Some of the compounds without an InChI are listed below:
    compound_id   compound_name             inchi  inchi_key  formula                   mass
3   CHEBI:10003   ribostamycin sulfate      <NA>   <NA>       C17H34N4O10.(H2O4S)n      NA
15  CHEBI:10036   wax ester                 <NA>   <NA>       CO2R2                     43.990
91  CHEBI:10283   2-hydroxy fatty acid      <NA>   <NA>       C2H3O3R __ C2H3O3R(CH2)n  75.008
140 CHEBI:10545   electron                  <NA>   <NA>       <NA>                      0.000
148 CHEBI:10583   kappa-carrageenan         <NA>   <NA>       (C12H17O12S)n             NA
154 CHEBI:106304  sphingomyelin d18:1/16:0  <NA>   <NA>       C39H79N2O6P               702.568
from compounddb.
Take CHEBI:46814 and CHEBI:46817, for instance (and I suspect the same holds for the rest of them): at first glance they are not the same chemical, because a hydrogen sits at a different position, but in fact they are tautomers of each other. This is also indicated in the ChEBI entries of some of them if you look them up in ChEBI. That means they readily convert from one to the other without any external input (energy or otherwise) and thus should really be thought of as a mixture of all of them. The MS2 spectrum "should" be similar if not identical, but the actual ionization conditions (pH, buffer ions etc.) might also have a big effect, leading to different MS2 spectra.
Here I would suggest getting input from people who actually work with tautomers to hear what they have to say about it.
Thanks for your input @SiggiSmara ! I'll try to get some input from people actually working with MS2 spectra and identification.
I have no experience with tautomers, but one option could be to use the SMILES, where this is explicit. You can also generate a non-standard InChI with the fixed-H layer from the SMILES.
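To make the difference concrete, the following Python sketch inspects an InChI string with the standard library only. It relies on three conventions of the InChI format: a standard InChI carries the version prefix `1S`, mobile hydrogens appear as parenthesized groups such as `(H4,6,7,...)` in the `/h` layer, and a fixed-H layer is introduced by `/f`. The `describe_inchi` helper is hypothetical and handles only the simple uncharged mobile-H pattern shown in this thread.

```python
import re


def describe_inchi(inchi):
    """Report whether an InChI is standard, has a fixed-H layer,
    and which mobile-hydrogen groups its main /h layer contains."""
    body = inchi.split("=", 1)[1]
    version, _formula, *layers = body.split("/")
    # First layer starting with 'h' is the main hydrogen layer.
    h_layer = next((l[1:] for l in layers if l.startswith("h")), "")
    mobile = []
    # '(H4,6,7,8)' means 4 hydrogens shared among atoms 6, 7 and 8;
    # a bare '(H,1,2)' means a single shared hydrogen.
    for count, atoms in re.findall(r"\(H(\d*),([\d,]+)\)", h_layer):
        mobile.append((int(count or 1), [int(a) for a in atoms.split(",")]))
    return {
        "standard": version.endswith("S"),
        "has_fixed_h_layer": any(l.startswith("f") for l in layers),
        "mobile_h_groups": mobile,
    }


info = describe_inchi(
    "InChI=1S/C5H4N4O3/c10-3-1-2(7-4(11)6-1)8-5(12)9-3"
    "/h(H4,6,7,8,9,10,11,12)"
)
```

For the InChI shared by the six ChEBI entries, all four hydrogens are reported as mobile across atoms 6 to 12 and there is no fixed-H layer, which is why the standard InChI cannot tell the tautomers apart.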
I also had feedback from Steffen. They use the same approach as PubChem: a compound table with unique InChIs and a substance table with additional annotations (possibly multiple entries per compound).
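A minimal sketch of such a two-table layout, using Python's built-in `sqlite3` module, could look like this. The table names, column names and the `add_substance` helper are assumptions made for illustration; they are neither the CompoundDb schema nor PubChem's.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    -- One row per unique structure (hypothetical schema).
    CREATE TABLE compound (
        compound_pk INTEGER PRIMARY KEY,
        inchi       TEXT NOT NULL UNIQUE
    );
    -- One row per source entry; several rows may point to one compound.
    CREATE TABLE substance (
        source_id   TEXT PRIMARY KEY,
        name        TEXT NOT NULL,
        compound_pk INTEGER NOT NULL REFERENCES compound(compound_pk)
    );
    """
)


def add_substance(con, source_id, name, inchi):
    """Insert a source entry, reusing the compound row if the InChI exists."""
    con.execute("INSERT OR IGNORE INTO compound (inchi) VALUES (?)", (inchi,))
    (pk,) = con.execute(
        "SELECT compound_pk FROM compound WHERE inchi = ?", (inchi,)
    ).fetchone()
    con.execute(
        "INSERT INTO substance (source_id, name, compound_pk) VALUES (?, ?, ?)",
        (source_id, name, pk),
    )
    return pk


inchi = (
    "InChI=1S/C5H4N4O3/c10-3-1-2(7-4(11)6-1)8-5(12)9-3"
    "/h(H4,6,7,8,9,10,11,12)"
)
add_substance(con, "CHEBI:46814", "9H-purine-2,6,8-triol", inchi)
add_substance(con, "CHEBI:46817", "7H-purine-2,6,8-triol", inchi)
add_substance(con, "CHEBI:46823", "1H-purine-2,6,8-triol", inchi)

n_compounds = con.execute("SELECT COUNT(*) FROM compound").fetchone()[0]
n_substances = con.execute("SELECT COUNT(*) FROM substance").fetchone()[0]
```

With this layout the three tautomer entries keep their own IDs and names in `substance` while sharing a single row in `compound`, so nothing from the source database is lost by deduplicating on the InChI.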