mibig-secmet / mibig-json
Repository to track changes in MIBiG curation data stored in JSON format
Affected BGC
BGC0000065
Describe the error
Comment field contains an escaped newline character.
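A minimal sketch of how this kind of problem could be spotted automatically, assuming an entry is loaded as ordinary nested JSON. The helper name `find_escaped_newlines` and the sample record layout are hypothetical and are not taken from the MIBiG schema.

```python
import json


def find_escaped_newlines(node, path=""):
    """Return paths of string values holding a literal backslash followed by 'n'."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits.extend(find_escaped_newlines(value, f"{path}/{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_escaped_newlines(value, f"{path}/{index}"))
    elif isinstance(node, str) and "\\n" in node:
        # "\\n" is the two-character sequence backslash + n, not a real newline.
        hits.append(path)
    return hits


# Hypothetical record: the field names are assumptions for illustration only.
raw = r'{"comments": "first line\\nsecond line", "cluster": {"mibig_accession": "BGC0000065"}}'
entry = json.loads(raw)
print(find_escaped_newlines(entry))
```

A real newline inside a string is not reported; only the doubly escaped form is.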
BGC0001668
I wanted to suggest updating the information on the mic cluster to include the compound dehydromicroperfuranone, which is the product of micA (ANIA_03396) and micB (ANIA_03394). SMILES: O=C1C(CC2=CC=CC=C2)=C(C3=CC=CC=C3)C(O1)=O
Also updating ANIA_03394: Function: cytochrome P450 monooxygenase. Evidence: Gene overexpression.
To support this, also adding in the literature Roux et al 2020 (reference below).
I also would suggest referring to "EQ-4" as microperfuranone, as this is the name of the compound used in Yeh et al 2012 and the rest of Aspergillus literature. Thanks!
CRISPR-Mediated Activation of Biosynthetic Gene Clusters for Bioactive Molecule Discovery in Filamentous Fungi
Indra Roux, Clara Woodcraft, Jinyu Hu, Rebecca Wolters, Cameron L. M. Gilchrist, and Yit-Heng Chooi
ACS Synthetic Biology 2020 9 (7), 1843-1854
DOI: 10.1021/acssynbio.0c00197
MIBiG ID
BGC0001668
Short summary
Add new publication.
Change name of compound.
New publication confirms the role of MicB (P450) in pathway
Name of compound for this entry is currently EQ-4, following the name in NPAtlas (discovered in another fungus), though it's known as microperfuranone in literature. This new paper names the product of MicA-B as dehydromicroperfuranone (not in NPAtlas)
Literature references
PMID: 32526136
Affected BGC
BGC0001018
Describe the error
I believe this BGC has been misannotated as a hybrid NRP/Polyketide. I don't see any evidence for a PKS in the biosynthesis described in the paper (although the existence of hybrid microcystins is discussed), and there are no PKS genes in the cluster.
Affected BGC
BGC0001476
Describe the error
Wrong citation link, want either PMID: 29551347 or DOI: 10.1016/j.chembiol.2018.02.008
Affected BGC
BGC0000236
Describe the error
The NCBI locus has changed from AY228175.1 to AH012623.2.
See message in https://www.ncbi.nlm.nih.gov/nuccore/AY228175.1
Fixed in a branch: PR incoming.
Affected BGC
BGC0000335
Describe the error
Misclassified as NRP. Stated in the abstract of the paper:
The gene cluster for cystomanamide biosynthesis was identified by gene disruption as PKS/NRPS hybrid
Affected BGC
BGC0001967
Describe the error
"publications": [
- "doi:10.1128/AEM.01292-1"
+ "doi:10.1128/AEM.01292-19"
]
MIBiG ID
BGC0002009 (kanglemycin)
Short summary
Add the Antibacterial compound activity to kanglemycin.
Literature references
(this is the paper supporting the BGC0002009 annotation, and it described kanglemycin as having an antibacterial activity against Mycobacterium tuberculosis).
Affected BGC
BGC0000553
Describe the error
Cluster is reported to be complete, but the cluster doesn't contain a LanKC enzyme, which is required for an AmfS-like class III lanthipeptide.
Affected BGC
BGC0001077
Describe the error
Cluster is not a terpene BGC
Affected BGC
BGC0001284
Describe the error
Misclassified as a Terpene. Should be reclassified as Polyketide; this is stated right in the title of the paper: https://www.ncbi.nlm.nih.gov/pubmed/26025896
BGC0000723
The compound produced by this BGC should be validamycin A, not β-D-galactosylvalidoxylamine-A; the citation reports the discovery of validamycin A and does not mention the latter anywhere.
Affected BGC
BGC0001336
Describe the error
Formula of erdasporine A is invalid, the formula of erdasporine C is listed twice.
Affected BGC
BGC0001891
Describe the error
The coordinates for the cluster seem off. In the cluster described in https://www.pnas.org/content/early/2020/04/06/1918759117, orf4 starts at 3,800,943 and orf1 ends at 3,818,561 (the cluster is in the opposite direction on the genome compared to the paper).
Affected BGC
BGC0001230
Describe the error
The annotation as a hybrid Polyketide/NRP is likely incorrect, or at least inconsistent with the related cluster BGC0001766 (both are salinamide producers). There is one KS-associated gene, but the paper doesn't mention any clear biosynthetic role for it: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4363298/
Affected BGC
BGC0001966
Describe the error
Perhaps should be re-classified as NRP/PKS hybrid (currently NRP-only)
From the paper:
We describe the NRPS-t1PKS cluster ‘BIIRfg’... which involves the formation of the lipid part by BIIRfg_PKS
Affected BGC
BGC0001497
Describe the error
Annotation is Polyketide. Should probably be NRP (see homolog cluster BGC0001442, which is correct).
Affected BGC
BGC0000535
Describe the error
Annotation in the "genes" tab lists genes twice. Once with the locations from the GenBank file, once as "external" with the manual annotations.
Affected BGC
BGC0002025 and BGC0002026
Describe the error
Missing hybrid Polyketide biosynthetic class: thaxteramides identified as hybrid PKS/NRPS in the abstract of https://www.ncbi.nlm.nih.gov/pubmed/31184172
Affected BGC
BGC0000176
Describe the error
Classified as PKS/NRPS, but the structure doesn't seem to contain any NRPS part.
Affected BGC
BGC0001961
Describe the error
The reference DOI is 10.1039/C8SC05670, but should be 10.1039/C8SC05670F.
Thanks to Dr Nicola Convine from the Systems Support Team of the Royal Society of Chemistry for the initial report.
Affected BGC
BGC0001878
BGC0001882
BGC0001891
BGC0001896
BGC0001909
BGC0001913
BGC0001915
BGC0001926
BGC0001930
BGC0001938
BGC0001942
BGC0001946
BGC0001956
BGC0001957
BGC0001961
BGC0001967
BGC0001969
BGC0001984
BGC0001985
BGC0001986
BGC0001995
BGC0001997
BGC0001999
BGC0002011
BGC0002012
BGC0002018
BGC0002034
BGC0002036
Describe the error
I initially reported this problem in issue #21, but it's more pervasive: for many [all?] of the recently added clusters, DOI linkouts to publications are missing the final character. These hyperlinks are, of course, generating errors from doi.org.
There is an additional problem with BGC0001878: the bioRxiv link, but not the DOI, contains the version suffix (10.1101/445270v1 => 10.1101/445270).
I've fixed the BGCs listed above (PR incoming), but this may not be comprehensive.
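As a rough illustration of the bioRxiv part of this fix, the sketch below strips a trailing version suffix from `10.1101/...` DOIs in a list of publication strings. The `doi:` prefix follows the form used in the entries; the function names and the regular expression are assumptions made for this example. A DOI that has merely lost its final character cannot be recognised by pattern alone, so that case would still need each DOI to be resolved and checked.

```python
import re

# Assumed pattern: a bioRxiv DOI followed by a version suffix such as "v1".
BIORXIV_VERSIONED = re.compile(r"^(10\.1101/[\d.]+)v\d+$")


def strip_biorxiv_version(doi):
    """Drop a trailing 'vN' from a bioRxiv DOI; return other DOIs unchanged."""
    match = BIORXIV_VERSIONED.match(doi)
    return match.group(1) if match else doi


def clean_publications(publications):
    """Apply the suffix fix to every 'doi:'-prefixed reference in a list."""
    cleaned = []
    for ref in publications:
        prefix, sep, value = ref.partition(":")
        if sep and prefix == "doi":
            cleaned.append("doi:" + strip_biorxiv_version(value))
        else:
            # Leave PMIDs and any other reference types untouched.
            cleaned.append(ref)
    return cleaned


print(clean_publications(["doi:10.1101/445270v1", "doi:10.1039/C8SC05670F"]))
```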
Affected BGC
BGC0001857/BGC0001978
Describe the error
Related burnettramic acid-producing clusters are annotated as different classes. The title of the paper identifies class as NRPS/PKS https://pubs.acs.org/doi/10.1021/acs.orglett.8b04042.
BGC0001978 is correct but BGC0001857 is classified as Polyketide/Alkaloid.
BGC0000341
Hi there,
The structure for this cluster (encoding enduracidin production) is not correct. Everything else looks fine to me.
Best,
Logan
--
Logan W. MacIntyre
Postdoctoral Fellow
Laboratory of Genetically Encoded Small Molecules
The Rockefeller University
Affected BGC
BGC0001395
Describe the error
The structure in the entry (and in PubChem) is for teicoplanin, not teichomycin.
Affected BGC
Describe the error
Both BGCs are reported to produce a compound named fusarin, with SMILES C/C=C(/C)\C=C(/C)\C=C(/C)\C=C\C=C(/C)\C(=O)[C@@]12[C@@H](O1)[C@](NC2=O)(CCO)O. However, the references for both clusters indicate they actually produce fusarin C, whose SMILES is C/C=C(\C=C(/C)\C=C(/C)\C=C\C=C(/C)\C(=O)[C@@]12[C@@H](O1)[C@](NC2=O)(CCO)O)/C(=O)OC.
Affected BGC
BGC0001839
Describe the error
Cluster is missing many genes. See complete squalestatin cluster BGC0001339.
Publication is incorrect or incomplete; the only reference is related to BGC0001339 (which describes the cluster in Phoma sp and never mentions Aspergillus Z5)
Affected BGC
BGC0000386
Describe the error
The displayed genomic region is from the wrong chromosome. Displayed is 2,748,130 - 2,778,630 from CP000085.1, while the biosynthetic genes for the NRPS are located on CP000086 (https://www.ncbi.nlm.nih.gov/nuccore/CP000086): ABC36450 and ABC37099.
MIBiG ID
BGC0001845
Short summary
Comment mentions a Nat. Prod. Commun. paper from 2013 that should be in the publications list
Literature references
PMID 24354182
Affected BGC
BGC0001125 OR
BGC0001951-BGC0001953
Describe the error
Puwainaphycin/Minutissamide clusters clearly have genes with KS domains. A 2014 PLOS ONE paper classifies BGC0001125 as NRP/PKS. However, clusters BGC0001951-BGC0001953 are listed in MIBiG as NRP only. A 2019 AEM paper discusses the PKS genes in the abstract but only uses the lipopeptide classifier. This is a judgement call, but the entries should be consistent.
Affected BGC
BGC0000932 and BGC0001362
Describe the error
Double entry of the tropodithietic acid biosynthetic gene cluster from Phaeobacter inhibens DSM 17395; could the entries be merged, as they contain different information?
Thanks, Eva
Affected BGC
BGC0001346
Describe the error
No valid citation for the record.
Affected BGC
BGC0001916 and BGC0001649
Describe the error
Exact duplicates.
Affected BGC
BGC0001932
Describe the error
Duplicate of BGC0001662, just with a typo in the compound name.
Affected BGC
BGC0001310
Describe the error
Listed as a terpene, but is a PKS type III
Affected BGC
BGC0001982
Describe the error
This is an exact duplicate of BGC0001881
Affected BGC
BGC0001773
Describe the error
Two genes are missing from the cluster: selE and selDIII. Both are just beyond the current boundary of the cluster at selI. With these additions the pathway is complete.
Affected BGC
BGC0001659
Describe the error
NPAtlas link does not match compound when searching by name
Affected BGC
Describe the error
All these clusters have the PMID:27813109 (Isolation of a new antibacterial peptide actinokineosin from Actinokineospora spheciospongiae based on genome mining), but this is actually the paper reference for BGC0001496, the actinokineosin cluster.
The right paper is most likely doi:10.1002/ajoc.201700433 (Isolation and Structure Determination of New Antibacterial Peptide Curacomycin Based on Genome Mining).
Affected BGC
BGC0001804
Describe the error
BGC0001378 and BGC0001804 describe the same cluster from the same GenBank record
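Duplicate reports like this one could in principle be caught by grouping entries on their source locus. The sketch below assumes each entry has been reduced to a small dictionary with an identifier and a locus (accession plus optional coordinates). These key names, the helper `group_by_locus`, and the sample identifiers and accessions are all hypothetical placeholders, not real MIBiG records or schema fields.

```python
from collections import defaultdict


def group_by_locus(entries):
    """Return loci shared by more than one entry, mapped to the entry ids."""
    groups = defaultdict(list)
    for entry in entries:
        locus = entry["locus"]
        # Entries without coordinates are keyed on the accession alone.
        key = (locus["accession"], locus.get("start"), locus.get("end"))
        groups[key].append(entry["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Placeholder data for illustration only.
entries = [
    {"id": "BGC-A", "locus": {"accession": "XX000001.1", "start": 1, "end": 5000}},
    {"id": "BGC-B", "locus": {"accession": "XX000001.1", "start": 1, "end": 5000}},
    {"id": "BGC-C", "locus": {"accession": "XX000002.1"}},
]
print(group_by_locus(entries))
```

Exact matching on coordinates will miss duplicates whose boundaries differ slightly, so an overlap test may be a better fit in practice.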
Affected BGC
BGC0000315
Describe the error
In the gene annotation tab, the evidence for function always contains "activity assay" twice when the other evidence is "knock-out"
Affected BGC
BGC0000145
Describe the error
This probably should be reclassified as NRP/Polyketide hybrid, to be consistent with BGC0001041 (also a salinosporamide A producer)
MIBiG ID
Short summary
Species for this BGC is currently listed as Streptomyces diastatochromogenes Tü6028. Should be Candidatus Entotheonella factor
Literature references
PMID: 24476823 DOI: 10.1038/nature12959
Affected BGC
BGC0001072
Describe the error
PKS module descriptions are incorrect.
Affected BGC
BGC0001415
Describe the error
Probably should be reannotated as hybrid NRP/PKS. From the abstract of the paper:
[Bioinformatics] predicted a hybrid non-ribosomal peptide synthetase-polyketide synthase (NRPS-PKS) assembly line
Affected BGC
BGC0001044
Describe the error
Classified as a Polyketide + NRPS, but no PKS genes present and the first reference only mentions a single ketide starter unit.
MIBiG ID
BGC0000147
Short summary
Based on structure and release type, the PKS subclass should be "macrolactone"
Affected BGC
BGC0000548
Describe the error
Not sure what structure we link to, but it's not Salivaricin A
Affected BGC
BGC0002055-BGC0002057
Describe the error
I think there are annotation errors in the JSON files added today by @satriaphd (commit 41401a0).
BK012115.1 does not exist, and seems inconsistent with the loci of the other two records (JAAHTG010000029.1, JAAHTH010000242.1).