mibig-json's People

Contributors

althonos, bthedragonmaster, kblin, marnixmedema, mmzdouc, satriaphd, sjshaw, zdk123

mibig-json's Issues

Error with BGC0000065

Affected BGC
BGC0000065

Describe the error
Comment field contains an escaped newline character.
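A check for this kind of problem can be sketched in a few lines of Python. This is only an illustration: the key name "comments" and the sample text are assumptions, not taken from the actual BGC0000065 record.

```python
import json


def has_escaped_newline(text: str) -> bool:
    """Return True if text holds a literal backslash followed by 'n'
    (a double-escaped newline) or a real newline character."""
    return "\\n" in text or "\n" in text


# Hypothetical record: the "comments" key and its content are assumed.
# The JSON text contains \\n, which decodes to a backslash plus 'n'.
raw = '{"comments": "First line.\\\\nSecond line."}'
entry = json.loads(raw)

if has_escaped_newline(entry["comments"]):
    print("comment field contains an escaped newline")
```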

BGC0001668 enhancement

BGC0001668

I suggest updating the mic cluster entry to include the compound dehydromicroperfuranone, which is the product of micA (ANIA_03396) and micB (ANIA_03394). SMILES: O=C1C(CC2=CC=CC=C2)=C(C3=CC=CC=C3)C(O1)=O
I also suggest updating ANIA_03394 as follows: Function: cytochrome P450 monooxygenase; Evidence: Gene overexpression.
To support this, please also add Roux et al. 2020 to the literature (reference below).
I would also suggest referring to "EQ-4" as microperfuranone, as this is the name of the compound used in Yeh et al. 2012 and the rest of the Aspergillus literature. Thanks!

CRISPR-Mediated Activation of Biosynthetic Gene Clusters for Bioactive Molecule Discovery in Filamentous Fungi
Indra Roux, Clara Woodcraft, Jinyu Hu, Rebecca Wolters, Cameron L. M. Gilchrist, and Yit-Heng Chooi
ACS Synthetic Biology 2020 9 (7), 1843-1854
DOI: 10.1021/acssynbio.0c00197

BGC0001668 enhancement

MIBiG ID
BGC0001668

Short summary
Add new publication.
Change name of compound.

The new publication confirms the role of MicB (P450) in the pathway.
The compound name for this entry is currently EQ-4, following the name in NPAtlas (discovered in another fungus), though it is known as microperfuranone in the literature. The new paper names the product of MicA-B as dehydromicroperfuranone (not in NPAtlas).

Literature references
PMID: 32526136

Error with BGC0001018

Affected BGC
BGC0001018

Describe the error
I believe this BGC has been misannotated as a hybrid NRP/Polyketide. I don't see any evidence for a PKS in the biosynthesis described in the paper (although the existence of hybrid microcystins is discussed), and there are no PKS genes in the cluster.

Error with BGC0001476

Affected BGC
BGC0001476

Describe the error
Wrong citation link; it should be either PMID: 29551347 or DOI: 10.1016/j.chembiol.2018.02.008

Error with BGC0000335

Affected BGC
BGC0000335

Describe the error
Misclassified as NRP. Stated in the abstract of the paper:

The gene cluster for cystomanamide biosynthesis was identified by gene disruption as PKS/NRPS hybrid

BGC0002009 enhancement

MIBiG ID
BGC0002009 (kanglemycin)

Short summary
Add antibacterial activity to the compound kanglemycin.

Literature references

  • Mode of Action of Kanglemycin A, an Ansamycin Natural Product that Is Active against Rifampicin-Resistant Mycobacterium tuberculosis, Mosaei et al. doi:10.1016/j.molcel.2018.08.028

(This is the paper supporting the BGC0002009 annotation; it describes kanglemycin as having antibacterial activity against Mycobacterium tuberculosis.)

Error with BGC0000553

Affected BGC
BGC0000553

Describe the error
Cluster is reported to be complete, but the cluster doesn't contain a LanKC enzyme, which is required for an AmfS-like class III lanthipeptide.

Invalid compound in BGC0000723

BGC0000723

The compound produced by this BGC should be validamycin A, not β-D-galactosylvalidoxylamine-A; the citation reports the discovery of validamycin A and does not mention the latter anywhere.

Error with BGC0001336

Affected BGC
BGC0001336

Describe the error
The formula of erdasporine A is invalid; the formula of erdasporine C is listed twice.

Error with BGC0001966

Affected BGC
BGC0001966

Describe the error
Perhaps should be re-classified as NRP/PKS hybrid (currently NRP-only)
From the paper:

We describe the NRPS-t1PKS cluster ‘BIIRfg’... which involves the formation of the lipid part by BIIRfg_PKS

Error with BGC0001497

Affected BGC
BGC0001497

Describe the error
Annotation is Polyketide. Should probably be NRP (see homolog cluster BGC0001442, which is correct).

Error with BGC0000535

Affected BGC
BGC0000535

Describe the error
The annotation in the "genes" tab lists genes twice: once with the locations from the GenBank file, and once as "external" with the manual annotations.

Error with BGC0000176

Affected BGC
BGC0000176

Describe the error
Classified as PKS/NRPS, but the structure doesn't seem to contain any NRPS part.

Error with BGC0001961

Affected BGC
BGC0001961

Describe the error
The reference DOI is 10.1039/C8SC05670, but should be 10.1039/C8SC05670F.

Thanks to Dr Nicola Convine from the Systems Support Team of the Royal Society of Chemistry for the initial report.

Error with DOI links for multiple BGCs

Affected BGC
BGC0001878
BGC0001882
BGC0001891
BGC0001896
BGC0001909
BGC0001913
BGC0001915
BGC0001926
BGC0001930
BGC0001938
BGC0001942
BGC0001946
BGC0001956
BGC0001957
BGC0001961
BGC0001967
BGC0001969
BGC0001984
BGC0001985
BGC0001986
BGC0001995
BGC0001997
BGC0001999
BGC0002011
BGC0002012
BGC0002018
BGC0002034
BGC0002036

Describe the error
I initially reported this problem in issue #21, but it's more pervasive: for many [all?] of the recently added clusters, DOI linkouts to publications are missing the final character. These hyperlinks are, of course, generating errors from doi.org.

There is an additional problem with BGC0001878: the bioRxiv link, but not the DOI, contains the version suffix (10.1101/445270v1 => 10.1101/445270).

I've fixed the BGCs listed above (PR incoming), but this may not be comprehensive.
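The bioRxiv part of this cleanup can be sketched as below. This is a minimal illustration, not the script used for the PR: the function name and the regular expression are assumptions. A DOI that has merely lost its final character (such as 10.1039/C8SC05670) is still syntactically valid, so catching that case would require resolving each DOI against doi.org, which this offline sketch does not attempt.

```python
import re

# Assumed pattern: a bioRxiv DOI (prefix 10.1101) followed by a version
# suffix such as "v1". Digits and dots cover both older and newer DOI forms.
BIORXIV_VERSIONED = re.compile(r"^(10\.1101/[\d.]+)v\d+$")


def strip_biorxiv_version(doi: str) -> str:
    """Drop a trailing version suffix from a bioRxiv DOI; leave others as-is."""
    doi = doi.strip()
    match = BIORXIV_VERSIONED.match(doi)
    return match.group(1) if match else doi


print(strip_biorxiv_version("10.1101/445270v1"))
print(strip_biorxiv_version("10.1039/C8SC05670F"))
```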

Error with BGC0000341

BGC0000341

Hi there,

The structure for this cluster (encoding enduracidin production) is not correct. Everything else looks fine to me.

Best,
Logan

--

Logan W. MacIntyre
Postdoctoral Fellow
Laboratory of Genetically Encoded Small Molecules
The Rockefeller University

Error with BGC0001395

Affected BGC
BGC0001395

Describe the error
The structure in the entry (and in PubChem) is for teicoplanin, not teichomycin.

Error with fusarin BGCs

Affected BGC

Describe the error

Both BGCs are reported to produce a compound named fusarin with the SMILES C/C=C(/C)\C=C(/C)\C=C(/C)\C=C\C=C(/C)\C(=O)[C@@]12[C@@H](O1)[C@](NC2=O)(CCO)O. However, the references for both clusters indicate they actually produce fusarin C, whose SMILES is C/C=C(\C=C(/C)\C=C(/C)\C=C\C=C(/C)\C(=O)[C@@]12[C@@H](O1)[C@](NC2=O)(CCO)O)/C(=O)OC.

Error with BGC0001839

Affected BGC
BGC0001839

Describe the error
Cluster is missing many genes. See complete squalestatin cluster BGC0001339.
Publication is incorrect or incomplete; the only reference is related to BGC0001339 (which describes the cluster in Phoma sp and never mentions Aspergillus Z5)

BGC0001845 enhancement

MIBiG ID
BGC0001845

Short summary
Comment mentions a Nat. Prod. Commun. paper from 2013 that should be in the publications list

Literature references
PMID 24354182

Error with BGC0001125

Affected BGC
BGC0001125 OR
BGC0001951-BGC0001953

Describe the error
Puwainaphycin/Minutissamide clusters clearly have genes with KS domains. A 2014 PLOS ONE paper classifies BGC0001125 as NRP/PKS. However, clusters BGC0001951-BGC0001953 are listed in MIBiG as NRP only. A 2019 AEM paper discusses the PKS genes in the abstract but only uses the lipopeptide classifier. This is a judgement call, but the classification should be consistent.

Error with BGC0000932 and BGC0001362

Affected BGC
BGC0000932 and BGC0001362

Describe the error
Double entry of the tropodithietic acid biosynthetic gene cluster from Phaeobacter inhibens DSM 17395. Could the entries be merged, as they contain different information?
Thanks, Eva

Error with BGC0001932

Affected BGC
BGC0001932

Describe the error
Duplicate of BGC0001662, just with a typo in the compound name.

Error with BGC0001310

Affected BGC
BGC0001310

Describe the error
Listed as a terpene, but it is a type III PKS.

Error with BGC0001982

Affected BGC
BGC0001982

Describe the error
This is an exact duplicate of BGC0001881

Error with BGC0001773

Affected BGC
BGC0001773

Describe the error
Two genes are missing from the cluster: selE and selDIII. Both lie just beyond the current cluster boundary at selI. With these additions, the pathway is complete.

Error with BGC0001659

Affected BGC
BGC0001659

Describe the error
NPAtlas link does not match compound when searching by name

Curacomycin BGCs have an invalid paper reference

Error with BGC0000315

Affected BGC
BGC0000315

Describe the error
In the gene annotation tab, the evidence for function always contains "activity assay" twice when the other evidence is "knock-out"
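If the duplication comes from the underlying data rather than the page rendering, an order-preserving de-duplication would remove it. The sketch below assumes the evidence is stored as a plain list of strings; the function name and sample values are hypothetical and do not reflect the actual BGC0000315 record.

```python
def dedupe_evidence(evidence: list[str]) -> list[str]:
    """Remove repeated evidence labels while keeping first-seen order."""
    # dict keys are unique and preserve insertion order
    return list(dict.fromkeys(evidence))


# Hypothetical evidence list showing the reported pattern
evidence = ["Activity assay", "Knock-out", "Activity assay"]
print(dedupe_evidence(evidence))
```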

Error with BGC0000145

Affected BGC
BGC0000145

Describe the error
This probably should be reclassified as NRP/Polyketide hybrid, to be consistent with BGC0001041 (also a salinosporamide A producer)

BGC0000598 enhancement

MIBiG ID

BGC0000598

Short summary
The species for this BGC is currently listed as Streptomyces diastatochromogenes Tü6028. It should be Candidatus Entotheonella factor.

Literature references
PMID: 24476823 DOI: 10.1038/nature12959

Error with BGC0001072

Affected BGC
BGC0001072

Describe the error
PKS module descriptions are incorrect.

Error with BGC0001415

Affected BGC
BGC0001415

Describe the error
Probably should be reannotated as hybrid NRP/PKS. From the abstract of the paper:

[Bioinformatics] predicted a hybrid non-ribosomal peptide synthetase-polyketide synthase (NRPS-PKS) assembly line

Error with BGC0001044

Affected BGC
BGC0001044

Describe the error
Classified as a Polyketide + NRPS, but no PKS genes present and the first reference only mentions a single ketide starter unit.

BGC0000147 enhancement

MIBiG ID
BGC0000147

Short summary
Based on structure and release type, the PKS subclass should be "macrolactone"

Error with BGC0002057

Affected BGC
BGC0002055-BGC0002057

Describe the error
I think there are annotation errors in the json files added today by @satriaphd (commit 41401a0)

  1. In all three, the referenced doi:10.1073/pnas.1919245117 doesn't exist.
  2. In BGC0002057, the NCBI locus BK012115.1 does not exist, and it seems inconsistent with the loci of the other two records (JAAHTG010000029.1, JAAHTH010000242.1)
