
Comments (5)

chrisiacovella commented on August 12, 2024

A proposed scheme:

<?xml version= "1.0"?>
<Foyer title="" website="" family="">
  <Creator last="" first=""/>
  <Source doi="" desc="" primary="" year="">
    <Author last="" first=""/>
    <Journal name="" volume="" number="" pages="" year="" title="" doi="" />
    <Note> </Note>
  </Source>
  <AdditionalNote>
  </AdditionalNote>
  <TestSuite>
    <Molecule name="" status=""/>
  </TestSuite>
</Foyer>

Tag by tag why I did what I did:

<Foyer title="" website="" family="">
  • title would be the title of the forcefield
  • website would be basically the link to the repository
  • family would be some way of trying to capture the type of forcefield, e.g., OPLS-AA, OPLS-UA, TraPPE, etc.
  <Creator last="" first=""/>
  • What person or people actually made the force field file and set up the SMARTS strings and stuff. We can obviously have multiple entries here.
  <Source doi="" desc="" primary="" year="">
    <Author last="" first=""/>
    <Journal name="" volume="" number="" pages="" year="" title="" doi="" />
    <Note> </Note>
  </Source>
  • This is where we define the source of the parameters in the file. A few things to note here...if we are making a forcefield from a specific paper, we can mark it as the primary="True" where it would presumably be the most recent and most specific. If we were combining several sources to make a more robust forcefield, we can have multiple be set to primary="True".
  • We have an Author tag here that corresponds to who actually made the forcefield (or, if we are referencing a paper, who the authors of the paper were). Since a forcefield might not necessarily have a journal article, it seemed best to keep this separate from the Journal tag...
  • If there is a journal article associated with the parameters, the basic info can be provided in the Journal tag. Note that the Journal tag has its own doi attribute as well, since the Source could be referencing a repository with a Zenodo DOI (not necessarily a journal article). I didn't put an author list in the Journal tag. We could easily include that using the BibTeX format scheme or something, but I feel like it would probably be the same as the authors we list using the Author tag.
  • The Note tag is to give any specific information that might be relevant, especially if we only use 1 or 2 parameters from a source, or if we are, e.g., reusing parameters from a different angle ("CT-CT-F angles are assumed to be the same as CT-CT-OH in this manuscript").
  • If parameters come from a personal communication, we can note that here, and define the year and authors above.
  <AdditionalNote>
  </AdditionalNote>
  • Anything else we want to say about the forcefield.
  <TestSuite>
    <Molecule name="" status=""/>
  </TestSuite>

Presumably this could be generated automatically, but it would at least list what molecules were defined as tests.
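
To make the layout above concrete, here is a minimal sketch of how a file in this scheme could be read with Python's standard `xml.etree.ElementTree`. The function name `load_metadata`, the `EXAMPLE` document, and the choice to treat any value other than "True" (case-insensitive) as non-primary are assumptions for illustration, not part of the proposal.

```python
import xml.etree.ElementTree as ET

# Hypothetical document that follows the proposed layout; all values are placeholders.
EXAMPLE = """<?xml version="1.0"?>
<Foyer title="Example forcefield" website="https://example.org/repo" family="OPLS-AA">
  <Creator last="Doe" first="J."/>
  <Source doi="10.0000/example" desc="Main parameters" primary="True" year="2001">
    <Author last="Doe" first="Jane"/>
    <Note>All parameters come from this source.</Note>
  </Source>
  <Source doi="10.0000/other" desc="Bond parameters" primary="" year="1992">
    <Author last="Roe" first="Richard"/>
  </Source>
  <TestSuite>
    <Molecule name="CF4.mol2" status="PASS"/>
  </TestSuite>
</Foyer>"""


def load_metadata(xml_text):
    """Return a plain dict summarising a metadata document in the proposed scheme."""
    root = ET.fromstring(xml_text)
    sources = []
    for src in root.findall("Source"):
        sources.append({
            "doi": src.get("doi", ""),
            "desc": src.get("desc", ""),
            "year": src.get("year", ""),
            # Assumption: only the string "True" (any case) marks a primary source.
            "primary": src.get("primary", "").strip().lower() == "true",
            "authors": [(a.get("last", ""), a.get("first", ""))
                        for a in src.findall("Author")],
            "notes": [(n.text or "").strip() for n in src.findall("Note")],
        })
    return {
        "title": root.get("title", ""),
        "website": root.get("website", ""),
        "family": root.get("family", ""),
        "creators": [(c.get("last", ""), c.get("first", ""))
                     for c in root.findall("Creator")],
        "sources": sources,
        "tests": {m.get("name"): m.get("status")
                  for m in root.findall("TestSuite/Molecule")},
    }
```

Calling `load_metadata(EXAMPLE)` gives a dictionary whose `sources` list keeps the order of the file, so a downstream script can decide for itself how to rank primary and secondary sources.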

I put together an example for the PFA forcefield. I think this will help us keep better track of not just source, but the relevance of those sources, and make it easier to parse this information. Again, we can use this xml file to generate the README.

<?xml version= "1.0"?>
<Foyer title="OPLS-AA parameters for perfluoroalkanes in Foyer format" website="https://github.com/chrisiacovella/oplsaa_perfluoroalkanes" family="OPLS-AA">
  <Creator last="Iacovella" first="C.R."/>
 
  <Source doi="10.1021/jp004071w" desc="All-atom OPLS parameters for perfluoroalkanes" primary="True" year="2001">
    <Author last="Watkins" first="Edward K"/>
    <Author last="Jorgensen" first="William L"/>
    <Journal name="The Journal of Physical Chemistry A" volume="105" number="16" pages="4118--4125" year="2001" title="Perfluoroalkanes: Conformational analysis and liquid-state properties from ab initio and Monte Carlo calculations" doi="10.1021/jp004071w" />
    <Note>The forcefield here describes the general parameters for perfluoroalkanes; specific dihedrals exist for 4 and 5-mers in the original manuscript </Note>
  </Source>
 
  <Source doi="10.1002/jcc.540130806" desc="CT-F Bond Source" primary="" year="1992">
    <Author last="Gough" first="Craig A"/>
    <Author last="Debolt" first="Stephen E"/>
    <Author last="Kollman" first="Peter A"/>
    <Journal name="Journal of Computational Chemistry" volume="13" number="8" pages="963--970" year="1992" title="Derivation of fluorine and hydrogen atom parameters using liquid simulations" doi="10.1002/jcc.540130806" />
    <Note> CT-F bonds are taken from parameters in this manuscript, as described in Watkins and Jorgensen. </Note>
  </Source>
 
  <Source doi="10.1021/ja00124a002" desc="F-CT-F and CT-CT-F angles" primary="" year="1995">
    <Author last="Cornell" first="Wendy D"/>
    <Author last="Cieplak" first="Piotr"/>
    <Author last="Bayly" first="Christopher I"/>
    <Author last="Gould" first="Ian R"/>
    <Author last="Merz" first="Kenneth M"/>
    <Author last="Ferguson" first="David M"/>
    <Author last="Spellmeyer" first="David C"/>
    <Author last="Fox" first="Thomas"/>
    <Author last="Caldwell" first="James W"/>
    <Author last="Kollman" first="Peter A"/>
    <Journal name="Journal of the American Chemical Society" volume="117" number="19" pages="5179--5197" year="1995" title="A second generation force field for the simulation of proteins, nucleic acids, and organic molecules" doi="10.1021/ja00124a002" />
    <Note> F-CT-F angles come from this manuscript, as described in Watkins and Jorgensen. </Note>
    <Note> CT-CT-F angles are the same as CT-CT-OH and CT-CT-OS listed in this manuscript, as described in Watkins and Jorgensen. </Note>
  </Source>
 
  <Source doi="10.1021/ja9621760" desc="All-atom OPLS parameters for alkanes" primary="" year="1996">
    <Author last="Jorgensen" first="William L"/>
    <Author last="Maxwell" first="David S"/>
    <Author last="Tirado-Rives" first="Julian"/>
    <Journal name="Journal of the American Chemical Society" volume="118" number="45" pages="11225--11236" year="1996" title="Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids" doi="10.1021/ja9621760" />
    <Note> Bonds and angles for the CT-CT and CT-CT-CT are taken from this manuscript for alkanes, as described in Watkins and Jorgensen. </Note>
  </Source>
 
  <AdditionalNote> The backbone dihedral specifically references opls_962 (i.e. C-CF2-C) rather than only using the "CT" class; if only the "CT" class were used, this would create a conflict with alkane systems if the parameters were merged. </AdditionalNote>
  <AdditionalNote> The original parameters are defined in kcal/mol; this file uses kJ/mol. A conversion factor of 4.184 was used, consistent with OpenMM. </AdditionalNote>
  <AdditionalNote> PI is defined as 3.141592653589 for conversion to radians, consistent with OpenMM.</AdditionalNote>
  <AdditionalNote> Atom type names, e.g., opls_961, correspond to those defined in the OPLS forcefield itp file distributed with GROMACS. </AdditionalNote>
  <AdditionalNote> Conversion from OPLS-style dihedrals to RB follows the formulas detailed in the GROMACS manual. </AdditionalNote>
  <TestSuite>
    <Molecule name="CF4.mol2" status="PASS"/>
    <Molecule name="perfluoro-2-methylbutane.mol2" status="PASS"/>
    <Molecule name="perfluorohexane.mol2" status="PASS"/>
  </TestSuite>
</Foyer>
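
As a rough idea of how the README could be generated from a file like the one above, here is a minimal sketch using only the standard library. The function name `render_readme`, the Markdown layout, and the decision to list primary sources first are all assumptions; the `SAMPLE` document is a trimmed stand-in for the full example.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the full example; only the fields used below are present.
SAMPLE = """<Foyer title="Demo" family="OPLS-AA">
  <Source doi="10.1002/jcc.540130806" primary="" year="1992">
    <Author last="Gough" first="Craig A"/>
    <Note> CT-F bonds are taken from this manuscript. </Note>
  </Source>
  <Source doi="10.1021/jp004071w" primary="True" year="2001">
    <Author last="Watkins" first="Edward K"/>
    <Author last="Jorgensen" first="William L"/>
  </Source>
</Foyer>"""


def is_primary(src):
    """Assumption: only primary="True" (any case) counts as primary."""
    return src.get("primary", "").strip().lower() == "true"


def render_readme(xml_text):
    """Build a small Markdown README from the metadata document."""
    root = ET.fromstring(xml_text)
    lines = [f"# {root.get('title', '')}", "",
             f"Family: {root.get('family', '')}", "",
             "## Sources"]
    # sorted() is stable, so sources keep file order within each group.
    for src in sorted(root.findall("Source"), key=lambda s: not is_primary(s)):
        authors = ", ".join(a.get("last", "") for a in src.findall("Author"))
        tag = " (primary)" if is_primary(src) else ""
        lines.append(f"- {authors} ({src.get('year', '')}), doi:{src.get('doi', '')}{tag}")
        for note in src.findall("Note"):
            text = (note.text or "").strip()
            if text:
                lines.append(f"  - {text}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_readme(SAMPLE))
```

Showing author last names and the year next to each DOI keeps the README readable at a glance, while the notes stay attached to the source they qualify.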

from forcefield_template.

chrisiacovella commented on August 12, 2024

It might be good to have the section on the test suite automatically generated when running the atomtyping.py test (the script to create the readme could run the test suite too). E.g., it would update which molecules are in the tests and whether they were atom-typed correctly.
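
A minimal sketch of that update step, assuming the test run hands back a mapping from molecule file name to a pass/fail boolean. The function name `update_test_suite`, the shape of `results`, and the "FAIL" status string are assumptions; only "PASS" appears in the example file.

```python
import xml.etree.ElementTree as ET


def update_test_suite(root, results):
    """Rewrite the <TestSuite> block so it lists exactly the molecules in `results`.

    `results` maps a molecule file name to True (typed correctly) or False.
    """
    suite = root.find("TestSuite")
    if suite is None:
        suite = ET.SubElement(root, "TestSuite")
    # Drop stale entries so removed tests do not linger in the file.
    for child in list(suite):
        suite.remove(child)
    for name in sorted(results):
        # Assumption: "FAIL" is the counterpart of the "PASS" status.
        status = "PASS" if results[name] else "FAIL"
        ET.SubElement(suite, "Molecule", {"name": name, "status": status})
    return root
```

The returned element can then be written back out with `ET.ElementTree(root).write(path)` before the README is regenerated.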


chrisiacovella commented on August 12, 2024

Following offline discussions, I'm going to try creating a minimal documentation file (e.g., one that only requires DOIs and notes, rather than full references). The parsing code will automatically gather the remaining info and write out both the Readme.md file and a "full" xml file.

We should also write out a bibtex file with the references included in the xml file.
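
One possible shape for the BibTeX export, sketched with the standard library only. The function name `source_to_bibtex`, the citation key format (first author's last name plus year), and the fallback to `@misc` when no Journal tag is present are assumptions, not settled choices.

```python
import xml.etree.ElementTree as ET


def source_to_bibtex(src):
    """Turn one <Source> element from the "full" xml file into a BibTeX entry."""
    journal = src.find("Journal")
    authors = " and ".join(
        f"{a.get('last', '')}, {a.get('first', '')}" for a in src.findall("Author")
    )
    first = src.find("Author")
    last = first.get("last", "") if first is not None else ""
    # Hypothetical key format: first author's last name followed by the year.
    key = (last or "unknown") + src.get("year", "")
    fields = [("author", authors),
              ("year", src.get("year", "")),
              ("doi", src.get("doi", ""))]
    if journal is not None:
        fields += [("title", journal.get("title", "")),
                   ("journal", journal.get("name", "")),
                   ("volume", journal.get("volume", "")),
                   ("number", journal.get("number", "")),
                   ("pages", journal.get("pages", ""))]
    # Skip empty fields so a minimal Source still yields a valid entry.
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields if v)
    kind = "article" if journal is not None else "misc"
    return f"@{kind}{{{key},\n{body}\n}}"
```

Joining the result of `source_to_bibtex` for every Source with blank lines in between would give the contents of the .bib file.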


ctk3b commented on August 12, 2024

Yeah, I think DOIs are totally sufficient here. Also worth checking out what the OpenForceField group is doing here: https://github.com/open-forcefield-group/openforcefield/blob/master/The-SMIRNOFF-force-field-format.md

They have some of the above features and I think it's worth trying to diverge as little as possible while we're still making design decisions.


chrisiacovella commented on August 12, 2024

Well, I guess I don't want the final readme or xml file to only have the DOIs; I can quickly see an author and a year and know what paper it is, but I'd have to do more work to actually look up the DOI. Glancing at the specs in the Readme for that other file, we might have some additional stuff automatically written to a readme:

  • Some boilerplate about foyer and file format
  • scan the actual forcefield file and list the functional forms used for bonded/nonbonded parameters
    • could have some additional boilerplate information that is grabbed and defines how the format should actually be, like the order of atom_ids in an angle.
  • Count the number of atom types/bonds/angles/dihedrals in the file (it seems like it would be useful to understand how expansive the force field file is)
    • we could probably have a separate page generated (linked from the readme) that just lists the atom types, their description and their SMARTS string to make it a little more user friendly to see what is in the document.
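
The counting and atom type listing in the last two bullets could be sketched roughly as follows. This assumes the forcefield file uses the OpenMM-style element names (`Type`, `Bond`, `Angle`, `Proper`, `Improper`) with `def` and `desc` attributes on each `Type`; the function names and the sample content, including the SMARTS strings and numeric values, are placeholders rather than real parameters.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Placeholder forcefield fragment; the SMARTS strings and numbers are illustrative only.
SAMPLE_FF = """<ForceField>
  <AtomTypes>
    <Type name="opls_961" class="CT" element="C" mass="12.011" def="[C;X4](F)(F)(F)C" desc="CF3 carbon"/>
    <Type name="opls_965" class="F" element="F" mass="18.9984" def="F" desc="fluorine"/>
  </AtomTypes>
  <HarmonicBondForce>
    <Bond class1="CT" class2="F" length="0.1" k="1.0"/>
  </HarmonicBondForce>
</ForceField>"""

# Assumed element names for the entries worth counting.
COUNTED_TAGS = ("Type", "Bond", "Angle", "Proper", "Improper")


def summarise_forcefield(xml_text):
    """Return (counts per tag, list of (name, description, SMARTS) per atom type)."""
    root = ET.fromstring(xml_text)
    counts = Counter(el.tag for el in root.iter() if el.tag in COUNTED_TAGS)
    types = [(t.get("name", ""), t.get("desc", ""), t.get("def", ""))
             for t in root.iter("Type")]
    return counts, types


def atom_type_table(types):
    """Render the atom types as a Markdown table for a separate page."""
    rows = ["| Name | Description | SMARTS |", "| --- | --- | --- |"]
    rows += [f"| {name} | {desc} | `{smarts}` |" for name, desc, smarts in types]
    return "\n".join(rows)
```

Because `Counter` returns 0 for missing keys, the README can report "0 angles" for a file that defines none without any special casing.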

While we could certainly get DOIs directly from the forcefield xml file, I think a separate xml document would be good since I think it is essential we add some notes associated with each paper, considering most forcefield parameter sets have been derived/aggregated in not so standard ways. Also it allows a clear explanation as to which parameters were chosen when there are duplicates (e.g., when merging two force field files).

In any case, I'm working on updating the parsing code to automatically grab info from a DOI and then populate the relevant fields.

