Comments (10)
The main decision here is whether we make the OpenMM `model` field very rich or separate it out into many different programs such as `smirnoff-openmm` and `gaff-openmm`. I am leaning a bit towards a single `openmm` program that can dispatch to multiple typers and is largely transparent to the user. I would hazard that most quantum chemists don't really want to know the details.
Supporting full force fields should be enabled as well for power users. A URL might not be a good way to go as a worker node is not guaranteed access to the internet. 60kb per result is likely fine for many use cases.
from qcengine.
@jchodera no need to apologize! This is extremely valuable. It's much easier to build something of value when there is a tangible need and urgency. I will be using your comment as a reference point while implementing this to make sure we can cover all (or at least most) of what's present here rather quickly.
I'm in favor of keeping the program a single `openmm` harness, and putting the complexity internal to this. We can at least start down that road and see if our opinion changes as we go.
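To make the "single harness, complexity inside" idea concrete, here is a minimal sketch of how one `openmm` program could route a method string such as `gaff-2.0` or `smirnoff-1.1.0` to a typer. Everything in it is hypothetical: the `TYPERS` registry, `register_typer`, `dispatch`, and the placeholder typer functions are illustration only and are not QCEngine, Open Force Field Toolkit, or AmberTools API.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a force-field family name to a typer function.
TYPERS: Dict[str, Callable[[str, str], str]] = {}


def register_typer(family: str):
    """Decorator that records a typer under its force-field family name."""
    def decorator(func):
        TYPERS[family] = func
        return func
    return decorator


@register_typer("smirnoff")
def type_with_smirnoff(molecule: str, version: str) -> str:
    # Placeholder: a real typer would call the Open Force Field Toolkit here.
    return f"smirnoff {version} parameters for {molecule}"


@register_typer("gaff")
def type_with_gaff(molecule: str, version: str) -> str:
    # Placeholder: a real typer would call Antechamber from AmberTools here.
    return f"gaff {version} parameters for {molecule}"


def dispatch(method: str, molecule: str) -> str:
    """Split a method like 'gaff-2.0' into family and version, then call the typer."""
    family, _, version = method.partition("-")
    try:
        typer = TYPERS[family.lower()]
    except KeyError:
        raise ValueError(f"No typer registered for method {method!r}") from None
    return typer(molecule, version)
```

The user only ever names `openmm` as the program; adding another force-field family would mean registering one more typer rather than adding a new program.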
We should also rope in @jaimergp, who is working on an experimental OpenMM CPU-only build for `conda-forge` right now (`openmm-cpu`) that we could use for this.
We don't yet have AmberTools on `conda-forge` to bring in GAFF, but we can install that manually for now. We might want to rope in David Case and the AMBER folks to see if that might eventually be tractable; they've been working on a new "ambermini" that is a lightweight minimal Antechamber + LEaP + key toolchain set.
Currently working on this one.
@j-wags, is there a SMIRNOFF toolkit piece we could borrow here?
Pretty excited about this. Should have quite a few applications such as starting guesses and conformer search.
Apologies for hijacking this thread, but we're now in a position where we need to implement OpenMM-based energy and gradient evaluation for small molecule force fields for QCEngine for use in benchmark comparisons of SMIRNOFF and other small molecule force fields for the Open Force Field Initiative.
Our initial needs are to allow processing of small molecules through the same workflows we use for the Open Force Field Initiative, which are currently `TorsionDriveDataset` and `OptimizationDataset`.
Because we need a molecular topology to begin with, we would likely base this off of `qcengine/programs/rdkit.py`, which creates an RDKit `Chem.Mol` object that can then be fed to downstream parameterization engines.
We would like to support energy and gradient computation using OpenMM, using the following force fields:
- Open Force Field SMIRNOFF-format force fields, assigned via the Open Force Field Toolkit using the RDKit/AmberTools backend
- GAFF force fields (1.* and 2.*) assigned via Antechamber via AmberTools
Here's what we're thinking about how dispatch would work:
- OpenMM execution would be accessed with `program : openmm` (should this be something else more specific to reflect that small molecule parameterization will occur after RDKit processing?)
- The `model` dict would specify the force field as `method` (e.g. `gaff-1.8`, `gaff-2.0`, `smirnoff-1.1.0`) to be consistent with how the `rdkit` support specifies `method : uff`.
- `gaff-*` support would be provided via AmberTools and a distributed set of the appropriate GAFF versions excised from their corresponding AmberTools versions, but we would use the latest AmberTools. We can support a `chargemodel` keyword that specifies the model to pass to Antechamber, and possibly an optional `charges` array that overrides this.
- If `method : smirnoff` is specified, a `url` key could specify a URL to a (hopefully immutable) force field specification file. Alternatively, an `offxml` key can specify the contents of an OFFXML file, but this will be pretty big (~60K). Charging schemes are specified in the OFFXML.
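As a starting point for test inputs, the proposal above could be written down as plain dictionaries. This is only a sketch under assumptions: `make_model` and `make_task` are hypothetical helpers, and placing `url`/`offxml` inside `model` while `chargemodel` goes in `keywords` is one possible reading of the list above, not a settled schema. The value `"bcc"` is the Antechamber name for AM1-BCC.

```python
def make_model(method, url=None, offxml=None):
    """Build a hypothetical `model` dict for the proposed `openmm` program."""
    model = {"method": method}
    if method.startswith("smirnoff"):
        # A SMIRNOFF force field may come from a URL or from inline OFFXML text.
        if url is not None and offxml is not None:
            raise ValueError("Specify either `url` or `offxml`, not both")
        if url is not None:
            model["url"] = url
        if offxml is not None:
            model["offxml"] = offxml
    elif url is not None or offxml is not None:
        raise ValueError("`url` and `offxml` only apply to smirnoff methods")
    return model


def make_task(method, keywords=None, **ff_source):
    """Assemble a task description; the layout is an assumption, not QCSchema."""
    return {
        "program": "openmm",
        "driver": "gradient",
        "model": make_model(method, **ff_source),
        "keywords": dict(keywords or {}),
    }


# Example: GAFF 2.0 with AM1-BCC charges requested from Antechamber.
gaff_task = make_task("gaff-2.0", keywords={"chargemodel": "bcc"})
```

Rejecting `url`/`offxml` for non-SMIRNOFF methods up front is a design choice of this sketch; a real harness might instead ignore them.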
We could also potentially consider allowing the OpenMM ForceField processor to be used as well, but we would also need to specify an OpenMM ForceField XML file, which can get pretty big. Also, this is super fragile, and many molecules may fail to type and cause errors.
Since QCEngine computes just a single energy/gradient pair per call, we would want to cache charges in memory or on disk so that AM1-BCC charging does not consume ~15 seconds/call.
OpenMM would be forced to use the `Reference` or `CPU` platforms. If there is a way to ask QCEngine how many CPU threads should be used, this could be passed on to the `CPU` platform.
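One way the platform choice could look, as a sketch: `select_platform` is a hypothetical helper, `ncores` stands for whatever thread count QCEngine would hand to the harness, and the fallback policy (Reference when no thread count is given) is an assumption. To my knowledge the OpenMM CPU platform reads its thread count from a `Threads` property, but that should be checked against the OpenMM version in use.

```python
def select_platform(ncores=None):
    """Return an OpenMM platform name and properties without touching GPU platforms."""
    if ncores is None or ncores < 1:
        # No thread information available: fall back to the Reference platform.
        return "Reference", {}
    # OpenMM platform properties are passed as strings.
    return "CPU", {"Threads": str(ncores)}


name, properties = select_platform(4)

# With OpenMM installed, the result would be used roughly like this (not run here):
#   platform = openmm.Platform.getPlatformByName(name)
#   context = openmm.Context(system, integrator, platform, properties)
```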
@dotsdl If you're moving ahead on an implementation, please go ahead! You can model it off of `qcengine/programs/rdkit.py` to start with, and feel free to tap me to help. If you're not starting immediately, let me know.
What else needs to be specified? Should we create some test inputs as a way of aiding implementation and testing?
Work-in-progress PR posted at #151. Comments welcome; there is plenty to hammer on yet, but this is probably about half of what's needed for basic functionality to fulfill needs laid out by @jchodera.
This issue is now closed given merge of #151. I will create follow-on issues for each of the items raised that were declared out-of-scope for this PR.
Thanks! The extra issues will be very helpful to ascertain the status and health of the harness.