leeping / forcebalance
Systematic force field optimization.
License: Other
#==================#
#|  ForceBalance  |#
#==================#

If you're reading this, you are probably looking for help. :D

The ForceBalance project website is located at https://simtk.org/home/forcebalance

There is documentation available online in HTML at:
http://leeping.github.io/forcebalance/doc/html/index.html

You can also download the documentation in PDF format here:
http://leeping.github.io/forcebalance/doc/ForceBalance-Manual.pdf

#================================#
#|   Conda / pip installation   |#
#================================#

As of version 1.7.4, ForceBalance is available as a package on the conda-forge channel. To install the package, make sure you are using an Anaconda/Miniconda Python distribution for Python version 2.7, 3.5, 3.6, or 3.7, then run:

`conda install --strict-channel-priority -c conda-forge forcebalance`

This will install ForceBalance and all of the required dependencies. It will not install optional dependencies such as OpenMM, Gromacs, AMBER, Tinker, CCTools/Work Queue, or the Open Force Field toolkit. (Note: If you were previously installing ForceBalance from the omnia repository, you may need to clear your index cache using `conda clean -i`.)

Similarly, to install from PyPI (the Python Package Index), run the command:

`pip install forcebalance`

#=========================================#
#|   Building / installing from source   |#
#=========================================#

To build the package:

python setup.py build

To install the package:

python setup.py install

To install the package locally (i.e. if you don't have root permissions), try one of the following:

python setup.py install --user
python setup.py install --prefix=$HOME/.local
python setup.py install --prefix=/your/favorite/directory
export PYTHONPATH=/your/favorite/directory

Additionally, the Anaconda/Miniconda distributions are very helpful for setting up a local Python environment.
To turn the package into a distributable .tar.bz2 file:

python setup.py sdist

#========================#
#| Published Parameters |#
#========================#

If you are here to obtain data for a published force field (e.g. AMBER-FB15), you can find it in the Products folder. Please email me at leeping (at) ucdavis (dot) edu if you notice anything that should be added.

#=======================#
#| Citing ForceBalance |#
#=======================#

Published work that utilizes ForceBalance should include the following references:

Wang L-P, Chen J and Van Voorhis T. "Systematic Parametrization of Polarizable Force Fields from Quantum Chemistry Data." J. Chem. Theory Comput. 2013, 9, 452-460. DOI: 10.1021/ct300826t

Wang L-P, Martinez TJ and Pande VS. "Building Force Fields: An Automatic, Systematic, and Reproducible Approach." J. Phys. Chem. Lett. 2014, 5, 1885-1891. DOI: 10.1021/jz500737m

#===========================#
#| Funding Acknowledgement |#
#===========================#

The development of this code has been supported in part by the following grants and awards:

NIH Grant R01 AI130684-02
ACS-PRF 58158-DNI6
NIH Grant U54 GM072970
Open Force Field Consortium
Hello,
I want to use particular targets in an optimization. To check that liquid.pdb, gas.pdb, and file.in are correct, I tried to use them to replicate the results this paper obtained for TIP4P-FB: http://dx.doi.org/10.1021/jz500737m. However, while the density and heat of vaporisation come out close, the static dielectric constant is around 5 instead of 77. Is there something else I should add to my targets to replicate this result?
This is my liquid.pdb:
CRYST1 20.000 20.000 20.000 90.00 90.00 90.00 P 1 1
ATOM 1 O HOH 1 9.457 18.861 18.736 1.00 0.00 QQQ
ATOM 2 H1 HOH 1 8.845 19.288 19.335 1.00 0.00 QQQ
ATOM 3 H2 HOH 1 10.058 19.557 18.471 1.00 0.00 QQQ
ATOM 4 M HOH 1 9.456 19.004 18.779 1.00 0.00 QQQ
.........
(The file continues for 233 molecules in total; I could email the rest.)
This is the gas.pdb:
CRYST1 20.000 20.000 20.000 90.00 90.00 90.00 P 1 1
ATOM 1 O HOH 1 1.457 0.861 0.736 1.00 0.00 QQQ
ATOM 2 H1 HOH 1 0.845 1.288 1.335 1.00 0.00 QQQ
ATOM 3 H2 HOH 1 2.058 1.557 0.471 1.00 0.00 QQQ
ATOM 4 M HOH 1 1.456 1.004 0.779 1.00 0.00 QQQ
the data.csv:
T,P,MBAR,Rho,Rho_wt,Hvap,Hvap_wt,Eps0,Eps0_wt
298.15,1.0 atm,TRUE,997.045,1.0,43.989,1.0,78.409,1.0
the file.in:
$options
forcefield tip4pfb.xml
jobtype newton
$end
$target
type Liquid_OpenMM
name WATER
# Parameters for self-polarization correction of nonpolarizable water
self_pol_mu0 1.855
self_pol_alpha 1.470
$end
where tip4pfb.xml is the force field file provided in simtk/openmm/app/data in OpenMM. Thank you in advance.
Alex
This is a way for me to list the features that I've added since the last ForceBalance release. It's really time to release version 1.3 by now, and after I implement Thermo I will release version 2.0.
Running ForceBalance with the `--continue` option allows the optimization to pick up where it left off in the middle of an iteration, so that targets that have already been evaluated have their data read back in. This is done using the `read()` function, which reads the objective function information as well as the indicator lines from `indicate.log`. It then proceeds to evaluate the rest of the targets normally, and this includes calling `get()` on condensed phase property targets that have more than 75% of their jobs completed.

Normally, if we attempt to write `indicate.log` in a folder where we're calling `get()`, ForceBalance assumes it's an error and throws an exception, because `get()` should not be called if we have already evaluated the objective function (rather, `read()` should have been called). However, the mostly-completed condensed phase property targets are an exception to this rule, so the exception should not be thrown for them.
For some reason, the tests on this pull request
http://not0rious.stanford.edu:8080/job/forcebalance-pull-request/66/
have been running for more than 1 day; I'm not sure if they are still running properly.
#68 is an update on the Energy Estimator (Centroid estimator instead of Primitive estimator, and )
Currently ForceBalance has a method called `_exec()` that it uses for all system calls. It is a wrapper around functions in `subprocess` that actually carry out the system call, but it uses `select` to poll whether the stdout and stderr file descriptors have data to read. This way, we are able to read from both streams while they are being generated, without them being garbled.

I have noticed that when using this method in other contexts, I sometimes get the error "file descriptor out of range" or something similar. This is due to a limitation in `select`, which supports only a maximum number of file descriptors. One possibility is that I simply need to close the file descriptors after I'm done reading from them, but I haven't tested this (the error doesn't show up often since it needs to hit the limit). That would be a simple fix.

Another option would be to bundle Twisted, an event-driven programming library (http://twistedmatrix.com), with ForceBalance. It abstracts away low-level mechanisms like `select` and should provide the functionality we need, and I think it properly accounts for the file descriptor issue. It is licensed under the MIT license (highly permissive) and its only dependency is `zope.interface`. Having better asynchronous abilities in ForceBalance might enable other things as well, including perhaps a way to fork local targets so they can be evaluated in parallel. However, introducing new dependencies should be thought out carefully.
Let me know what you think. This is medium to low priority.
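For illustration, here is a rough sketch of the same streaming behavior using the stdlib `selectors` module, which is backed by epoll/kqueue where available and so sidesteps the `select()` FD_SETSIZE ceiling. The function name and buffer size are arbitrary, and this is a Unix-oriented sketch, not the actual `_exec()` implementation:

```python
import selectors
import subprocess

def exec_stream(command):
    """Run `command`, following stdout and stderr without garbling them."""
    proc = subprocess.Popen(command, shell=True,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    sel = selectors.DefaultSelector()
    sel.register(proc.stdout, selectors.EVENT_READ, "stdout")
    sel.register(proc.stderr, selectors.EVENT_READ, "stderr")
    out, err = [], []
    open_streams = 2
    while open_streams:
        for key, _ in sel.select():
            chunk = key.fileobj.read1(4096)   # read only what is available
            if not chunk:
                # EOF: unregister and close promptly so fds are reclaimed
                sel.unregister(key.fileobj)
                key.fileobj.close()
                open_streams -= 1
            elif key.data == "stdout":
                out.append(chunk)
            else:
                err.append(chunk)
    sel.close()
    proc.wait()
    return b"".join(out), b"".join(err)
```

Closing each descriptor as soon as it hits EOF also addresses the "close the file descriptor after I'm done reading" fix mentioned above.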
Per discussion at https://openforcefieldgroup.slack.com/archives/C8P8JEE5T/p1545071217055600
We want to implement tests of ForceBalance's optimization process using liquid parameters. However, liquid simulations are not deterministic (even given the same starting parameters, hardware differences may cause different results), and they are prohibitively time-consuming.
When the property calculator API is finalized, we should store a small set of test trajectories, mirroring the simulations that would be run by ForceBalance in a specific optimization run. These simulation trajectories would be returned by a mock property-calculator interface. This setup would allow robust integration testing of ForceBalance's optimization using trajectory data.
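A rough sketch of the mock interface idea follows. All names are hypothetical, since the property calculator API is not yet finalized; the stored results stand in for the recorded test trajectories:

```python
class MockPropertyCalculator:
    """Replays stored results instead of launching MD simulations.

    Hypothetical sketch: stored_results maps (property, temperature) to a
    precomputed value, standing in for trajectories recorded from a
    reference optimization run.
    """
    def __init__(self, stored_results):
        self.stored_results = stored_results
        self.calls = []   # record calls so tests can verify the optimizer

    def compute(self, prop, temperature):
        self.calls.append((prop, temperature))
        return self.stored_results[(prop, temperature)]
```

An optimizer wired to this object receives identical data on every run, which makes the integration test both reproducible and fast.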
ForceBalance carries out many calculations by making system calls to MD software such as Gromacs and TINKER. It searches for the executables in the directory specified by the `tinkerpath` or `gmxpath` option in the input file - but if this option is not present, it uses the `which` command to figure out the location from the `$PATH` environment variable.
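The lookup described above amounts to something like the following sketch. The function name is hypothetical; `shutil.which` is the stdlib equivalent of the `which` command:

```python
import os
import shutil

def resolve_md_executable(name, explicit_path=None):
    """Find an MD executable, mirroring the lookup described above.

    Prefer the directory from the input file (tinkerpath / gmxpath);
    otherwise fall back to a which-style search of $PATH.
    """
    if explicit_path:
        return os.path.join(explicit_path, name)
    found = shutil.which(name)
    if found is None:
        raise RuntimeError("%s not found in $PATH" % name)
    return found
```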
The problem is that a single calculation may be carried out on many different machines. In the case of thermodynamic property fitting and "remote" targets, ForceBalance distributes the calculation using Work Queue, and the MD software might not be installed in the same place on the remote computer. The possible solutions are:

1. Pass the `tinkerpath`, `gmxpath`, etc. options to all remote calculations and expect that all machines have the software installed in the same place.
2. Use the `tinkerpath`, `gmxpath`, etc. options only for local calculations and rely on `$PATH` for remote calculations.
3. Eliminate the `tinkerpath` and `gmxpath` options entirely and expect the user to always have these programs in `$PATH`.
4. Some other clever solution that I haven't thought of.
I like option 3 because it reduces confusion for the user, but it only works if ForceBalance is compatible with default (unhacked) versions of the MD software. This is currently the case with GROMACS, OpenMM and TINKER, though I believe a small hack is still required for AMBER. (On the other hand, I don't know anyone who currently uses the AMBER interface.)
I am bringing this up now because until recently, ForceBalance required a modified version of TINKER, but now it works with the official TINKER. We should be moving toward full compatibility with standard versions of MD software.
Let me know what you think.
Currently, "remote" targets and "liquid" targets have the ability to read data from the disk. This was relatively easy to do because Work Queue required packaging up all of the data as files anyway. However, all targets should have this ability, and it's not hard (in fact, remote is a wrapper around any target). I'll make sure to implement this at some point.
It's not a really high priority, but I'd like to merge all (or most) of the functionality from Molecule.py into mdtraj. Do you know, off the top of your head, what features are missing? One that comes to my mind is support for the xyz file format, but I'm sure there are others.
This is a summary of my discussions with Erik (@ebran) today. Hopefully I'll get to this implementation within the next week or two. The overall goal is to merge the existing functionality of the `Liquid` and `Lipid` targets into the more general `Thermo` target, which will greatly facilitate optimizations that use general combinations (or new types) of experimental thermodynamic data.

After this is done, the `Liquid` and `Lipid` targets will remain as long as there are users who currently use them / want to use them, though it's my hope that I can someday remove them once all of their functionality is in `Thermo`.
The target is specified in the input file as follows:
$target
name MyThermoProperties
type Thermo_GMX
properties H_ads Rho
simulations
global
steps 20000
--
name Liquid
steps 10000
--
name Adsorbed
gmx_gro conf.gro
gmx_top topol.top
gmx_mdp grompp.mdp
gmx_ndx index.ndx
/simulations
$end
Note that the `properties` field is a list of property names, which correspond to columns in `expset.txt` (below) and to implemented methods for calculating each property and its derivatives from the MD simulation. In general, properties are calculated using time series of instantaneous observables that come out of simulations. In other words, specifying a given property will cause particular simulations to run, and particular time series to be extracted from those simulations. We won't extract all possible time series because that is time-consuming and space-intensive, instead opting to extract only those needed. Multiple properties can make use of the same time series in the same simulation.
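As an illustration of this bookkeeping, the mapping from properties to required simulations and observables could look like the sketch below. The property and observable names here are made up for the example; only the union of requested time series would actually be extracted:

```python
# Hypothetical map: each property declares which simulations must run and
# which instantaneous observables must be extracted from each.
PROPERTY_REQUIREMENTS = {
    "Rho":   {"Liquid": ["volume"]},
    "H_ads": {"Liquid": ["potential"], "Adsorbed": ["potential"]},
}

def required_timeseries(properties):
    """Union of time series needed for the requested properties."""
    needed = {}
    for prop in properties:
        for sim, observables in PROPERTY_REQUIREMENTS[prop].items():
            needed.setdefault(sim, set()).update(observables)
    return needed
```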
The simulation settings are specified within the `simulations` block, which is a "section input type" - i.e. in contrast to most options, there will be a custom parser for the text contained between `simulations` and `/simulations`. Each simulation has a name which corresponds to the required simulations for the calculated properties - that is to say, if a given property requires a time series from the `Interface` simulation, then the target must contain simulation settings for `Interface`. Simple simulation settings - such as the time step - can be entered here, and depending on which MD software is used, the user may provide files like `grompp.mdp` and `tinker.key`. The `global` block contains default settings that are copied over to the individual simulations.
In the target folder, there will be a plain text file `expset.txt` corresponding to the experimental data set. A separate chain of simulations will be run for each row of data in this file. The file will contain columns corresponding to the experimental data and the relevant variables for the simulation (e.g. temperature and pressure). I might want to change the default name to `data.txt` because my other collaborators are used to `data.csv`.
# Experimental data for liquid bromine.
Label Temp_K Density_kg/m^3 w Enthalpy_kJ/mol w Temperature_K Pressure_bar
298.15K 298.15 3102.8 1.0 29.96 1.0 298.15 1.01325
# Variables: Denominators and weights for quantities
Denoms = 30 0.3
Weights = 1.0 1.0
Here's how the calculation will work: The optimizer first provides the force field parameters, and the target's `stage()` function executes the simulations of thermodynamic properties, with one or more chains of simulations per row of experimental data. These can run either locally or through the Work Queue. The executed script, `md-chain.py`, takes as input the simulation settings (including initial conditions) that are required for calculating the properties. The simulations are executed sequentially, and the time series of instantaneous observables (e.g. potentials, volumes) are written to disk. Each time series is keyed by the simulation name and observable name. As a sanity check / indicator, `md-chain.py` will perform a standalone calculation of the properties corresponding to the available time series.

After `md-chain.py` returns the time series, they will be used to calculate the thermodynamic properties themselves in the target's `get()` method. The current thought is to create an object for each row of experimental data, which is initialized using the time series and provides methods to calculate each property and its gradients. The main advantage of Thermo is that it allows any thermodynamic property to be implemented as a method that operates on time series of observables from simulation chains.

Here the time series of observables may be preprocessed using MBAR, which increases the amount of data available for each row of experimental data, as long as the only variables are temperature and pressure. This would also be the entry point for evaluating the standard error of the objective function via bootstrapping.

After the property values and their gradients are calculated, the objective function, gradient, and Hessian contributions are easily computed and returned to `Objective`.
Parser for input options. This should be easy.

Need to create a class corresponding to a "row of experimental data", which contains the time series of observables / experimental data and implements the methods for calculating properties and their derivatives. This code should be straightforward. I don't really know what to name this class: `DataSet` sounds more like the entire experimental data set, and `DataPoint` makes it sound like just one property. How about `ThermoProperties`?
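Whatever the name, a minimal sketch of such a class might look like the following. The method name, units, and keying convention are illustrative only, not a proposed API:

```python
import numpy as np

class ThermoProperties:
    """One row of experimental data plus the time series it needs.

    Hypothetical sketch: `timeseries` maps (simulation, observable) to a
    1-D array of instantaneous snapshot values.
    """
    def __init__(self, label, timeseries):
        self.label = label
        self.ts = {k: np.asarray(v, dtype=float)
                   for k, v in timeseries.items()}

    def density(self, total_mass_kg):
        # rho = M / <V>, from the Liquid simulation's volume time series
        vols_m3 = self.ts[("Liquid", "volume")]
        return total_mass_kg / vols_m3.mean()
```

Property gradients would be implemented as additional methods on the same object, operating on the same stored time series.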
Need to modify `md-chain.py` so that it automatically knows how to set up and execute the simulation chain and extract the relevant time series. This should not be too hard, except for these potential challenges:

- Run parameter files (`.mdp`, `.top`, `.key`, etc.) are placed inside the target folder, under a subfolder corresponding to the row label. Settings in the input file (e.g. number of MD steps) and variables in `expset.txt` will override the options in the run parameter files. If the run parameter files do not contain required options, ForceBalance will provide some defaults (e.g. reasonable Ewald summation settings if none are provided). This is how I'm currently doing it, but I imagine `md-chain.py` needs to be especially careful when handling multiple layers of possible input. (Extra note: The input can be simplified greatly if the GROMACS, TINKER etc. input files are required to have the same base name as the simulation name.)
- `md-chain.py` can calculate some properties on its own to provide a sanity check for the user (I've found this to be helpful in `npt.py`). Since `md-chain.py` already knows which simulations to run and which time series to extract, it should be able to look up which properties it can calculate on its own.

Need to implement Work Queue functionality, MBAR, and error estimation via bootstrap in `thermo.py`. It also needs to make extensive use of the `ThermoProperties` objects. We should incorporate the backtracking feature in `Liquid` that reuses experimental data when an optimization step is rejected. We should also have the ability to execute multiple copies of a simulation chain for each `ThermoProperties` object in order to take advantage of parallelism. I think these are in order of increasing difficulty, and since this ties everything together we should implement it last.

`thermo.py` needs to parse `expset.txt` (or `data.txt`, depending on what we name it) intelligently. In the beginning this should be easy, because many properties are simply floating point numbers. However, in the case of deuterium order parameters, the property is a medium-sized array and we need an intuitive syntax for parsing array input. I think Keri (@kmckiern) already has code for this.

Engine objects need a more general way to run molecular dynamics, and should implement methods for extracting time series. This should be easy, though I would like to rewrite the `molecular_dynamics()` method while preserving compatibility with the `Liquid` and `Lipid` targets.
Thinking out loud here...
Currently, a ForceBalance calculation reads from the input file a set of general options for the calculation and one set of target options for each target. Options that are not set in the input file are set to their default values as declared in the parser. This is done by first copying the default general options / target options and then inserting the user-provided values. Simple, right?
Wrong; it turns out we often want to define default values at the target level. That is to say, different targets might want different default values for a given option. The default value for the XYZ coordinates is an example.
I have tried (unsuccessfully) to implement this. The `set_option` method defined for all ForceBalance classes allows you to enter a default value. This value is used if the options dictionaries (passed into the target initialization) do not contain the key in question. However, since the options dictionaries always contain the default values from parser.py, the target-level default values are never used, so we don't have the fine-grained control over target options that we desire.

I think there ought to be a hierarchy of ways to set target options, where the global defaults declared in `parser.py` are lowest priority.

One way to implement this would be to simply not copy the global defaults when initially creating the global option / target option dictionaries, so they are empty except for the user-provided values when passed into the class initialization. This could break a few existing things, but thank goodness for unit tests.
Also: Certain True/False options have previously defaulted to False, but under the proposed change the option might not exist in the dictionary at all. This means that in several places in the code I have to do this clumsiness:
- if not options['continue']:
+ if ('continue' not in options) or (not options['continue']):
It's not too clumsy but somehow it seems wrong.
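One stdlib option that captures the proposed hierarchy without the existence checks is `collections.ChainMap`, which looks keys up in several dicts in priority order. The option names and values below are illustrative only:

```python
from collections import ChainMap

# Priority order: user-provided values shadow target-level defaults,
# which shadow the global defaults from parser.py.
global_defaults = {"continue": False, "steps": 1000}
target_defaults = {"steps": 20000}      # a target-level override
user_options    = {"continue": True}    # from the input file

options = ChainMap(user_options, target_defaults, global_defaults)
```

With this layering, `options['continue']` always exists, so the `('continue' not in options) or ...` pattern is no longer needed; alternatively, plain `options.get('continue', False)` avoids the clumsiness with ordinary dicts.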
Let me know what you think.
The current ForceBalance code on master seems to require `lxml`, `networkx`, `pymbar`, `six`, and `future` (for Python 2 users), which could be added to `setup.py`. Is there a reason they are not specified in `setup.py`?
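For reference, a minimal sketch of how these could be declared so that pip resolves them automatically. `install_requires` is the standard setuptools mechanism; the environment marker restricting `future` to Python 2 follows PEP 508, and the absence of version pins here is deliberate, not a recommendation:

```python
# Candidate dependency list for setup.py; pass it as
#   setup(..., install_requires=INSTALL_REQUIRES)
INSTALL_REQUIRES = [
    "lxml",
    "networkx",
    "pymbar",
    "six",
    'future; python_version < "3"',   # environment marker: Python 2 only
]
```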
Hi @leeping, I hope all is well! I want to try out the BasinHopping optimization routine in FB with the lammpsio.py interface I have added. I receive the following error when I try to run it:
"""
indicate.log should not exist yet in this directory: /home/jnapoli/test_reax_fb/regularization_tests/L2_1.0_anneal/opt_reax.tmp/cluster-01/iter_0000
"""
This is triggered in target.py (https://github.com/leeping/forcebalance/blob/master/src/target.py#L508). I think this means that the program should not be in the iter_0000 directory, but I am not very sure. I have also tried BasinHopping with an AbInitio_OPENMM target and receive the same error.
-Joe
I'm wondering about the main difference (extra calculations, work, etc.) between running a single-point calculation in ForceBalance and running an OpenMM simulation directly.
I've tried to run a 10000-step (5 ps) single-point calculation in ForceBalance using the q-TIP4P/F water model with RPMD (2 beads), and the job takes more than half an hour to finish (the simulation itself should take 20-30 s).
I'd like to bring up an idea that's been rolling around in my head, which if successfully implemented should greatly improve the reusability of the code. However, I'm still puzzling over some design choices, so this is more of a future improvement.
Currently, "Target" classes are implemented in the ForceBalance code using the following inheritance structure:
Target (base class)
    Liquid
        Liquid_OpenMM
        Liquid_GMX
    BindingEnergy
        BindingEnergy_OpenMM
        BindingEnergy_TINKER
    AbInitio
        AbInitio_OpenMM
        AbInitio_GMX
        AbInitio_TINKER
The derived classes `Liquid` and `BindingEnergy` correspond to the physical quantities that we would like to fit using our force field. They contain code for reading the reference data files and calculating the objective function from the difference between the reference quantity and the simulated quantity. Since they don't evaluate the simulated quantities, these classes are independent of any MD software that is installed.

The next layer of derived classes, such as `Liquid_GMX`, is the implementation layer. These classes contain code to execute the external MD code (or make API calls in the case of OpenMM). For instance, `BindingEnergy_TINKER` contains calls to the TINKER programs `optimize` and `analyze`, in order to optimize the geometries of complexes and calculate their energies with respect to the isolated monomers.
The problem that I want to address is the duplication of code in the implementation layer. For instance, both `Liquid_OpenMM` and `AbInitio_OpenMM` contain calls to OpenMM to evaluate the potential energy on a set of structures. It might be possible to restructure the code such that performing a given operation (e.g. evaluating the potential energy on a set of structures) with a given MD code (e.g. OpenMM) only needs to be implemented once.

I have heard bad things about multiple inheritance, so I think a "composition" framework might be a good design. This would introduce a new class that represents the MD code, which I will call the `Engine`.
The Engine object should contain methods that carry out useful operations in the MD code, which need to be callable without knowledge of which code we're actually using. An early example is already implemented in https://github.com/leeping/forcebalance/blob/master/src/data/npt.py lines 181 through 803.
Once we have these Engine classes, then I imagine that we will no longer need two layers of derived classes for Target classes. Instead, the Target objects will contain Engine objects and call them as needed.
Two outstanding questions / related issues are whether the Engine class should contain any data, and how to handle file dependencies. For instance, in order to minimize the energy of a structure in any MD code (e.g. this would look like `TinkerEngine.minimizeEnergy()` in the proposed code), one needs to provide the structure. We can either store the structure as data in the Engine object (e.g. as a `Molecule` object), or the method needs to take the structure as an input argument.

The benefit of storing the structures is that we can count on the Molecule class to write the correct file format for any MD code to use. But where does it make sense to store the structures - in the Target object or the Engine object? There is also a cost associated with storing the structures, because they need to stay in memory over the course of the calculation, whereas many MD codes simply iterate over the files. For certain targets, it might not be possible to store the structures in memory (e.g. if they correspond to long MD runs). In these cases, does it make more sense to remember just the file names?
Let me know what you think. I think that once the data storage question is cleared up, I can begin implementing this.
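A minimal sketch of the proposed composition design follows. All class and method names are hypothetical (the real Engine would wrap actual OpenMM/GROMACS/TINKER calls rather than returning stub values):

```python
class Engine:
    """Abstract interface for an MD code (the proposed Engine class)."""
    def evaluate_energies(self, structures):
        raise NotImplementedError

class OpenMMEngine(Engine):
    def evaluate_energies(self, structures):
        # Real code would make OpenMM API calls; stubbed for illustration.
        return [0.0 for _ in structures]

class AbInitioTarget:
    """Target holds an Engine (composition), replacing the second layer
    of derived classes like AbInitio_OpenMM."""
    def __init__(self, engine):
        self.engine = engine

    def objective(self, structures, reference_energies):
        calc = self.engine.evaluate_energies(structures)
        # Sum-of-squares difference between calculated and reference values.
        return sum((c - r) ** 2 for c, r in zip(calc, reference_energies))
```

Swapping in a `GromacsEngine` or `TinkerEngine` would then require no changes to the Target classes at all.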
Attempt to edit the function get_multipoles in openmmio.py by looping over all copies of the system in an RPMD simulation.
I notice the getSystemMultipoleMoments function is used for AmoebaMultipoleForce, but not for the Coulomb force. I'm wondering if the multipole moments can instead be calculated by the ordinary method from the coordinates, because the context argument of getSystemMultipoleMoments cannot possibly retrieve the positions for all copies of the system.
This is a minor issue in the refactored _exec() method. The new _exec() has the ability to simultaneously follow the stdout / stderr streams, print them to the terminal, store them in a list, and print them to a file.
However, many programs like Gromacs will print out progress reports with carriage returns, something like this:
"25% done\r50% done\r75% done\r100%done\n"
If I read from this stream using the readline() function, it will only print the entire string once the final newline character appears. On the other hand, if I use the read() function to read some number of bytes, then the stdout and stderr streams can get garbled.
The ideal solution would be for readline() to have the ability to return strings that end in \r as well as \n, but I don't know how to do this. The current workaround is an option to set "universal_newlines=True", but that can print out many extra newlines, which isn't really desirable. The main goal should be to reproduce the output of the running program.
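One possible workaround, sketched below: buffer the raw bytes ourselves and split on both terminators, rather than relying on readline(). The generator name is hypothetical and the input is modeled as an iterable of byte chunks (e.g. successive reads from a pipe):

```python
import re

def follow_lines(raw_chunks):
    """Yield pieces of output terminated by carriage return OR newline.

    Buffers raw byte chunks and splits on both terminators, so progress
    bars written with bare carriage returns are emitted as they appear.
    """
    buf = b""
    for chunk in raw_chunks:
        buf += chunk
        # Everything before the last terminator is complete; the remainder
        # stays in the buffer until more data (or EOF) arrives.
        *lines, buf = re.split(b"[\r\n]", buf)
        for line in lines:
            if line:
                yield line
    if buf:
        yield buf
```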
Okay, this is more of a question than a bug report, and I'm sorry for asking it here; I couldn't find the answer by reading the code/docs.
I am using study #7 (007_tip4p_STP) with a few modifications, using an `AbInitio` target. With that, I've tried reparametrizing the charge for a simple water model, and everything seems to be working perfectly; the optimization finishes after a couple of iterations.
I noticed that an mvals file is written at every iteration which contains the gradients, and that this is used as a direction for the next iteration.
Now, I have modified OpenMM's nonbonded code, introducing some new parameters (c6, c8, c10, c12). ForceBalance still does a perfect job reparametrizing the charges with the modified OpenMM code, and the gradients are not zero. But when I try to reparametrize the new parameters, it stops after the first iteration, and I realized the reason is that the gradient is 0.0 in the mvals file. So my question is: how are the gradients determined? Why do I get zero gradients?
Lately I've noticed a stochastic failure of one of my unit tests. I think this is due to the TINKER version update and changes to their geometry optimization code, but I can't prove it.
> test_engine.TestAmber99SB.test_optimized_geometries
>
> Traceback (most recent call last):
> File "/home/travis/miniconda/envs/forcebalance_env/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
> File "test/test_engine.py", line 139, in test_optimized_geometries
> File "/home/travis/miniconda/envs/forcebalance_env/lib/python2.7/unittest/case.py", line 561, in assertAlmostEqual
> raise self.failureException(msg)
> AssertionError: -61.8518871 != -61.820866873804967 within 0.001 delta : TINKER optimized energies do not match the reference
Try to add an extra function:

    def CopyCustomBondParameters(src, dest):
        dest.setEnergyFunction(src.getEnergyFunction())
        for i in range(src.getNumGlobalParameters()):
            dest.setGlobalParameterName(i, src.getGlobalParameterName(i))
            dest.setGlobalParameterDefaultValue(i, src.getGlobalParameterDefaultValue(i))
        for i in range(src.getNumPerBondParameters()):
            dest.setPerBondParameterName(i, src.getPerBondParameterName(i))
        for i in range(src.getNumBonds()):
            dest.setBondParameters(i, *src.getBondParameters(i))

and modified CopySystemParameter by adding 'CustomBondForce' to Copiers.

Making a test run on a q-TIP4P/F water optimization (where the OH bond energy is a 4th-degree polynomial), I get the following error message:

    return _openmm.CustomBondForce_setPerBondParameterName(self, *args)
    TypeError: CustomBondForce_setPerBondParameterName() takes exactly 3 arguments (4 given)

When I substitute

    dest.setPerBondParameterName(i, src.getPerBondParameterName(i))

with

    dest.setPerBondParameterName(src.getPerBondParameterName(i))

I get a message saying I only gave 2 arguments, but i is just an integer.
Since we made the transition to logging, the "nifty.py" file is no longer completely standalone (i.e. it now depends on some aspects of the ForceBalance package through init). Since "nifty.py" is used in both the forcebalance and nanoreactor packages, the files are beginning to diverge between the two packages.
Because of this, I need to make the "nifty.py" file standalone again.
I will create a default logger if the package name is not "forcebalance", and I might revert back to using global variables for the WORK_QUEUE and WQIDS objects. It would be best not to go back to global variables, so if anyone has ideas please let me know.
When installing ForceBalance on Mac OS, the following error occurs when attempting to build the contact extension:
clang -bundle -undefined dynamic_lookup -Wl,-F. -arch i386 -arch x86_64 build/temp.macosx-10.8-intel-2.7/ext/contact/contact.o build/temp.macosx-10.8-intel-2.7/ext/contact/contact_wrap.o -o build/lib.macosx-10.8-intel-2.7/forcebalance/_contact_wrap.so -lgomp
ld: library not found for -lgomp
clang: error: linker command failed with exit code 1 (use -v to see invocation)
error: command 'clang' failed with exit status 1
In order to get it to compile, I changed the compiler and linker arguments based on the operating system, so it no longer looks for "-lgomp" when it compiles. It shouldn't matter so much whether contact is compiled with OpenMP libraries or not, since this part is only used in building the molecular connectivities. This is a temporary workaround however, and it would be best if the linker could actually find where the libraries are.
Currently, FB includes kinetic energy terms for the gas and liquid phases when computing the heat of vaporization. I am pretty sure these terms analytically cancel in classical mechanics, so they should be removed from the calculation.
This issue may also extend to the calculation of thermal expansion coefficients and heat capacities, where the kinetic energy term shows up in the expressions but we may have analytic expressions for it. We should think about whether the proposed fix for ΔHvap should also extend to these other properties.
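To spell out the cancellation (a sketch of the argument, assuming classical statistics and the ideal-gas approximation for the pV term, not the exact FB expression): the per-molecule heat of vaporization with kinetic terms included is

```latex
\Delta H_{\mathrm{vap}}
  = \Big(\langle U_{\mathrm{gas}}\rangle + \langle K_{\mathrm{gas}}\rangle\Big)
  - \frac{1}{N}\Big(\langle U_{\mathrm{liq}}\rangle + \langle K_{\mathrm{liq}}\rangle\Big)
  + RT
```

By classical equipartition, the average kinetic energy per molecule depends only on temperature (and the number of degrees of freedom, which is the same in both phases for the same molecular model), so $\langle K_{\mathrm{gas}}\rangle = \langle K_{\mathrm{liq}}\rangle / N$ and the kinetic terms drop out:

```latex
\Delta H_{\mathrm{vap}}
  = \langle U_{\mathrm{gas}}\rangle
  - \frac{1}{N}\langle U_{\mathrm{liq}}\rangle
  + RT
```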
Email snippet:
Here is an attempt at force field optimization to general experimental target data (quantities) using ForceBalance and Gromacs. The new target is called thermo.py and quantities are defined in quantity.py. I have taken bits and pieces from liquid.py and npt.py to put it all together, but stripped some functionality, most notably the MBAR stuff and the work queue.

I think the code looks great.
- In terms of organization, "quantity.py" looks engine-independent; do you think it's worthwhile to move the Gromacs-specific parts of quantity.py into gmxio.py as functions or engine methods? You have a bit of Gromacs-specific code for heat of vaporization, but for other properties such as the thermal expansion coefficient, isothermal compressibility, etc., it could come up again. More generally, the extraction of time series from the simulation (densities, potential energies) is engine-dependent, but their processing in quantity.py is engine-independent.

It's definitely worthwhile and is probably the best solution in the end. My problem was that I am not very familiar with the other engines (OpenMM and TINKER) compared to Gromacs. The point is essentially what you stress above:
- Run a series of MD simulations and extract the results as time series. (engine-dependent)
- Process the time series and combine them into some quantity for comparison with reference data. (engine-independent)

It is not always known what time series are needed to calculate an arbitrary quantity, but for most cases common properties will be enough (energies, volumes, pressures, ...).

You're right, I haven't completely figured this out. There aren't that many scalar time series - g_energy covers most of them, and in most cases they are ensemble- or energy-related (instantaneous temperature, pressure, volume, potential, energy components). Some other time series (for example, order parameters) depend on the geometry, and the user should implement code to calculate them.
I can see a conceptual division between the "time-series" or "instantaneous observable" described above, and your "quantity" class which is a statistical property. The user might have to make this distinction when implementing their properties.
Hi,
I have implemented a custom non-bonded force in OpenMM and am now trying to optimize it with ForceBalance. However, when run with my custom force field and targets, I get the error:
File "/usr/lib64/python2.7/site-packages/numpy/core/numeric.py", line 183, in ones
a = empty(shape, dtype, order)
TypeError: an integer is required
This is the XML I'm using:
<ForceField>
<AtomTypes>
<Type mass="15.99943" element="O" class="OW" name="tip3p-O"/>
<Type mass="1.007947" element="H" class="HW" name="tip3p-H"/>
</AtomTypes>
<Residues>
<Residue name="HOH">
<Atom name="O" type="tip3p-O"/>
<Atom name="H1" type="tip3p-H"/>
<Atom name="H2" type="tip3p-H"/>
<Bond to="1" from="0"/>
<Bond to="2" from="0"/>
</Residue>
</Residues>
<HarmonicBondForce>
<Bond k="462750.4" length="0.09572" class2="HW" class1="OW"/>
</HarmonicBondForce>
<HarmonicAngleForce>
<Angle k="836.8" class2="OW" class1="HW" angle="1.82421813418" class3="HW"/>
</HarmonicAngleForce>
<NonbondedForce lj14scale="0.5" coulomb14scale="0.833333">
<Atom type="tip3p-O" epsilon="0" sigma="0" charge="-0.834"/>
<Atom type="tip3p-H" epsilon="0" sigma="0" charge="0.417"/>
</NonbondedForce>
<BuckinghamForce bondCutoff="3" energy="B1*B2*(eps/(1-(6/gamma)))*((6/gamma)*exp(gamma*(1-r/sig))-((sig/r)^6)); sig=sqrt(sig1*sig2); gamma=sqrt(gamma1*gamma2); eps=sqrt(eps1*eps2)">
<PerParticleParameter name="B"/>
<PerParticleParameter name="eps"/>
<PerParticleParameter name="sig"/>
<PerParticleParameter name="gamma"/>
<Atom type="tip3p-O" parameterize="eps,sig,gamma" gamma="12.75" sig="3.55e-1" eps="0.65" B="1"/>
<Atom type="tip3p-H" gamma="1" sig="1" eps="1" B="0"/>
</BuckinghamForce>
</ForceField>
My only targets are the gas and liquid pdb files from forcebalance/studies/005_iamoeba/targets and a data.csv with:
T P MBAR Rho Rho_w Hvap Hvap_wt
298.15 1.0 atm TRUE 997.045 1 43.989 1
I'm not sure what is causing this error; the other examples in the studies directory seem to work fine. Any help would be greatly appreciated.
Many Thanks,
Alex
- In your code, the quantities are calculated in "md_chain.py", and "thermo.py" reads them in after the calculation is complete. This is different from how "liquid.py" and "npt.py" do the calculation; "npt.py" saves the time series of densities, energies, etc., and the quantities are calculated in "liquid.py". I see the merit of this approach - it is compact and straightforward to implement. However, it doesn't allow us to use the time series from multiple state points to calculate a quantity (this is useful for MBAR; I'm not sure what else in the future.)

Because of this, I would prefer for "md_result.p" to contain time series of simulation variables (e.g. densities, energies, and energy derivatives for each simulation) rather than the quantities themselves, and then "thermo.py" would calculate the quantities. Instead of specifying quantities on the command line, "thermo.py" would tell "md-chain.py" which simulations to run (e.g. liquid, gas, interface, or something.) What do you think?

Sounds good! The idea with md-chain.py is that a number of different simulations (an "md chain") are required to calculate a single quantity, as in e.g. enthalpy differences (solvation enthalpy, adsorption enthalpy, vaporization enthalpy). If you can have a simple way to specify the chain, then md-chain.py could return a matrix of time series where each column corresponds to one simulation in the chain.

I agree here - the simulations in the chain are separate from the quantities. A single quantity might use data from multiple simulations, and a simulation may contribute data to multiple quantities. How about we specify the simulation names in the arguments to md-chain.py ("liquid", "gas")? The quantities need to know the names of the simulations that need to be run, and all of the simulations will be sent to md-chain.py.
Hi,
I am running a ForceBalance optimization with 369 TIP4P water molecules, a custom non-bonded force, and:
liquid_eq_steps 50000
liquid_md_steps 500000
gas_eq_steps 50000
gas_md_steps 500000
liquid_timestep 2.0
liquid_interval 0.2
However, it takes around 24 hours to finish one optimization step. Does ForceBalance support parallelization to help reduce this time? If not, is there anything else I could do to speed this up?
Many Thanks,
Alex
For future projects - especially the protein force field - it will become essential to have the ability to evaluate targets remotely.
For example, we might have 441 dihedral energy scan targets, where each target evaluates the potential energy over a dihedral energy scan of a tripeptide, e.g. ACE-X-Y-NME. The number 441 comes from 21 x 21 choices of X and Y: the 20 amino acids, plus the HIE and HID protonation states of histidine counted separately. There might be even more if we consider the protonation states of the other side chains.
The reasons for wanting to run these calculations remotely are:
The calculations are computationally intensive. For a tripeptide, we need to do at least three two-dimensional scans; if we choose a grid spacing of 15 degrees, it would amount to 1875 calculations for a single-point evaluation, and ~200 times as many (equal to the number of parameters times two) for an analytic derivative of the objective function. If we have 441 tripeptides then this number goes into the hundreds of millions.
We cannot store a large number of CUDA contexts on a single host. Each GPU has an upper limit of around 20 CUDA contexts, and each target (i.e. each tripeptide) contains one CUDA context.
The calculations are trivially parallel. The targets do not depend on one another, so we should be able to evaluate them independently in order to maximize our utilization of the available resources.
Here are some of my early thoughts for the design.
We first run the main ForceBalance calculation on the local machine. The ForceBalance calculation will contain a "Work Queue object", which opens a port on the local machine and listens for connecting "workers". Next we submit a number of "worker" processes on the remote machines, and tell them the hostname and port where they need to connect. Each worker process will connect to the main ForceBalance calculation running on the local machine, receive the command and data needed to evaluate a component of the calculation, and then send the result back.
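The manager/worker pattern described above can be sketched with plain sockets (a toy illustration of the design only, not the actual CCTools Work Queue API): the manager listens on a port, each worker connects, receives a task, and sends its result back.

```python
import socket
import threading

def manager(server_sock, n_tasks):
    """Accept one worker per task, hand out a task id, collect the result."""
    results = []
    for i in range(n_tasks):
        conn, _ = server_sock.accept()
        conn.sendall(str(i).encode())        # send the worker its task id
        results.append(int(conn.recv(64)))   # collect the worker's result
        conn.close()
    return results

def worker(host, port):
    """Connect to the manager, 'evaluate' the received task, report back."""
    with socket.create_connection((host, port)) as s:
        task = int(s.recv(64))
        s.sendall(str(task * task).encode())

server = socket.socket()
server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
server.listen()
port = server.getsockname()[1]

threads = [threading.Thread(target=worker, args=("127.0.0.1", port))
           for _ in range(3)]
for t in threads:
    t.start()
results = manager(server, 3)
for t in threads:
    t.join()
server.close()
print(sorted(results))  # prints [0, 1, 4]
```

The real Work Queue layer adds what this toy omits: file staging, task retries, and workers that persist across many tasks.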
A few decisions need to be made regarding the interface:
Anyway, these are my thoughts and I'd like to hear your opinions.
Right now, parameter names like VDWS and VDWT (these are usually sigma and epsilon) are shared between the AMBER .frcmod and GMX .itp formats. These parameters also have hard-coded rescaling factors in forcefield.py. The code is:
elif match('BONDSB|UREY_BRADLEYB|MORSEB|VDWS|VPAIRS|VSITE|VDW_BHAMA|VPAIR_BHAMA', termtype):
    # nm to atomic unit
    rsfactors[termtype] = 0.05291772
...
elif match('VDWT|VDW_BHAMB|VPAIR_BHAMB', termtype):
    # VdW well depth; using kT. This was a tough one because the energy scale is so darn small.
    rsfactors[termtype] = kb*Temperature
The problem is that these are GROMACS units, so the default rescaling factors are wrong when using AMBER targets. This can be worked around by adding a priors section in the FB input file where AMBER targets are used, so that the units are converted to kcal/mol:
priors
   VDWS : 5.29177e-01
   VDWT : 5.92481e-01
/priors
At the same time, forcefield is largely independent of target / engine, so I don't think forcefield should figure out what engine is being used in the targets. For example, we could also mix force field formats and engines in a single job, as was done for iAMOEBA.
This is part of a longer-term issue regarding the need for a standard system of parameter names and units. Since this would break backward compatibility, it should be part of release 2.0.
For now I edited the energy estimator so that it includes three different quantities:
cent_kinetic: the virial kinetic energy calculated by the centroid estimator.
pri_kinetic: the virial kinetic energy calculated by the primitive estimator.
kinetic: the kinetic energy in the classical sense; that is, dividing it by the number of degrees of freedom and the appropriate constants should average to the thermostat temperature.
Question:
Is there any way to record these quantities during the MD simulation and output the average, the standard deviation, and perhaps the derivative with respect to the force field parameters? I know most of them are calculated in liquid.py, but that part is simulation-engine independent.
Right now ForceBalance is a shell script that calls ForceBalance.py, and also does the following: if ForceBalance.py has a nonzero exit status or if the error file contains any messages, it shows the last few lines of the error file to indicate a potential error message.
I'm thinking of merging the shell script entirely into the Python script, so that ForceBalance and ForceBalance.py are the same file. This would simplify things from the user's perspective, but it might also require a fair bit of stream handling to replicate all the functionality of the shell script.
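The wrapper's error-reporting behavior could be folded into a single Python entry point along these lines (a hedged sketch; the function name and error-file convention are hypothetical, not the actual ForceBalance interface):

```python
import subprocess
import sys

def run_and_report(cmd, errfile="opt.err", tail=10):
    """Run cmd with stderr captured to errfile; if the exit status is
    nonzero or the error file is non-empty, echo its last few lines,
    mirroring what the shell wrapper does today."""
    with open(errfile, "w") as err:
        ret = subprocess.call(cmd, stderr=err)
    with open(errfile) as err:
        lines = err.readlines()
    if ret != 0 or lines:
        sys.stderr.writelines(lines[-tail:])
    return ret
```

With something like this, ForceBalance and ForceBalance.py could share one file, at the cost of owning the stream handling in Python.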
The different points in the reference data set do not have to correspond to different values of the ensemble parameters; e.g., for the enthalpy of adsorption you can have Δh_vap(n_H2O), where each point (each value of n_H2O) is at the same temperature T and pressure p. In this sense T and p are the only two "special" variables (plus μ in a grand canonical ensemble) that need to be specified for each point in the reference data. T, p, and possibly μ are fixed control parameters that define the ensemble.

That's a good point. Going back to your original application, the points in the reference data set might have the same ensemble parameters but different topologies. Thus far it's been convenient for me to identify a point in the reference data set using T and p, but that is tied to my previous applications (physical properties of water at different temperatures and pressures).
I think it's possible to have it both ways. How about having an optional "name" column in expset.txt, so that the user may name directories as they please - but if the column is missing, then the folders are numbered numerically by default?
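For instance, the optional name column might look like this (a hypothetical sketch of expset.txt; the column layout and values are illustrative only, not the actual format):

```
name      T (K)    P (atm)   Rho (kg/m^3)
ambient   298.15   1.0       997.045
cold      273.15   1.0       999.8
```

If the name column is omitted, the folders would fall back to numeric names (001, 002, ...), preserving the current behavior.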
I wonder if there is any fork that already has Python 3.6 support for ForceBalance, like here: https://www.digitalocean.com/community/tutorials/how-to-port-python-2-code-to-python-3
Or maybe it's not worthwhile, and it's better to use conda and Python 2.7?
Tobias
The proposed feature would provide an option to run multiple copies of the simulations corresponding to a single experimental data point in Liquid, Lipid, and Thermo targets.
The idea is really simple - by running multiple simulations that differ only in their initial conditions, we should be able to trivially parallelize the accurate estimation of properties and their derivatives as long as resources are available.
In terms of organization, this could be specified in data.csv or expset.txt.
As always, please let me know what you think.
Currently the molecular_dynamics() method (implemented by me) returns a few specific pieces of information, such as the time series of energies, densities, and so on. This wasn't general enough for the Thermo target, so md() was added by Erik. At some point I would like to merge the two and create a more general method for running molecular dynamics simulations and returning properties.
This thread discusses the possibility of using MDTraj to load and process trajectories in ForceBalance.
Currently I have a standalone file molecule.py, containing the Molecule object that performs all of my trajectory processing and file conversion. It is easily editable for random tasks that crop up in my research, and it works in the way I intuitively expect it to - thus it's an essential piece of code for my research. On the other hand, MDTraj offers excellent things such as:
I think one possible solution is to have MDTraj as an optional dependency of molecule.py. It will work something like this:
1. When I create a Molecule object, it reads in the trajectory with MDTraj if possible, and it will then contain a Trajectory object.
2. Certain operations will check for the existence of the Trajectory object using if statements.
Ideally the number of times I do (2) will be on the low side, because having too many if statements could become unmanageable. This is partially because I'm trying to have it both ways - getting the improved performance and features of MDTraj while keeping my ability to rapidly prototype and hack molecule.py. I think it's a reasonable compromise.
Additionally, this will provide a route towards tighter integration down the road, if desirable. molecule.py obviously has some overlap with MDTraj, and as time goes on I can start to implement the non-overlapping functionality from molecule.py into MDTraj.
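The optional-dependency idea might look something like this (a simplified sketch; the real Molecule class carries far more state, and the rmsd method is just one illustrative operation):

```python
# Try the optional backend once at import time.
try:
    import mdtraj
    HAVE_MDTRAJ = True
except ImportError:
    HAVE_MDTRAJ = False

class Molecule:
    """Simplified stand-in for molecule.py's Molecule object."""
    def __init__(self, fnm=None):
        self.traj = None
        if fnm is not None and HAVE_MDTRAJ:
            # Keep an MDTraj Trajectory alongside the native representation.
            self.traj = mdtraj.load(fnm)

    def rmsd(self, other):
        # One of the "if statement" checks from point (2): fast path
        # through MDTraj when the Trajectory object exists.
        if self.traj is not None and other.traj is not None:
            return mdtraj.rmsd(other.traj, self.traj)
        raise NotImplementedError("this operation requires the optional MDTraj backend")
```

Operations that never touch self.traj keep working without MDTraj installed, which preserves the rapid-prototyping workflow.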
Description:
When parameterizing a water model using Liquid targets, the OpenMM engine gives an error when loading liquid.pdb:
ValueError: Residue 4 (HOH) does not match any template defined by the ForceField.
To reproduce the error:
Decompress the zip file, go into the folder openmm_residue_match_issue, and run python npt.py openmm 273.15 1.00. I will call this folder the "debug folder" below.
Initial investigation into the source of the error:
Three versions of liquid.pdb were compared.
1. /forcebalance/studies/005_iamoeba/targets/Liquid/liquid.pdb, renamed to orig_liquid.pdb in the debug folder. Using this file as liquid.pdb will trigger the residue matching error starting from Residue 1.
2. backup.pdb in the debug folder. This version of liquid.pdb uses HETATM entries and has the CONECT records for water deleted, which I think is not the cause of the error. Using this file as liquid.pdb will trigger the residue matching error starting from Residue 1.
3. liquid.pdb in the debug folder. Here the residue match error occurs at the 4th residue. Compared to the 3rd residue, the only difference is that the atom names are changed from O H1 H2 to O1 H2 H3, and despite the provided CONECT record, the residue matching error is still triggered.
A workaround for this error is to avoid the atom names O1 H2 H3 for water residues in the PDB file. Further investigation is needed to fully understand this error. It did not occur with a previous version of OpenMM (7.0), but happens with the current version, OpenMM 7.2.
I was doing some development on my laptop, which has a different environment from my workstation (the laptop (Mac OS 10.9) uses the Anaconda distribution, whereas the workstation (Ubuntu 12.04 LTS) uses the system Python). On my laptop, the extra logging messages are being printed to the stderr stream, which is confusing:
INFO:forcebalance.thermo:comm "# Experimental data for liquid, bromine."
INFO:forcebalance.thermo:empt
INFO:forcebalance.thermo:head Temp (K) Density (kg/m^3) w Hvap (kJ/mol) w Pressure (bar)
INFO:forcebalance.thermo:data 298.15 3102.8 1 29.96 1 1.01325
Does Anaconda modify the behavior of the logger in some way? (In general I don't know enough about how loggers work to troubleshoot them effectively.)
Suggestion from Han:
Currently ForceBalance uses a global variable that represents which iteration it's on. It would be good if this global were removed, but this is nontrivial. The ITERATION variable is often used to determine which folder files are written to or read from, and it would greatly increase the flexibility of those functions if they took the directory name as an argument instead.
The current documentation in PDF (http://leeping.github.io/forcebalance/doc/ForceBalance-Manual.pdf) and HTML (http://leeping.github.io/forcebalance/doc/html/installation.html) says that ForceBalance requires Python version 2.7; it should be updated to Python 3.6.
Hi Arthur,
Tonight I was using _exec() in an external script and I noticed it wasn't generating output anymore. I could add a RawStreamHandler to the logger in my external script, but this is a bit cumbersome. Perhaps the default behavior of a logger should be to have a RawStreamHandler to stdout. Maybe something like this at the top of each module file?
import sys
from forcebalance.output import getLogger, RawStreamHandler, INFO

logger = getLogger(__name__)
if len(logger.handlers) == 0:
    logger.addHandler(RawStreamHandler(sys.stdout))
    logger.setLevel(INFO)
@kmckiern @ahvigil @jisraeli @yudongqiu
ForceBalance is currently licensed under GPL v2, and I'd like to change it to the 3-clause BSD license. My understanding is that the new license would make it easier for folks in industry to collaborate and use our code.
Below is a copy of the proposed license, where I've pasted the BSD 3-clause license template and filled in the authors and contributors. Please let me know if you have any suggestions or concerns. My plan is to finalize this change in 1-2 weeks.
Copyright 2011-2015 Stanford University and the Authors
Copyright 2015-2018 Regents of the University of California and the Authors
Developed by: Lee-Ping Wang
University of California, Davis
http://www.ucdavis.edu
Contributors: Keri A. McKiernan, Arthur Vigil, Yudong Qiu, Erik G. Brandt, Johnny Israeli
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
This is needed for any force field that uses Custom Forces.
Please provide input if you are interested.
When ForceBalance encounters a bad optimization step due to a noisy objective function and gradient (they contain statistical properties, which are noisy), it rejects the step, recalculates the gradient and Hessian, and takes a reduced-size step.
However, the gradient and Hessian from the repeated step are just as noisy as in the initial step. Also, the noise term remains constant as the objective function converges, so towards the end of the optimization most of the steps are bad. Many of my optimizations are terminated by hand when they take too many bad steps, or I restart the optimization with more MD steps - neither solution is optimal.
We should be able to leverage the information from multiple calculations to improve the optimization step, since they were carried out at the same parameter values. The simplest thing to do is to average the objective function, gradient, and Hessian values at the "revisited" parameter values, but I doubt this is the strictly correct thing to do.
For example, a simple calculation shows that when the property has statistical noise (X + Δ), the objective function (X + Δ - E)^2 is on average larger than the noiseless function by an amount <Δ^2>. This is obvious when the true property average agrees with experiment exactly - the objective function should be zero, but it never will be.
When the property does not agree with experiment exactly, the objective function might be higher or lower than the true value, but on average it is higher. The variance of the objective function is something like this:
<Δ^4 + 6Δ^2(X - E)^2>
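The bias claim above is easy to verify numerically (illustrative values; the noise is assumed Gaussian with standard error sigma):

```python
import numpy as np

# Monte Carlo check that E[(X + d - E)^2] = (X - E)^2 + sigma^2,
# i.e. noise inflates the objective function by <Delta^2> on average.
rng = np.random.default_rng(0)
X, E, sigma = 997.0, 998.0, 0.5          # property, reference, noise level
delta = rng.normal(0.0, sigma, size=1_000_000)

noisy_obj = ((X + delta - E) ** 2).mean()   # average noisy objective
expected = (X - E) ** 2 + sigma ** 2        # true objective + bias
assert abs(noisy_obj - expected) < 0.01
```

The same simulation with X = E shows the objective averaging sigma^2 instead of zero, matching the "never exactly zero" observation.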
The standard error of the property, Δ, should probably not be used to correct our objective function because it cannot be precisely evaluated; estimates of standard errors tend to be less precise than estimates of means, so it would worsen our estimate rather than improve it.
The main point is that by averaging multiple objective function values, we don't get the same objective function that would have resulted from a single calculation with proportionally longer trajectories; we would get a lower value from the latter. But we can't compare this lower value with the next optimization step because it will have the full noise again. In fact, the "averaged objective function" from multiple steps is probably the better comparison value for future optimization steps.
What about the gradient and Hessian? Simple calculations assuming <ฮ>=0 indicate that averaging the gradient over multiple calculations indeed converges to the true gradient.
More onerous are the Hessian diagonal elements. They are on average increased by an amount <Δ_i^2> (i.e. the squared standard error in the i-th component of the property gradient). This means that on average we are taking smaller optimization steps than we should, and this error would persist through averaging multiple Hessians. Taking too small an optimization step is a bad thing, because small steps are more likely to be within the statistical noise.
So what should we do here? Possible solutions include:
1. Improving estimates of the Hessian through averaging should be done at the property calculation (i.e. "target") level, where the quantity being squared and summed is accessible - so that the Hessian diagonal elements are not inadvertently increased by the statistical noise.
2. Improving estimates of the objective function through averaging should be done at the optimization level, because they would incorporate the positive bias which subsequent optimization steps would share.
3. Each time the optimizer takes a step back and recalculates the objective function, subsequent evaluations of the objective function should increase the number of MD steps by a proportional factor. This would remove the requirement for (2), so all of the "averaging" calculations may be done at the target level. The downside is that the optimization time would blow up. We might recover some efficiency by reducing bias in the Hessian diagonal elements, but still...
One question: when I repeat the liquid properties calculation at the same parameters after rejecting an optimization step, do I need to rerun MBAR for the "doubled" collection of energies, or can I just take their arithmetic mean? I don't think MBAR scales favorably with very large numbers of snapshots (I haven't run timings, just an impression from experience).
Let me know what you think. Thanks!
Starting a new topic because this focuses on how the experimental data files are stored on disk (these files are created / edited by the user), rather than how they are parsed or stored in memory. Note that we can't easily go the other way (i.e. convert from the internal representation to data tables on disk), but that's presently not an intended use case.
If you are thinking of a nontrivial data type, please mention it here and we can discuss how it can fit into this framework.
Context and Purpose: In ForceBalance, I presently have an existing, fairly rigid format for storing liquid and lipid bilayer properties (Liquid and Lipid targets), and a simple format for specifying general thermodynamic properties (Thermo target) as contributed by @ebran.
The following spec attempts to encompass more types of physical / chemical experimental data for force field parameterization, so that we can incorporate a larger number of properties into the Thermo target. The format is intended to be highly flexible for different data types without compromising on simplicity.
- Comments begin with the pound (#) character; it and subsequent fields are ignored.
- Metadata: the variable name must be followed by an equals sign (=), otherwise the line will not be recognized as metadata. Subsequent fields are the (case-sensitive) variable values. If any of the value fields starts with # (comment), then it and subsequent fields are all ignored. Examples of metadata are discussed below.
- Initial conditions and simulation settings for the data points are stored in folders under targets.
- Column headers take the form observable (unit) (with optional space). The unit in the column header is the default for the whole column; if no units are provided, a hard-coded conventional unit will be used. An example of a unit would be 10^-4 K^-1. (Note: We should use the SimTK unit conversion system to do unit conversions, but it would be nice if we could map the abbreviated units to the full unit names. I also have issues with bundling unit with ForceBalance because these units are incompatible with the existing simtk.unit.) Also note: observable (unit) counts as a single field, so the fixed-width parser must account for this.
- Some observables need auxiliary data: for deuterium order parameters, the list of n carbon atom names and the list of n-2 order parameters are needed. For a radial distribution function, it may be the list of atom separations, the two atom name selections used to calculate the RDF, or the list of RDF values. The data may be entered as multiple rows, in which case the other rows can be empty. Alternatively, a file reference may be provided in the form other_file:column, where the values are read from that column of the other file (numbered from one). Note that the number of entries must be consistent (i.e. the number of atom separations must equal the number of RDF values, and the number of carbons must equal the number of deuterium order parameters + 2).
- A column name may be prefixed with observable_ to override this behavior and apply to the chosen observable instead.
- Uncertainties go in a column named observable_sigma, with one uncertainty per row. Note: This may be modified to specify the experimental and simulation uncertainty separately. My experience is that calculating the simulation uncertainty during the optimization leads to stability issues, so pre-computing the values is recommended.
- … the Target class.

Thought it might be a good idea to talk about this on here.
Currently the simulations run using the OpenMM engine in ForceBalance are a bit inflexible. I think we could add flexibility in a stylistically consistent way if we allow the user to include elements of an OpenMM Python script in the target directory, similar to how Gromacs liquid simulation defaults are overwritten by a liquid.mdp file. So there would be defaults in npt.py, and then anything in the targets/targetx/sim.py file would take precedence.
Maybe, to prevent the need for super-general parsing, we could fix a few variables, and the sim.py script would be required to use these variables? Then we could write a template script and the user could add attributes.
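One way the fixed-variable scheme could work (sim.py is the proposed filename from above; the variable names and defaults here are hypothetical, not real ForceBalance options):

```python
# Defaults that would live in npt.py (illustrative names/values).
defaults = {
    "nonbonded_cutoff": 0.9,   # nm
    "timestep": 1.0,           # fs
    "integrator": "langevin",
}

def load_target_overrides(defaults, sim_py_path):
    """Execute a per-target sim.py and let its assignments to a fixed
    set of known variable names take precedence over the defaults.
    Unknown names are ignored, so no general parsing is needed."""
    namespace = {}
    with open(sim_py_path) as f:
        exec(f.read(), namespace)
    merged = dict(defaults)
    for key in defaults:          # only the fixed, known variables
        if key in namespace:
            merged[key] = namespace[key]
    return merged
```

Restricting the merge to a whitelist of variable names keeps the template script simple while still letting each target override what it needs.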
The release schedule for ForceBalance is rather arbitrary, but soon will be a good time to update the version to 1.4 because of several new features. Also, I need to refresh my memory on how to generate the documentation.
Major features:
- Parameters defined with EVAL can now be functions of one another.
Minor features:
Here's what needs to be done first:
- Remove the ITERATION global variable
global variableA declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.