mjolnir-md / mjolnir Goto Github PK

View Code? Open in Web Editor NEW

31.0 8.0 6.0 11.25 MB

:zap: general purpose coarse-grained molecular dynamics simulation package

Home Page: https://mjolnir-md.github.io/Mjolnir/

License: MIT License

CMake 1.22% C++ 98.57% Python 0.17% Shell 0.04%

molecular-dynamics-simulation c-plus-plus-11 coarse-grained c-plus-plus science simulation openmp

mjolnir's People

Contributors

Stargazers

Watchers

Forkers

0h-n0 toruniina yutakasi634 0ncorhynchus zeta1999

mjolnir's Issues

move `simulator.parameters` to `simulator.integrator.parameters`

Currently, parameters for an integrator are given in [simulator] table. But the parameters are for the integrator, not for the simulator itself.

It might be confusing. By moving parameters from just under [simulator] to [simulator.integrator], it becomes clearer.

To do this, we also need to move integrator value to integrator.type because the type of integrator becomes a toml::table.

So the [simulator] table will be like this.

[simulator]
type = "Molecular Dynamics"
integrator.type = "Underdamped Langevin"
integrator.parameters = [
    {index = 0, gamma = 0.1},
    # ...
]

Rename variable 2

What about changing "chain" to group and "group" to selection?
chain -> group
group -> selection

The reason is that sometimes the user does not have a "chain" but rather a more "abstract" (or complex) structure.

Explicitly set `PeriodicGaussian` for `DihedralAngleInteraction`

Currently, AngularGaussianPotential is used when users sets Gaussian for DihedralAngleInteraction as a part of LocalForceField, instead of normal GaussianPotential.
Unlike the GaussianPotential, AngularGaussianPotential considers the boundary at [-π, π] and periodically transposes the angle coordinate.

Some force fields requires this feature, but others does not. This violates the rule.

To do everything that a user wants and nothing more than that.

So, we need to add an option to use normal and periodic gaussian potential explicitly.
Also, it might become clearer to rename AngularGaussian as PeriodicGaussian.

time elapsed in the simulation

Currently, integrator (e.g. VelocityVerletStepper) returns the next time as time + dt;. It causes a numerical error. If an integer is used to count a number of steps, this problem does not happen.

This class interface is designed in order to allow to use adaptive stepsize integration algorithms (e.g. RKF). But it might be good to consider a different way to support them.

support PDNS interaction

enable Observer to control format after reading input file

add ObserverBase or something like that.

update toml11

v3.1.0 is released.

Multiple Basin Forcefield

I already have a branch that partially implement it and it seems to work, but it lacks some features.

It possibly requires some design change around ForceField.

The main difficulties are the following.

The default ForceField should be able to be exchanged with MultipleBasinForceField.
- MultipleBasinForceField correctly mix the forces from each basin
- These forcefield specific stuff should be separated from the default one
- Currently, all the simulators have the default ForceField. ForceFieldBase or something like that is needed.
Several distinct MultipleBasinForceField should be able to run simultaneously
- There can be two different proteins that have 2 states in a system
- It makes the input file and the implementation complicated. A sophisticated way is needed.
Energy and Forces should be calculated efficiently at the same time
- not so difficult (just add calc_energy_and_force() to the interactions), but requires huge efforts.

introduce particle_id_type instead of std::size_t

In most cases, std::uint32_t is enough to hold an index of particles (or even std::uint16_t is enough under some conditions). The size of std::size_t depends on platform, but it can be too large.

To reduce the memory usage in NeighborList or some other particle_index stuff, we can introduce particle_index_type and use it instead of std::size_t. If it make memory footprint smaller and simulation faster, it should be used. I think it is worth trying.

progress bar and log file

Currently it writes a progress bar by default, but on an environment that automatically redirects stdout and stderr to files, the file would be filled with tons of progress bars. It is not good.

A flag in the input file or something like that is needed to suppress outputting progress bar.

rename value in the input files

proposed changes:

simulator.scheme -> simulator.integrator

Organize multiple Observers

enable to set different observers that output different quantity in a different format each other.

We need to organize them in a convenient way

each simulator manages all the observer or
add Observer to manage all the observers and replace unique_ptr<ObserverBase> with it.

Maybe the latter idea is appropriate.

Consider moving to C++17

merge calc_force and calc_energy via boolean template parameter

It depends on #286 .

In C++17, we can use constexpr-if. It reduces the required effort to implement interactions.

use UpperCamelCase for all the input values

Most of the forcefield parameters use UpperCamelCase just like with the names of the classes. But only the variable simulator.type and simulator.integrator use spaces between words.

To avoid confusion, the format must be uniform. Make the values listed below UpperCamelCase.

"Molecular Dynamics" -> "MolecularDynamics"
"Steepest Descent" -> "SteepestDescent"
"Simulated Annealing" -> "SimulatedAnnealing"
"Underdamped Langevin" -> "UnderdampedLangevin"
"Velocity Verlet" -> "VelocityVerlet"

particle without charge

Currently, GlobalPotentials require that all the particles should have their own parameter(s).

But there might be a particle that doesn't have parameter, like a particle with no charge. They can have a zero value as their parameters (current way), or they can simply be ignored. There should be an efficient way to manage parameter-less particles in GlobalPotentials.

Implement both and compare the runtime performance.

enable to define a variable in forcefield

Generally, to pass a forcefield as an input file, we need to write many parameters. For most parameters, there are really few different values compared to the number of parameters need to be written. The reason is that some of the particles might represent the same atom or pseudo-atom (like a group of atoms).

It is probably good to enable to define a variable in a toml file. There are several reasons like the following.

one can explicitly represent some of the parameter values are the same
- i.e. one can ensure the values of the parameters means the same thing
one can reduce the input file size. It takes less effort to write.

Something like this. The format may change later.

[[forcefield.local]]
type = "BondLength"
potential = "Harmonic"
topology = "bond"
parameter.C-H_energy = 98.7
parameter.C-H_length = 1.09
parameters = [
    {indices = [0, 1], v0 = "C-H_length", k = "C-H_energy"},
    {indices = [1, 2], v0 = "C-H_length", k = "C-H_energy"},
    # ...
]

support PWMcos interaction

Add combination-rule class and simplify implementation in global potential

Currently, global potential class has two different roles.

Calculating force and potential
Storing parameters and generate parameters for a specific pair

The first one is similar to that of the local potential.
To make it concise, it is better to split global potential class into two.

rename T_begin/end

SimulatedAnnealingSimulator requires the following parameters.

[simulator]
schedule = "linear"
T_begin  = 300.0
T_end    =  10.0

I think the following format is clearer.

[simulator]
schedule.type  = "linear"
schedule.begin = 300.0
schedule.end   = 10.0

documents in English

needless to say.

enable to use tuple of potentials as a potential

Several potential function are possibly applied to the same interaction-unit (e.g. bond angle, dihedral angle, etc...).
Calculating same structural variable like dihedral angle twice is apparently needless overhead. But caching this kind of information somewhere may introduce complexity to the code. It is worth considering making tuple of potentials as a potential to be called by Interaction.
For example,

template<typename ... potentialTs>
class SetofPotentials
{
/*  some stuff...   */
  public:
    real_type calc_potential(const real_type x) const
    {
        return accumulate_potentials_recursively(pots_, x);
    }
  private:
    std::tuple<potentialTs...> pots_;
};

Is the name "Go1012Contact" clear?

It is not clear name. It should be "GoContact".

note: the name comes from the original implementation and it means "10-12 Lenanrd-Jones type potential".

add `calc_force_and_virial`

after adding virial calculation, the performance decreases ~15% in some cases. Maybe it is better to control the virial calculation, by adding calc_force_virial, calc_force_energy_virial, calc_force.

The implementation should be shared among those, by adding some template parameter to use constexpr if, as suggested in #291 .

DCD Header

Currently, when creating a DCD file, Mjolnir writes in the fourth byte of the first block the number 0:

00 - 04 --> 84 // Block size
05 - 08 --> "CORD" // Specifier
09 - 12 --> number // Number of frames
13 - 16 --> 0 // Previous steps if this is a continuation from another trajectory, currently is just 0
17 - 20 --> 0 // Number of steps between saves, currently is just 0
.
.
.
81 - 84 --> 24 // CHARMM version

Since version 24 is specified, when analyzing trajectories with other libraries like MDAnalysis, it fails to load because it tries to guess the "current time", but the number of "time steps between saves" (bytes 17 to 20) is 0. One solution would be to edit manually the output binary file (joke 😋). The other solution could be just passing this information when initializing the DCDObserver.

Currently, the constructor virtual function has the following signature:

// open files, write header and so on.                                       
virtual void initialize(const std::size_t total_step, const real_type dt, const system_type&, const forcefield_type&) = 0;

We could pass an extra argument, save_step_, by modifying the signature, but this use case is only relevant to DCD files. What could be the best way to pass this value to the DCDObserver without changing the base clase signature? I don't know which feature of c++ is the right one for this use case... 😞

Change MultipleBasin input format

The current forcefield implementation only allows one kind of ForceField, not a mixture of those.

determine the tolerance

Currently, the tolerances used in test codes are not fixed. Because it affects to the implementation of some funcitons, tolerance is needed to be determined.

store fastest speed in system

Most of spatial partition method including Verlet list and Cell list use margin to use it over some steps. The margin length decrease by largest deviation in each step. Currently, every spatial partition type searches fastest particle in the system (and the speed possibly not exact value at time t, but t+dt/2). It is an unnecessary overhead and able to be removed caching fastest speed at the time step in system class.

Rename variable

I think the equilibrium value (length, angle, etc.) of a potential should be called "var_eq" instead of "var_native" because "native" applies only when this value is taken from the native conformation of a biomolecule (crystal structure, etc.).

cmake option to skip tests

-DCOMPILE_TESTS=[ON|OFF]

cutoff ratio of excluded volume

magic number 0.4 is used in the function calc_force().

try mdbook or docsify for docs

Since they say gitbook CLI is no longer under active development.
If another option is easy to use, then use it. If not so much, try gitbook cloud service.

specialize ExternalDistanceInteraction to more concrete types

Currently, there is ExternalDistanceInteraction class and it receives Shape class such as AxisAlignedPlane. By doing this, we can encapsulate the detail of shapes. It means that to add a spherical cap, we need only a "Sphere" class that has a member function to calculate normal vector.

But it turned out that the scope of this class is a bit limited. For example, a normal vector of Box is always perpendicular to one of the planes of which the box is composed. But the force applied to a particle is not always parallel to the normal because a particle may interact with several planes.

Still, we can use ExternalDistanceInteraction, but I decided that I will gradually replace it with more concrete types such as PlanarSurfaceInteraction. It requires more effort and lines of codes, but sometimes (not always, of course) longer and specific code can be more helpful than abstraction.

enable to ignore inter- or intra-group interaction in global interaction

Currently, there are only 2 ways to ignore interactions between some particles, ignore.particles_within.bond = 3 and ignore.molecule = "Inter". The former one provides a way to ignore particles that already interact with each other through local forcefield such as a bond interaction. The latter provides a way to ignore inter- or intra-molecules. But a molecule is defined as a set of particles connected by bonds. So, if one wants to turn off the interaction between some groups that include several molecules, such as DNA (normally includes 2 molecules) and proteins, currently there is no way to do that.

The feature is needed.

remove deprecated features

There are some deprecated features. like:

Periodic Gaussian for dihedral
- just use Gaussian instead.
Harmonic for dihedral interaction

Since these have been marked as deprecated for long, these will be removed in the next release.

pass string to Interactions.dump_energy()

Currently, it returns a std::vector of interaction names and real_types, but it is better to pass a string to which the interaction writes directly.
It makes parallelization difficult, but the file output is not parallelized. There is no plan to parallelize a single file output.
The impact on the runtime efficiency is not so large, but it makes file output more flexible.

energy calculation mode

By enabling to load trajectories from a file, it can be achieved.

rename Stepper -> Integrator

class names should be the same as the name in input files.

enable to generate initial velocities

generate velocities according to the Maxwell–Boltzmann distribution if initial velocities are not provided.

Currently, the initial velocity is required in the input file but it is inconvenient in most cases. It is nice to have a way to automatically generate initial velocities.

enable to split an input file

Currently, input file contains everything that is needed to run a simulation. It is good to have an option to specify configuration, forcefield, simulator separately.

The input file format will be like this.

[[systems]]
filename = "system.toml"
[[forcefields]]
filename = "forcefield.toml"

move RandomNumberGenerator from Integrator to Simulator

Currently, Integrator has a RandomNumberGenerator (RNG) because that is the only thing that uses RNG. But RNG is heavily used in a wide variety of tools. In some cases other components such as Simulator itself need it. Having several RNGs (without any appropriate context) is confusing.

Now I think it is good to move RNG from Integrator to Simulator. I think the impact of this change on the efficiency is small, but if it turned out that it slows down the simulation drastically, I will re-consider the design.

enable to set positions of data points in FlexibleLocalAngle

FlexibleLocalAngle potential is a kind of table potential, so essentially it can have a completely different shape from the default values.

Upgrade toml11 from v1 to v2

related: #24

Pros

compatible with the latest version of TOML (v0.5.0)
great error messages
more powerful utility functions

Cons

I need to spend my time a bit

enable to choose output content

Currently, observer always outputs position and velocity. But one may don't need to have a distribution of velocities. Or one may want to have a time course of forces. So it is good to have an option to choose which values to be written.

The difficult thing is that the file content depends on the format. Some format can only have one vector per one particle, but others can have an arbitrary number of parameters per one particle. The input file format should be united. Thus the way to choose the file content should be considered carefully.

Something like this.

[files]
output.format = "xyz"
output.content = ["position", "velocity"]
# or
[files]
output.position = "xyz"
output.velocity = "xyz"

add interface to reduce margin

Currently, System stores largest_displacement and ForceField::update reduces margin in NeighborList releated classes. ForceField::calc_force automaticall calls update, so calculating force implies reducing margin.

It makes difficult to understand how the simulation runs. Also, it makes almost impossible to implement Monte-Carlo method. Although now I don't have a plan to implement MC simulations, there is no reason to limit the options unless it decreases readability.

By adding reduce_margin(real_type dmargin) to each ForceField, this issue can be resolved.

update sample input files

simplify the implementation of FlexibleLocalAngle

Currently, it initializes theta-related terms (min_theta, max_theta, dtheta, thetas) but the values are hard-coded.
It is good to ...

stop hard coding and make it configurable

make it constexpr

In the almost all the use cases, people seem to use the default parameter. Also, when users apply their own parameter, they modify only a energy part.
So I think the second one is nicer. The first one makes the input file too wide.

By making it configurable, we will get something like a table potential.
But if a table potential is needed, I think it is better to implement it independently from FlexibleLocal.

merge files related to the unit system

merge core/constant.hpp into core/Units.hpp.

simplify input for 3SPN2

Currently, the input for 3SPN2 is a bit complicated and is not easy to use (also, it is not documented at all).

It can be simplified a bit.