Non-bonded cut-off schemes¶
The default cut-off scheme in GROMACS 2019.6 is based on classical buffered Verlet lists. These are implemented extremely efficiently on modern CPUs and accelerators, and support nearly all of the algorithms used in GROMACS.
Before version 4.6, GROMACS always used pair-lists based on groups of particles. These groups of particles were originally charge-groups, which were necessary with plain cut-off electrostatics. With the use of PME (or reaction-field with a buffer), charge groups are no longer necessary (and are ignored in the Verlet scheme). In GROMACS 4.6 and later, the group-based cut-off scheme is still available, but is deprecated since 5.0. It is still available mainly for backwards compatibility, to support the algorithms that have not yet been converted, and for the few cases where it may allow faster simulations with bio-molecular systems dominated by water.
Without PME, the group cut-off scheme should generally be combined with a buffered pair-list to help avoid artifacts. However, the group-scheme kernels that can implement this are much slower than either the unbuffered group-scheme kernels, or the buffered Verlet-scheme kernels. Use of the Verlet scheme is strongly encouraged for all kinds of simulations, because it is easier and faster to run correctly. In particular, GPU acceleration is available only with the Verlet scheme.
The Verlet scheme uses properly buffered lists with exact cut-offs.
The size of the buffer is chosen with verlet-buffer-tolerance
to permit a certain level of drift. Both the LJ and Coulomb potential
are shifted to zero by subtracting the value at the cut-off. This
ensures that the energy is the integral of the force. Still it is
advisable to have small forces at the cut-off, hence to use PME or
reaction-field with infinite epsilon.
Non-bonded scheme feature comparison¶
All GROMACS 2019.6 features not directly related to non-bonded interactions are supported in both schemes. Eventually, all non-bonded features will be supported in the Verlet scheme. A table describing the compatibility of just non-bonded features with the two schemes is given below.
Table: Support levels within the group and Verlet cut-off schemes for features related to non-bonded interactions
Feature | group | Verlet |
---|---|---|
unbuffered cut-off scheme | default | not by default |
exact cut-off | shift/switch | always |
potential-shift interactions | yes | yes |
potential-switch interactions | yes | yes |
force-switch interactions | yes | yes |
switched potential | yes | yes |
switched forces | yes | yes |
non-periodic systems | yes | Z + walls |
free energy perturbed non-bondeds | yes | yes |
energy group contributions | yes | only on CPU |
energy group exclusions | yes | no |
OpenMP multi-threading | only PME | all |
native GPU support | no | yes |
Coulomb PME | yes | yes |
Lennard-Jones PME | yes | yes |
virtual sites | yes | yes |
User-supplied tabulated interactions | yes | no |
Buckingham VdW interactions | yes | no |
rcoulomb != rvdw | yes | yes |
twin-range | no | no |
Performance¶
The performance of the group cut-off scheme depends very much on the composition of the system and the use of buffering. There are optimized kernels for interactions with water, so anything with a lot of water runs very fast. But if you want properly buffered interactions, you need to add a buffer that takes into account both charge-group size and diffusion, and check each interaction against the cut-off length each time step. This makes simulations much slower. The performance of the Verlet scheme with the new non-bonded kernels is independent of system composition and is intended to always run with a buffered pair-list. Typically, buffer size is 0 to 10% of the cut-off, so you could win a bit of performance by reducing or removing the buffer, but this might not be a good trade-off of simulation quality.
The table below shows a performance comparison of most of the relevant setups. Any atomistic model will have performance comparable to tips3p (which has LJ on the hydrogens), unless a united-atom force field is used. The performance of a protein in water will be between the tip3p and tips3p performance. The group scheme is optimized for water interactions, which means a single charge group containing one particle with LJ, and 2 or 3 particles without LJ. Such kernels for water are roughly twice as fast as a comparable system with LJ and/or without charge groups. The implementation of the Verlet cut-off scheme has no interaction-specific optimizations, except for only calculating half of the LJ interactions if less than half of the particles have LJ. For molecules solvated in water the scaling of the Verlet scheme to higher numbers of cores is better than that of the group scheme, because the load is more balanced. On the most recent Intel CPUs, the absolute performance of the Verlet scheme exceeds that of the group scheme, even for water-only systems.
Table: Performance in ns/day of various water systems under different non-bonded setups in GROMACS using either 8 thread-MPI ranks (group scheme), or 8 OpenMP threads (Verlet scheme). 3000 particles, 1.0 nm cut-off, PME with 0.11 nm grid, dt=2 fs, Intel Core i7 2600 (AVX), 3.4 GHz + Nvidia GTX660Ti
system | group, unbuffered | group, buffered | Verlet, buffered | Verlet, buffered, GPU |
---|---|---|---|---|
tip3p, charge groups | 208 | 116 | 170 | 450 |
tips3p, charge groups | 129 | 63 | 162 | 450 |
tips3p, no charge groups | 104 | 75 | 162 | 450 |
How to use the Verlet scheme¶
The Verlet scheme is enabled by default with option cutoff-scheme
.
The value of mdp option verlet-buffer-tolerance
will add a
pair-list buffer whose size is tuned for the given energy drift (in
kJ/mol/ns per particle). The effective drift is usually much lower, as
gmx grompp assumes constant particle velocities. (Note that in single
precision for normal atomistic simulations constraints cause a drift
somewhere around 0.0001 kJ/mol/ns per particle, so it doesn’t make sense
to go much lower.) Details on how the buffer size is chosen can be
found in the reference below and in the reference manual.
For constant-energy (NVE) simulations, the buffer size will be
inferred from the temperature that corresponds to the velocities
(either those generated, if applicable, or those found in the input
configuration). Alternatively, verlet-buffer-tolerance
can be set
to -1 and a buffer set manually by specifying rlist
greater than
the larger of rcoulomb
and rvdw
. The simplest way to get a
reasonable buffer size is to use an NVT mdp file with the target
temperature set to what you expect in your NVE simulation, and
transfer the buffer size printed by gmx grompp to your NVE mdp file.
When a GPU is used, nstlist is automatically increased by gmx mdrun, usually to 20 or more; rlist is increased along to stay below the target energy drift. Further information on running gmx mdrun with GPUs is available.
Further information¶
For further information on algorithmic and implementation details of the Verlet cut-off scheme and the MxN kernels, as well as detailed performance analysis, please consult the following article: