Gromacs  2020.4
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Groups Pages
Typedefs | Enumerations | Functions
#include <string>
#include "gromacs/gpu_utils/devicebuffer_datatype.h"
#include "gromacs/gpu_utils/gpu_macros.h"
#include "gromacs/math/vectypes.h"
#include "gromacs/timing/walltime_accounting.h"
#include "gromacs/utility/arrayref.h"
#include "gromacs/utility/basedefinitions.h"
#include "gromacs/utility/real.h"
+ Include dependency graph for pme.h:
+ This graph shows which files directly or indirectly include this file:

Description

This file contains function declarations necessary for computing energies and forces for the PME long-ranged part (Coulomb and LJ).

Author
Berk Hess hess@.nosp@m.kth..nosp@m.se

Macros

#define GMX_PME_SPREAD   (1 << 0)
 Flag values that control what gmx_pme_do() will calculate. More...
 
#define GMX_PME_SOLVE   (1 << 1)
 
#define GMX_PME_CALC_F   (1 << 2)
 
#define GMX_PME_CALC_ENER_VIR   (1 << 3)
 
#define GMX_PME_CALC_POT   (1 << 4)
 
#define GMX_PME_DO_ALL_F   (GMX_PME_SPREAD | GMX_PME_SOLVE | GMX_PME_CALC_F)
 

Typedefs

using PmeGpuProgramHandle = const PmeGpuProgram *
 Convenience name.
 

Enumerations

enum  { GMX_SUM_GRID_FORWARD, GMX_SUM_GRID_BACKWARD }
 
enum  PmeRunMode { PmeRunMode::None, PmeRunMode::CPU, PmeRunMode::GPU, PmeRunMode::Mixed }
 Possible PME codepaths on a rank. More...
 
enum  PmeForceOutputHandling { PmeForceOutputHandling::Set, PmeForceOutputHandling::ReduceWithInput }
 PME gathering output forces treatment. More...
 

Functions

int minimalPmeGridSize (int pmeOrder)
 Return the smallest allowed PME grid size for pmeOrder.
 
bool gmx_pme_check_restrictions (int pme_order, int nkx, int nky, int nkz, int numPmeDomainsAlongX, bool useThreads, bool errorsAreFatal)
 Check restrictions on pme_order and the PME grid nkx,nky,nkz. More...
 
gmx_pme_t * gmx_pme_init (const t_commrec *cr, const NumPmeDomains &numPmeDomains, const t_inputrec *ir, gmx_bool bFreeEnergy_q, gmx_bool bFreeEnergy_lj, gmx_bool bReproducible, real ewaldcoeff_q, real ewaldcoeff_lj, int nthread, PmeRunMode runMode, PmeGpu *pmeGpu, const gmx_device_info_t *gpuInfo, PmeGpuProgramHandle pmeGpuProgram, const gmx::MDLogger &mdlog)
 Construct PME data. More...
 
void gmx_pme_destroy (gmx_pme_t *pme)
 Destroys the PME data structure.
 
int gmx_pme_do (struct gmx_pme_t *pme, gmx::ArrayRef< const gmx::RVec > coordinates, gmx::ArrayRef< gmx::RVec > forces, real chargeA[], real chargeB[], real c6A[], real c6B[], real sigmaA[], real sigmaB[], const matrix box, const t_commrec *cr, int maxshift_x, int maxshift_y, t_nrnb *nrnb, gmx_wallcycle *wcycle, matrix vir_q, matrix vir_lj, real *energy_q, real *energy_lj, real lambda_q, real lambda_lj, real *dvdlambda_q, real *dvdlambda_lj, int flags)
 Do a PME calculation on a CPU for the long range electrostatics and/or LJ. More...
 
int gmx_pmeonly (struct gmx_pme_t *pme, const t_commrec *cr, t_nrnb *mynrnb, gmx_wallcycle *wcycle, gmx_walltime_accounting_t walltime_accounting, t_inputrec *ir, PmeRunMode runMode)
 Called on the nodes that do PME exclusively.
 
void gmx_pme_calc_energy (gmx_pme_t *pme, gmx::ArrayRef< const gmx::RVec > x, gmx::ArrayRef< const real > q, real *V)
 Calculate the PME grid energy V for n charges. More...
 
void gmx_pme_send_parameters (const t_commrec *cr, const interaction_const_t *ic, gmx_bool bFreeEnergy_q, gmx_bool bFreeEnergy_lj, real *chargeA, real *chargeB, real *sqrt_c6A, real *sqrt_c6B, real *sigmaA, real *sigmaB, int maxshift_x, int maxshift_y)
 Send the charges and maxshift to out PME-only node.
 
void gmx_pme_send_coordinates (t_forcerec *fr, const t_commrec *cr, const matrix box, const rvec *x, real lambda_q, real lambda_lj, gmx_bool bEnerVir, int64_t step, bool useGpuPmePpComms, bool reinitGpuPmePpComms, bool sendCoordinatesFromGpu, GpuEventSynchronizer *coordinatesReadyOnDeviceEvent, gmx_wallcycle *wcycle)
 Send the coordinates to our PME-only node and request a PME calculation.
 
void gmx_pme_send_finish (const t_commrec *cr)
 Tell our PME-only node to finish.
 
void gmx_pme_send_resetcounters (const t_commrec *cr, int64_t step)
 Tell our PME-only node to reset all cycle and flop counters.
 
void gmx_pme_receive_f (gmx::PmePpCommGpu *pmePpCommGpu, const t_commrec *cr, gmx::ForceWithVirial *forceWithVirial, real *energy_q, real *energy_lj, real *dvdlambda_q, real *dvdlambda_lj, bool useGpuPmePpComms, bool receivePmeForceToGpu, float *pme_cycles)
 PP nodes receive the long range forces from the PME nodes.
 
void gmx_pme_reinit_atoms (gmx_pme_t *pme, int numAtoms, const real *charges)
 This function updates the local atom data on GPU after DD (charges, coordinates, etc.). TODO: it should update the PME CPU atom data as well. (currently PME CPU call gmx_pme_do() gets passed the input pointers for each computation). More...
 
bool pme_gpu_supports_build (std::string *error)
 Checks whether the GROMACS build allows to run PME on GPU. TODO: this partly duplicates an internal PME assert function pme_gpu_check_restrictions(), except that works with a formed gmx_pme_t structure. Should that one go away/work with inputrec? More...
 
bool pme_gpu_supports_hardware (const gmx_hw_info_t &hwinfo, std::string *error)
 Checks whether the detected (GPU) hardware allows to run PME on GPU. More...
 
bool pme_gpu_supports_input (const t_inputrec &ir, const gmx_mtop_t &mtop, std::string *error)
 Checks whether the input system allows to run PME on GPU. TODO: this partly duplicates an internal PME assert function pme_gpu_check_restrictions(), except that works with a formed gmx_pme_t structure. Should that one go away/work with inputrec? More...
 
PmeRunMode pme_run_mode (const gmx_pme_t *pme)
 Returns the active PME codepath (CPU, GPU, mixed). More...
 
gmx::PinningPolicy pme_get_pinning_policy ()
 Return the pinning policy appropriate for this build configuration for relevant buffers used for PME task on this rank (e.g. running on a GPU).
 
bool pme_gpu_task_enabled (const gmx_pme_t *pme)
 Tells if PME is enabled to run on GPU (not necessarily active at the moment). More...
 
int pme_gpu_get_padding_size (const gmx_pme_t *pme)
 Returns the size of the padding needed by GPU version of PME in the coordinates array. More...
 
void pme_gpu_reset_timings (const gmx_pme_t *pme)
 Resets the PME GPU timings. To be called at the reset step. More...
 
void pme_gpu_get_timings (const gmx_pme_t *pme, gmx_wallclock_gpu_pme_t *timings)
 Copies the PME GPU timings to the gmx_wallclock_gpu_pme_t structure (for log output). To be called at the run end. More...
 
void pme_gpu_prepare_computation (gmx_pme_t *pme, bool needToUpdateBox, const matrix box, gmx_wallcycle *wcycle, int flags, bool useGpuForceReduction)
 Prepares PME on GPU computation (updating the box if needed) More...
 
void pme_gpu_launch_spread (gmx_pme_t *pme, GpuEventSynchronizer *xReadyOnDevice, gmx_wallcycle *wcycle)
 Launches first stage of PME on GPU - spreading kernel. More...
 
void pme_gpu_launch_complex_transforms (gmx_pme_t *pme, gmx_wallcycle *wcycle)
 Launches middle stages of PME (FFT R2C, solving, FFT C2R) either on GPU or on CPU, depending on the run mode. More...
 
void pme_gpu_launch_gather (const gmx_pme_t *pme, gmx_wallcycle *wcycle, PmeForceOutputHandling forceTreatment)
 Launches last stage of PME on GPU - force gathering and D2H force transfer. More...
 
bool pme_gpu_try_finish_task (gmx_pme_t *pme, int flags, gmx_wallcycle *wcycle, gmx::ForceWithVirial *forceWithVirial, gmx_enerdata_t *enerd, GpuTaskCompletion completionKind)
 Attempts to complete PME GPU tasks. More...
 
void pme_gpu_wait_and_reduce (gmx_pme_t *pme, int flags, gmx_wallcycle *wcycle, gmx::ForceWithVirial *forceWithVirial, gmx_enerdata_t *enerd)
 Blocks until PME GPU tasks are completed, and gets the output forces and virial/energy (if they were to be computed). More...
 
void pme_gpu_reinit_computation (const gmx_pme_t *pme, gmx_wallcycle *wcycle)
 The PME GPU reinitialization function that is called both at the end of any PME computation and on any load balancing. More...
 
DeviceBuffer< float > pme_gpu_get_device_x (const gmx_pme_t *pme)
 Get pointer to device copy of coordinate data. More...
 
void pme_gpu_set_device_x (const gmx_pme_t *pme, DeviceBuffer< float > d_x)
 Set pointer to device copy of coordinate data. More...
 
void * pme_gpu_get_device_f (const gmx_pme_t *pme)
 Get pointer to device copy of force data. More...
 
void * pme_gpu_get_device_stream (const gmx_pme_t *pme)
 Returns the pointer to the GPU stream. More...
 
void * pme_gpu_get_device_context (const gmx_pme_t *pme)
 Returns the pointer to the GPU context. More...
 
GpuEventSynchronizerpme_gpu_get_f_ready_synchronizer (const gmx_pme_t *pme)
 Get pointer to the device synchronizer object that allows syncing on PME force calculation completion. More...
 

Macro Definition Documentation

#define GMX_PME_SPREAD   (1 << 0)

Flag values that control what gmx_pme_do() will calculate.

These can be combined with bitwise-OR if more than one thing is required.

Enumeration Type Documentation

PME gathering output forces treatment.

Enumerator
Set 

Gather simply writes into provided force buffer.

ReduceWithInput 

Gather adds its output to the buffer.

On GPU, that means additional H2D copy before the kernel launch.

enum PmeRunMode
strong

Possible PME codepaths on a rank.

Todo:
: make this enum class with gmx_pme_t C++ refactoring
Enumerator
None 

No PME task is done.

CPU 

Whole PME computation is done on CPU.

GPU 

Whole PME computation is done on GPU.

Mixed 

Mixed mode: only spread and gather run on GPU; FFT and solving are done on CPU.

Function Documentation

void gmx_pme_calc_energy ( gmx_pme_t *  pme,
gmx::ArrayRef< const gmx::RVec x,
gmx::ArrayRef< const real q,
real V 
)

Calculate the PME grid energy V for n charges.

The potential (found in pme) must have been found already with a call to gmx_pme_do() with at least GMX_PME_SPREAD and GMX_PME_SOLVE specified. Note that the charges are not spread on the grid in the pme struct. Currently does not work in parallel or with free energy.

bool gmx_pme_check_restrictions ( int  pme_order,
int  nkx,
int  nky,
int  nkz,
int  numPmeDomainsAlongX,
bool  useThreads,
bool  errorsAreFatal 
)

Check restrictions on pme_order and the PME grid nkx,nky,nkz.

With errorsAreFatal=true, an exception or fatal error is generated on violation of restrictions. With errorsAreFatal=false, false is returned on violation of restrictions. When all restrictions are obeyed, true is returned. Argument useThreads tells if any MPI rank doing PME uses more than 1 threads. If at calling useThreads is unknown, pass true for conservative checking.

The PME GPU restrictions are checked separately during pme_gpu_init().

int gmx_pme_do ( struct gmx_pme_t *  pme,
gmx::ArrayRef< const gmx::RVec coordinates,
gmx::ArrayRef< gmx::RVec forces,
real  chargeA[],
real  chargeB[],
real  c6A[],
real  c6B[],
real  sigmaA[],
real  sigmaB[],
const matrix  box,
const t_commrec *  cr,
int  maxshift_x,
int  maxshift_y,
t_nrnb *  nrnb,
gmx_wallcycle *  wcycle,
matrix  vir_q,
matrix  vir_lj,
real energy_q,
real energy_lj,
real  lambda_q,
real  lambda_lj,
real dvdlambda_q,
real dvdlambda_lj,
int  flags 
)

Do a PME calculation on a CPU for the long range electrostatics and/or LJ.

Computes the PME forces and the energy and viral, when requested, for all atoms in coordinates. Forces, when requested, are added to the buffer forces, which is allowed to contain more elements than the number of elements in coordinates. The meaning of flags is defined above, and determines which parts of the calculation are performed.

Returns
0 indicates all well, non zero is an error code.
gmx_pme_t* gmx_pme_init ( const t_commrec *  cr,
const NumPmeDomains numPmeDomains,
const t_inputrec *  ir,
gmx_bool  bFreeEnergy_q,
gmx_bool  bFreeEnergy_lj,
gmx_bool  bReproducible,
real  ewaldcoeff_q,
real  ewaldcoeff_lj,
int  nthread,
PmeRunMode  runMode,
PmeGpu pmeGpu,
const gmx_device_info_t gpuInfo,
PmeGpuProgramHandle  pmeGpuProgram,
const gmx::MDLogger mdlog 
)

Construct PME data.

Exceptions
gmx::InconsistentInputErrorif input grid sizes/PME order are inconsistent.
Returns
Pointer to newly allocated and initialized PME data.
Todo:
We should evolve something like a GpuManager that holds gmx_device_info_t * and PmeGpuProgramHandle and perhaps other related things whose lifetime can/should exceed that of a task (or perhaps task manager). See Redmine #2522.
void gmx_pme_reinit_atoms ( gmx_pme_t *  pme,
int  numAtoms,
const real charges 
)

This function updates the local atom data on GPU after DD (charges, coordinates, etc.). TODO: it should update the PME CPU atom data as well. (currently PME CPU call gmx_pme_do() gets passed the input pointers for each computation).

Parameters
[in,out]pmeThe PME structure.
[in]numAtomsThe number of particles.
[in]chargesThe pointer to the array of particle charges.
void* pme_gpu_get_device_context ( const gmx_pme_t *  pme)

Returns the pointer to the GPU context.

Parameters
[in]pmeThe PME data structure.
Returns
Pointer to GPU context object.
void* pme_gpu_get_device_f ( const gmx_pme_t *  pme)

Get pointer to device copy of force data.

Parameters
[in]pmeThe PME data structure.
Returns
Pointer to force data
void* pme_gpu_get_device_stream ( const gmx_pme_t *  pme)

Returns the pointer to the GPU stream.

Parameters
[in]pmeThe PME data structure.
Returns
Pointer to GPU stream object.
DeviceBuffer<float> pme_gpu_get_device_x ( const gmx_pme_t *  pme)

Get pointer to device copy of coordinate data.

Parameters
[in]pmeThe PME data structure.
Returns
Pointer to coordinate data
GpuEventSynchronizer* pme_gpu_get_f_ready_synchronizer ( const gmx_pme_t *  pme)

Get pointer to the device synchronizer object that allows syncing on PME force calculation completion.

Parameters
[in]pmeThe PME data structure.
Returns
Pointer to sychronizer
int pme_gpu_get_padding_size ( const gmx_pme_t *  pme)

Returns the size of the padding needed by GPU version of PME in the coordinates array.

Parameters
[in]pmeThe PME data structure.
void pme_gpu_get_timings ( const gmx_pme_t *  pme,
gmx_wallclock_gpu_pme_t timings 
)

Copies the PME GPU timings to the gmx_wallclock_gpu_pme_t structure (for log output). To be called at the run end.

Parameters
[in]pmeThe PME structure.
[in]timingsThe gmx_wallclock_gpu_pme_t structure.
void pme_gpu_launch_complex_transforms ( gmx_pme_t *  pme,
gmx_wallcycle *  wcycle 
)

Launches middle stages of PME (FFT R2C, solving, FFT C2R) either on GPU or on CPU, depending on the run mode.

Parameters
[in]pmeThe PME data structure.
[in]wcycleThe wallclock counter.
void pme_gpu_launch_gather ( const gmx_pme_t *  pme,
gmx_wallcycle *  wcycle,
PmeForceOutputHandling  forceTreatment 
)

Launches last stage of PME on GPU - force gathering and D2H force transfer.

Parameters
[in]pmeThe PME data structure.
[in]wcycleThe wallclock counter.
[in]forceTreatmentTells how data should be treated. The gathering kernel either stores the output reciprocal forces into the host array, or copies its contents to the GPU first and accumulates. The reduction is non-atomic.
void pme_gpu_launch_spread ( gmx_pme_t *  pme,
GpuEventSynchronizer xReadyOnDevice,
gmx_wallcycle *  wcycle 
)

Launches first stage of PME on GPU - spreading kernel.

Parameters
[in]pmeThe PME data structure.
[in]xReadyOnDeviceEvent synchronizer indicating that the coordinates are ready in the device memory; nullptr allowed only on separate PME ranks.
[in]wcycleThe wallclock counter.
void pme_gpu_prepare_computation ( gmx_pme_t *  pme,
bool  needToUpdateBox,
const matrix  box,
gmx_wallcycle *  wcycle,
int  flags,
bool  useGpuForceReduction 
)

Prepares PME on GPU computation (updating the box if needed)

Parameters
[in]pmeThe PME data structure.
[in]needToUpdateBoxTells if the stored unit cell parameters should be updated from box.
[in]boxThe unit cell box.
[in]wcycleThe wallclock counter.
[in]flagsThe combination of flags to affect this PME computation. The flags are the GMX_PME_ flags from pme.h.
[in]useGpuForceReductionWhether PME forces are reduced on GPU this step or should be downloaded for CPU reduction
void pme_gpu_reinit_computation ( const gmx_pme_t *  pme,
gmx_wallcycle *  wcycle 
)

The PME GPU reinitialization function that is called both at the end of any PME computation and on any load balancing.

Clears the internal grid and energy/virial buffers; it is not safe to start the PME computation without calling this. Note that unlike in the nbnxn module, the force buffer does not need clearing.

Todo:
Rename this function to clear – it clearly only does output resetting and we should be clear about what the function does..
Parameters
[in]pmeThe PME data structure.
[in]wcycleThe wallclock counter.
void pme_gpu_reset_timings ( const gmx_pme_t *  pme)

Resets the PME GPU timings. To be called at the reset step.

Parameters
[in]pmeThe PME structure.
void pme_gpu_set_device_x ( const gmx_pme_t *  pme,
DeviceBuffer< float >  d_x 
)

Set pointer to device copy of coordinate data.

Parameters
[in]pmeThe PME data structure.
[in]d_xThe pointer to the positions buffer to be set
bool pme_gpu_supports_build ( std::string *  error)

Checks whether the GROMACS build allows to run PME on GPU. TODO: this partly duplicates an internal PME assert function pme_gpu_check_restrictions(), except that works with a formed gmx_pme_t structure. Should that one go away/work with inputrec?

Parameters
[out]errorIf non-null, the error message when PME is not supported on GPU.
Returns
true if PME can run on GPU on this build, false otherwise.
bool pme_gpu_supports_hardware ( const gmx_hw_info_t &  hwinfo,
std::string *  error 
)

Checks whether the detected (GPU) hardware allows to run PME on GPU.

Parameters
[in]hwinfoInformation about the detected hardware
[out]errorIf non-null, the error message when PME is not supported on GPU.
Returns
true if PME can run on GPU on this build, false otherwise.
bool pme_gpu_supports_input ( const t_inputrec &  ir,
const gmx_mtop_t &  mtop,
std::string *  error 
)

Checks whether the input system allows to run PME on GPU. TODO: this partly duplicates an internal PME assert function pme_gpu_check_restrictions(), except that works with a formed gmx_pme_t structure. Should that one go away/work with inputrec?

Parameters
[in]irInput system.
[in]mtopComplete system topology to check if an FE simulation perturbs charges.
[out]errorIf non-null, the error message if the input is not supported on GPU.
Returns
true if PME can run on GPU with this input, false otherwise.
bool pme_gpu_task_enabled ( const gmx_pme_t *  pme)
inline

Tells if PME is enabled to run on GPU (not necessarily active at the moment).

Todo:
This is a rather static data that should be managed by the hardware assignment manager. For now, it is synonymous with the active PME codepath (in the absence of dynamic switching).
Parameters
[in]pmeThe PME data structure.
Returns
true if PME can run on GPU, false otherwise.
bool pme_gpu_try_finish_task ( gmx_pme_t *  pme,
int  flags,
gmx_wallcycle *  wcycle,
gmx::ForceWithVirial forceWithVirial,
gmx_enerdata_t *  enerd,
GpuTaskCompletion  completionKind 
)

Attempts to complete PME GPU tasks.

The completionKind argument controls whether the function blocks until all PME GPU tasks enqueued completed (as pme_gpu_wait_finish_task() does) or only checks and returns immediately if they did not. When blocking or the tasks have completed it also gets the output forces by assigning the ArrayRef to the forces pointer passed in. Virial/energy are also outputs if they were to be computed.

Parameters
[in]pmeThe PME data structure.
[in]flagsThe combination of flags to affect this PME computation. The flags are the GMX_PME_ flags from pme.h.
[in]wcycleThe wallclock counter.
[out]forceWithVirialThe output force and virial
[out]enerdThe output energies
[in]flagsThe combination of flags to affect this PME computation. The flags are the GMX_PME_ flags from pme.h.
[in]completionKindIndicates whether PME task completion should only be checked rather than waited for
Returns
True if the PME GPU tasks have completed
void pme_gpu_wait_and_reduce ( gmx_pme_t *  pme,
int  flags,
gmx_wallcycle *  wcycle,
gmx::ForceWithVirial forceWithVirial,
gmx_enerdata_t *  enerd 
)

Blocks until PME GPU tasks are completed, and gets the output forces and virial/energy (if they were to be computed).

Parameters
[in]pmeThe PME data structure.
[in]flagsThe combination of flags to affect this PME computation. The flags are the GMX_PME_ flags from pme.h.
[in]wcycleThe wallclock counter.
[out]forceWithVirialThe output force and virial
[out]enerdThe output energies
PmeRunMode pme_run_mode ( const gmx_pme_t *  pme)

Returns the active PME codepath (CPU, GPU, mixed).

Todo:
This is a rather static data that should be managed by the higher level task scheduler.
Parameters
[in]pmeThe PME data structure.
Returns
active PME codepath.