Gromacs  2020.4
#include "gmxpre.h"
#include "config.h"
#include <list>
#include "gromacs/ewald/ewald_utils.h"
#include "gromacs/ewald/pme.h"
#include "gromacs/fft/parallel_3dfft.h"
#include "gromacs/math/invertmatrix.h"
#include "gromacs/mdlib/gmx_omp_nthreads.h"
#include "gromacs/mdtypes/enerdata.h"
#include "gromacs/mdtypes/forceoutput.h"
#include "gromacs/mdtypes/inputrec.h"
#include "gromacs/utility/exceptions.h"
#include "gromacs/utility/fatalerror.h"
#include "gromacs/utility/gmxassert.h"
#include "gromacs/utility/stringutil.h"
#include "pme_gpu_internal.h"
#include "pme_grid.h"
#include "pme_internal.h"
#include "pme_solve.h"
[Include dependency graph for pme_gpu.cpp]

Description

Implements high-level PME GPU functions which do not require GPU framework-specific code.

Author
Aleksei Iupinov <a.yupinov@gmail.com>

Functions

void pme_gpu_reset_timings (const gmx_pme_t *pme)
 Resets the PME GPU timings. To be called at the reset step.

void pme_gpu_get_timings (const gmx_pme_t *pme, gmx_wallclock_gpu_pme_t *timings)
 Copies the PME GPU timings to the gmx_wallclock_gpu_pme_t structure (for log output). To be called at the run end.

int pme_gpu_get_padding_size (const gmx_pme_t *pme)
 Returns the size of the padding needed by the GPU version of PME in the coordinates array.

void parallel_3dfft_execute_gpu_wrapper (gmx_pme_t *pme, const int gridIndex, enum gmx_fft_direction dir, gmx_wallcycle_t wcycle)
 A convenience wrapper for launching either the GPU or CPU FFT.

void pme_gpu_prepare_computation (gmx_pme_t *pme, bool needToUpdateBox, const matrix box, gmx_wallcycle *wcycle, int flags, bool useGpuForceReduction)
 Prepares PME on GPU computation (updating the box if needed).

void pme_gpu_launch_spread (gmx_pme_t *pme, GpuEventSynchronizer *xReadyOnDevice, gmx_wallcycle *wcycle)
 Launches first stage of PME on GPU - spreading kernel.

void pme_gpu_launch_complex_transforms (gmx_pme_t *pme, gmx_wallcycle *wcycle)
 Launches middle stages of PME (FFT R2C, solving, FFT C2R) either on GPU or on CPU, depending on the run mode.

void pme_gpu_launch_gather (const gmx_pme_t *pme, gmx_wallcycle *wcycle, PmeForceOutputHandling forceTreatment)
 Launches last stage of PME on GPU - force gathering and D2H force transfer.

static void sum_forces (gmx::ArrayRef< gmx::RVec > f, gmx::ArrayRef< const gmx::RVec > forceToAdd)
 Accumulates forceToAdd into f, using the available threads.

static void pme_gpu_reduce_outputs (const int flags, const PmeOutput &output, gmx_wallcycle *wcycle, gmx::ForceWithVirial *forceWithVirial, gmx_enerdata_t *enerd)
 Reduces quantities from output to forceWithVirial and enerd.

bool pme_gpu_try_finish_task (gmx_pme_t *pme, const int flags, gmx_wallcycle *wcycle, gmx::ForceWithVirial *forceWithVirial, gmx_enerdata_t *enerd, GpuTaskCompletion completionKind)
 Attempts to complete PME GPU tasks.

PmeOutput pme_gpu_wait_finish_task (gmx_pme_t *pme, const int flags, gmx_wallcycle *wcycle)
 Blocks until PME GPU tasks are completed, and gets the output forces and virial/energy (if they were to be computed).

void pme_gpu_wait_and_reduce (gmx_pme_t *pme, const int flags, gmx_wallcycle *wcycle, gmx::ForceWithVirial *forceWithVirial, gmx_enerdata_t *enerd)
 Blocks until PME GPU tasks are completed, and gets the output forces and virial/energy (if they were to be computed).

void pme_gpu_reinit_computation (const gmx_pme_t *pme, gmx_wallcycle *wcycle)
 The PME GPU reinitialization function that is called both at the end of any PME computation and on any load balancing.

DeviceBuffer< float > pme_gpu_get_device_x (const gmx_pme_t *pme)
 Get pointer to device copy of coordinate data.

void * pme_gpu_get_device_f (const gmx_pme_t *pme)
 Get pointer to device copy of force data.

void pme_gpu_set_device_x (const gmx_pme_t *pme, DeviceBuffer< float > d_x)
 Set pointer to device copy of coordinate data.

void * pme_gpu_get_device_stream (const gmx_pme_t *pme)
 Returns the pointer to the GPU stream.

void * pme_gpu_get_device_context (const gmx_pme_t *pme)
 Returns the pointer to the GPU context.

GpuEventSynchronizer * pme_gpu_get_f_ready_synchronizer (const gmx_pme_t *pme)
 Get pointer to the device synchronizer object that allows syncing on PME force calculation completion.
 

Function Documentation

void parallel_3dfft_execute_gpu_wrapper ( gmx_pme_t *  pme,
const int  gridIndex,
enum gmx_fft_direction  dir,
gmx_wallcycle_t  wcycle 
)
inline

A convenience wrapper for launching either the GPU or CPU FFT.

Parameters
[in]  pme        The PME structure.
[in]  gridIndex  The grid index; currently this should always be 0.
[in]  dir        The FFT direction enum.
[in]  wcycle     The wallclock counter.

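As a rough illustration of what such a wrapper does, the sketch below sends a single call site to one of two FFT back ends depending on a run-mode switch. Every name in it (MockPme, performGpuFft, FftDirection, executeFft) is a hypothetical stand-in, not a GROMACS type or function, and the mock back ends only record which path was taken.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; these are not GROMACS types.
enum class FftDirection { RealToComplex, ComplexToReal };

struct MockPme
{
    bool        performGpuFft = false; // assumed run-mode switch
    std::string lastCall;              // records which back end ran
};

// Mock back ends: they only note the path taken instead of transforming data.
inline void gpuFft(MockPme& pme, FftDirection dir)
{
    pme.lastCall = (dir == FftDirection::RealToComplex) ? "gpu:r2c" : "gpu:c2r";
}

inline void cpuFft(MockPme& pme, FftDirection dir)
{
    pme.lastCall = (dir == FftDirection::RealToComplex) ? "cpu:r2c" : "cpu:c2r";
}

// Convenience wrapper: one call site, two possible back ends.
inline void executeFft(MockPme& pme, int gridIndex, FftDirection dir)
{
    assert(gridIndex == 0); // the parameter notes say the index is currently always 0
    if (pme.performGpuFft)
    {
        gpuFft(pme, dir);
    }
    else
    {
        cpuFft(pme, dir);
    }
}
```

The caller does not need to know which back end is active, which is the convenience the wrapper provides.
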
void* pme_gpu_get_device_context ( const gmx_pme_t *  pme)

Returns the pointer to the GPU context.

Parameters
[in]  pme  The PME data structure.
Returns
Pointer to GPU context object.

void* pme_gpu_get_device_f ( const gmx_pme_t *  pme)

Get pointer to device copy of force data.

Parameters
[in]  pme  The PME data structure.
Returns
Pointer to force data.

void* pme_gpu_get_device_stream ( const gmx_pme_t *  pme)

Returns the pointer to the GPU stream.

Parameters
[in]  pme  The PME data structure.
Returns
Pointer to GPU stream object.

DeviceBuffer<float> pme_gpu_get_device_x ( const gmx_pme_t *  pme)

Get pointer to device copy of coordinate data.

Parameters
[in]  pme  The PME data structure.
Returns
Pointer to coordinate data.

GpuEventSynchronizer* pme_gpu_get_f_ready_synchronizer ( const gmx_pme_t *  pme)

Get pointer to the device synchronizer object that allows syncing on PME force calculation completion.

Parameters
[in]  pme  The PME data structure.
Returns
Pointer to synchronizer.

int pme_gpu_get_padding_size ( const gmx_pme_t *  pme)

Returns the size of the padding needed by GPU version of PME in the coordinates array.

Parameters
[in]  pme  The PME data structure.

void pme_gpu_get_timings ( const gmx_pme_t *  pme,
gmx_wallclock_gpu_pme_t *  timings
)

Copies the PME GPU timings to the gmx_wallclock_gpu_pme_t structure (for log output). To be called at the run end.

Parameters
[in]   pme      The PME structure.
[out]  timings  The gmx_wallclock_gpu_pme_t structure that receives the timings.

void pme_gpu_launch_complex_transforms ( gmx_pme_t *  pme,
gmx_wallcycle *  wcycle 
)

Launches middle stages of PME (FFT R2C, solving, FFT C2R) either on GPU or on CPU, depending on the run mode.

Parameters
[in]  pme     The PME data structure.
[in]  wcycle  The wallclock counter.

void pme_gpu_launch_gather ( const gmx_pme_t *  pme,
gmx_wallcycle *  wcycle,
PmeForceOutputHandling  forceTreatment 
)

Launches last stage of PME on GPU - force gathering and D2H force transfer.

Parameters
[in]  pme             The PME data structure.
[in]  wcycle          The wallclock counter.
[in]  forceTreatment  Specifies how the force output is handled. The gathering kernel either stores the output reciprocal forces into the host array, or first copies the host array's contents to the GPU and accumulates into them. The reduction is non-atomic.

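The two behaviours described for forceTreatment can be sketched on the host as "overwrite" versus "accumulate". This is only an illustration under assumed names: ForceHandling, its two enumerators, Vec3, and gatherForces are hypothetical, and the real gather runs as a GPU kernel rather than a CPU loop.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec3 = std::array<float, 3>;

// Hypothetical two-way switch mirroring the behaviour described above.
enum class ForceHandling { Set, ReduceWithInput };

// Writes the gathered forces into hostForces, either overwriting the
// existing values (Set) or adding to them (ReduceWithInput).
inline void gatherForces(const std::vector<Vec3>& gathered,
                         std::vector<Vec3>&       hostForces,
                         ForceHandling            handling)
{
    assert(gathered.size() == hostForces.size());
    for (std::size_t i = 0; i < gathered.size(); ++i)
    {
        for (std::size_t d = 0; d < 3; ++d)
        {
            if (handling == ForceHandling::Set)
            {
                hostForces[i][d] = gathered[i][d];
            }
            else
            {
                hostForces[i][d] += gathered[i][d];
            }
        }
    }
}
```

With Set, whatever was in the output array beforehand is discarded; with ReduceWithInput, the previous contents contribute to the result, so the array must hold valid forces before the call.
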
void pme_gpu_launch_spread ( gmx_pme_t *  pme,
GpuEventSynchronizer *  xReadyOnDevice,
gmx_wallcycle *  wcycle 
)

Launches first stage of PME on GPU - spreading kernel.

Parameters
[in]  pme             The PME data structure.
[in]  xReadyOnDevice  Event synchronizer indicating that the coordinates are ready in the device memory; nullptr is allowed only on separate PME ranks.
[in]  wcycle          The wallclock counter.

void pme_gpu_prepare_computation ( gmx_pme_t *  pme,
bool  needToUpdateBox,
const matrix  box,
gmx_wallcycle *  wcycle,
int  flags,
bool  useGpuForceReduction 
)

Prepares PME on GPU computation (updating the box if needed).

Parameters
[in]  pme                   The PME data structure.
[in]  needToUpdateBox       Tells if the stored unit cell parameters should be updated from box.
[in]  box                   The unit cell box.
[in]  wcycle                The wallclock counter.
[in]  flags                 The combination of flags to affect this PME computation. The flags are the GMX_PME_ flags from pme.h.
[in]  useGpuForceReduction  Whether PME forces are reduced on GPU this step or should be downloaded for CPU reduction.

void pme_gpu_reinit_computation ( const gmx_pme_t *  pme,
gmx_wallcycle *  wcycle 
)

The PME GPU reinitialization function that is called both at the end of any PME computation and on any load balancing.

Clears the internal grid and energy/virial buffers; it is not safe to start the PME computation without calling this. Note that unlike in the nbnxn module, the force buffer does not need clearing.

Todo:
Rename this function to clear: it only resets outputs, and the name should make clear what the function does.
Parameters
[in]  pme     The PME data structure.
[in]  wcycle  The wallclock counter.

void pme_gpu_reset_timings ( const gmx_pme_t *  pme)

Resets the PME GPU timings. To be called at the reset step.

Parameters
[in]  pme  The PME structure.

void pme_gpu_set_device_x ( const gmx_pme_t *  pme,
DeviceBuffer< float >  d_x 
)

Set pointer to device copy of coordinate data.

Parameters
[in]  pme  The PME data structure.
[in]  d_x  The pointer to the positions buffer to be set.

bool pme_gpu_try_finish_task ( gmx_pme_t *  pme,
int  flags,
gmx_wallcycle *  wcycle,
gmx::ForceWithVirial *  forceWithVirial,
gmx_enerdata_t *  enerd,
GpuTaskCompletion  completionKind 
)

Attempts to complete PME GPU tasks.

The completionKind argument controls whether the function blocks until all enqueued PME GPU tasks have completed (as pme_gpu_wait_finish_task() does) or only checks and returns immediately if they have not. When blocking, or when the tasks have already completed, it also gets the output forces by assigning the ArrayRef to the forces pointer passed in. Virial and energy are also output if they were to be computed.

Parameters
[in]   pme              The PME data structure.
[in]   flags            The combination of flags to affect this PME computation. The flags are the GMX_PME_ flags from pme.h.
[in]   wcycle           The wallclock counter.
[out]  forceWithVirial  The output force and virial.
[out]  enerd            The output energies.
[in]   completionKind   Indicates whether PME task completion should only be checked rather than waited for.
Returns
True if the PME GPU tasks have completed.

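The check-versus-wait behaviour can be sketched with a mock stream. The sketch assumes hypothetical names throughout (CompletionKind, MockGpuStream, MockOutput, tryFinishTask); the "blocking" synchronize here simply drains a counter, whereas a real implementation would wait on the GPU stream.

```cpp
#include <cassert>

// Hypothetical completion mode, modelled on the description above.
enum class CompletionKind { Wait, Check };

// Mock stream: pendingTasks stands in for work still queued on the GPU.
struct MockGpuStream
{
    int  pendingTasks = 0;
    bool isIdle() const { return pendingTasks == 0; }
    void synchronize() { pendingTasks = 0; } // stands in for a blocking stream sync
};

struct MockOutput
{
    bool reduced = false; // set once forces/energies have been collected
};

// Returns true if the tasks are complete. In Check mode an unfinished stream
// causes an immediate return of false with the outputs left untouched.
inline bool tryFinishTask(MockGpuStream& stream, MockOutput& out, CompletionKind kind)
{
    if (kind == CompletionKind::Check && !stream.isIdle())
    {
        return false;
    }
    stream.synchronize(); // a no-op when the stream is already idle
    out.reduced = true;   // outputs are collected only after completion
    return true;
}
```

A caller polling in Check mode can do other CPU work between calls and fall back to Wait mode once nothing else is left to overlap.
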
void pme_gpu_wait_and_reduce ( gmx_pme_t *  pme,
int  flags,
gmx_wallcycle *  wcycle,
gmx::ForceWithVirial *  forceWithVirial,
gmx_enerdata_t *  enerd 
)

Blocks until PME GPU tasks are completed, and gets the output forces and virial/energy (if they were to be computed).

Parameters
[in]   pme              The PME data structure.
[in]   flags            The combination of flags to affect this PME computation. The flags are the GMX_PME_ flags from pme.h.
[in]   wcycle           The wallclock counter.
[out]  forceWithVirial  The output force and virial.
[out]  enerd            The output energies.

PmeOutput pme_gpu_wait_finish_task ( gmx_pme_t *  pme,
int  flags,
gmx_wallcycle *  wcycle 
)

Blocks until PME GPU tasks are completed, and gets the output forces and virial/energy (if they were to be computed).

Parameters
[in]  pme     The PME data structure.
[in]  flags   The combination of flags to affect this PME computation. The flags are the GMX_PME_ flags from pme.h.
[in]  wcycle  The wallclock counter.
Returns
The output forces, energy and virial.
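
Taken together, the descriptions on this page suggest an order of calls within one PME step: prepare, spread (first stage), complex transforms (middle stages), gather (last stage), wait for completion, then reinitialize. The sketch below records that order with stub functions. The ordering is inferred from the stage descriptions rather than stated explicitly in this file, and all names (MockPmeGpu, runOnePmeStep, and the stubs) are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical driver: each stub only logs the name of its stage.
struct MockPmeGpu
{
    std::vector<std::string> log;
};

inline void prepareComputation(MockPmeGpu& p) { p.log.push_back("prepare"); }
inline void launchSpread(MockPmeGpu& p) { p.log.push_back("spread"); }
inline void launchComplexTransforms(MockPmeGpu& p) { p.log.push_back("fft+solve"); }
inline void launchGather(MockPmeGpu& p) { p.log.push_back("gather"); }
inline void waitAndReduce(MockPmeGpu& p) { p.log.push_back("wait+reduce"); }
inline void reinitComputation(MockPmeGpu& p) { p.log.push_back("reinit"); }

// Assumed ordering of one PME step, inferred from the stage descriptions.
inline void runOnePmeStep(MockPmeGpu& p)
{
    prepareComputation(p);      // update the box if needed
    launchSpread(p);            // first stage
    launchComplexTransforms(p); // middle stages: FFT R2C, solving, FFT C2R
    launchGather(p);            // last stage: force gathering and D2H transfer
    waitAndReduce(p);           // block until done, collect forces and energies
    reinitComputation(p);       // clear grids and energy/virial buffers for the next step
}
```

Because the reinitialization clears the grid and energy/virial buffers, the sketch places it at the end of the step so that the next step starts from clean outputs.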