GROMACS 2024.4
PmeGpuSpecific Struct Reference

#include <gromacs/ewald/pme_gpu_types_host_impl.h>


Description

The main PME CUDA/OpenCL-specific host data structure, included in the PME GPU structure by the archSpecific pointer.
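To make this ownership relation concrete, here is a minimal sketch of how the main PME GPU structure can carry this type behind its archSpecific pointer. The PmeGpu layout below is a simplified assumption for illustration, not the actual GROMACS definition:

    // Simplified, hypothetical sketch -- not the real PmeGpu layout.
    // Keeping the CUDA/OpenCL-specific host data behind a pointer lets
    // translation units that never touch GPU-API types include PmeGpu.
    #include <memory>

    struct PmeGpuSpecific; // defined in pme_gpu_types_host_impl.h

    struct PmeGpu
    {
        std::shared_ptr<PmeGpuSpecific> archSpecific; // arch-specific host data
        // ... other, GPU-API-agnostic members elided ...
    };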

Public Member Functions

 PmeGpuSpecific (const DeviceContext &deviceContext, const DeviceStream &pmeStream)
 Constructor.
 

Public Attributes

const DeviceContext & deviceContext_
 A handle to the GPU context. TODO: this is currently extracted from the implementation of pmeGpu->programHandle_, but should be a constructor parameter to PmeGpu, as well as PmeGpuProgram, managed by high-level code.
 
const DeviceStream & pmeStream_
 The GPU stream where everything related to the PME happens.
 
GpuEventSynchronizer pmeForcesReady
 Triggered after the PME force calculations have completed.
 
GpuEventSynchronizer syncSpreadGridD2H
 Triggered after the grid has been copied to the host (after the spreading stage).
 
GpuEventSynchronizer pmeGridsReadyForSpread
 Triggered after the end-of-step tasks in the PME stream are complete.
 
bool performOutOfPlaceFFT = false
 A boolean which tells whether the complex and real grids for cu/clFFT are different or the same. Currently false.
 
bool useTiming = false
 A boolean which tells whether GPU timing events are enabled. False by default; can be enabled by setting the environment variable GMX_ENABLE_GPU_TIMING. Note: timings will not be reliable when multiple GPU tasks run concurrently on the same device context, as CUDA events on multiple streams are untrustworthy.
 
std::vector< std::unique_ptr< gmx::Gpu3dFft > > fftSetup
 Vector of FFT setups.
 
gmx::EnumerationArray< PmeStage, GpuRegionTimer > timingEvents
 All the timers one might use.
 
std::set< PmeStage > activeTimers
 Indices of timingEvents actually used.
 
int localRealGridSize [DIM]
 Local FFT real-space grid data dimensions.
 
int localRealGridSizePadded [DIM]
 Local real-space grid dimensions (padded).
 
DeviceBuffer< float > d_fftRealGrid [2]
 Real grid used in FFT. If a single PME rank is used, it is the same handle as realGrid.
 
int forcesSize = 0
 The kernelParams.atoms.forces float element count (actual). This and the following (actual)/(reserved) pairs follow the grow-only reallocation pattern sketched after this list.
 
int forcesSizeAlloc = 0
 The kernelParams.atoms.forces float element count (reserved)
 
int gridlineIndicesSize = 0
 The kernelParams.atoms.gridlineIndices int element count (actual)
 
int gridlineIndicesSizeAlloc = 0
 The kernelParams.atoms.gridlineIndices int element count (reserved)
 
int splineCountActive = 0
 Number of used splines (padded to a full warp).
 
int splineDataSize = 0
 Both the kernelParams.atoms.theta and kernelParams.atoms.dtheta float element count (actual)
 
int splineDataSizeAlloc = 0
 Both the kernelParams.atoms.theta and kernelParams.atoms.dtheta float element count (reserved)
 
int coefficientsSize [2] = { 0, 0 }
 The kernelParams.atoms.coefficients float element count (actual)
 
int coefficientsCapacity [2] = { 0, 0 }
 The kernelParams.atoms.coefficients float element count (reserved)
 
int splineValuesSize [2] = { 0, 0 }
 The kernelParams.grid.splineValuesArray float element count (actual)
 
int splineValuesCapacity [2] = { 0, 0 }
 The kernelParams.grid.splineValuesArray float element count (reserved)
 
int realGridSize [2] = { 0, 0 }
 The kernelParams.grid.realGrid float element count (actual)
 
int realGridCapacity [2] = { 0, 0 }
 The kernelParams.grid.realGrid float element count (reserved)
 
int complexGridSize [2] = { 0, 0 }
 The kernelParams.grid.fourierGrid float (not float2!) element count (actual)
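
The (actual)/(reserved) counter pairs above implement a grow-only size/capacity scheme for the corresponding device buffers: the "actual" value tracks how many elements are in use, while the "reserved" value tracks the allocation, so a buffer is only reallocated when it must grow. A minimal sketch of the pattern, using plain host memory and a hypothetical ensureCapacity helper in place of GROMACS's device-buffer utilities:

    #include <algorithm>

    // Grow-only reallocation: 'size' is the element count in use (actual),
    // 'capacity' is the allocated element count (reserved).
    template<typename T>
    void ensureCapacity(T*& buffer, int requiredCount, int& size, int& capacity)
    {
        size = requiredCount;
        if (requiredCount > capacity)
        {
            capacity = std::max(requiredCount, 2 * capacity); // over-allocate
            delete[] buffer;          // contents need not be preserved here
            buffer = new T[capacity]; // real code would use device allocation
        }
    }

For example, ensureCapacity(d_forces, 3 * numAtoms, forcesSize, forcesSizeAlloc) mirrors how forcesSize and forcesSizeAlloc track the kernelParams.atoms.forces buffer (d_forces and numAtoms are assumed names for illustration).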
 

Constructor & Destructor Documentation

PmeGpuSpecific::PmeGpuSpecific ( const DeviceContext & deviceContext, const DeviceStream & pmeStream )  [inline]

Constructor.

Parameters
    [in]  deviceContext  GPU device context.
    [in]  pmeStream      GPU PME stream.
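
A minimal construction sketch. In GROMACS the context and stream come from the device management layer; here they are assumed to already exist. Since PmeGpuSpecific stores references, both arguments must outlive the object:

    // deviceContext and pmeStream are assumed to be valid objects obtained
    // from higher-level GROMACS code; PmeGpuSpecific keeps references only.
    PmeGpuSpecific archSpecific(deviceContext, pmeStream);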

Member Data Documentation

GpuEventSynchronizer PmeGpuSpecific::pmeGridsReadyForSpread

Triggered after the end-of-step tasks in the PME stream are complete.

Required only in case of GPU PME pipelining, when we launch Spread kernels in separate streams.
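
As an illustration of the ordering this event provides, a hedged sketch of the usual mark/enqueue-wait pattern of GROMACS's GpuEventSynchronizer follows; pmeStream and spreadStream are assumed names for the two pipelined streams:

    // Producer side: record the event once the end-of-step tasks have been
    // enqueued in the PME stream.
    pmeGridsReadyForSpread.markEvent(pmeStream);

    // Consumer side: make the separate spread stream wait on the event
    // without blocking the host thread.
    pmeGridsReadyForSpread.enqueueWaitEvent(spreadStream);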


The documentation for this struct was generated from the following file:
gromacs/ewald/pme_gpu_types_host_impl.h