Gromacs
2025.0-dev-20241011-013a99c
#include <gromacs/ewald/pme_gpu_types_host_impl.h>
The main PME CUDA/OpenCL-specific host data structure, referenced from the PME GPU structure through its archSpecific pointer.
Public Member Functions | |
PmeGpuSpecific (const DeviceContext &deviceContext, const DeviceStream &pmeStream) | |
Constructor. | |
Public Attributes | |
const DeviceContext & | deviceContext_ |
A handle to the device context. | |
const DeviceStream & | pmeStream_ |
The GPU stream where everything related to the PME happens. | |
GpuEventSynchronizer | pmeForcesReady |
Triggered after the PME force calculations have been completed. | |
GpuEventSynchronizer | syncSpreadGridD2H |
Triggered after the grid has been copied to the host (after the spreading stage). | |
GpuEventSynchronizer | pmeGridsReadyForSpread |
Triggered after the end-of-step tasks in the PME stream are complete. | |
bool | performOutOfPlaceFFT = false |
Whether the complex and real grids for cu/clFFT are separate buffers (out-of-place FFT) or the same buffer (in-place FFT). Currently false. | |
bool | useTiming = false |
Whether GPU timing events are enabled. False by default; can be enabled by setting the environment variable GMX_ENABLE_GPU_TIMING. Note: timings are not reliable when multiple GPU tasks run concurrently on the same device context, because CUDA events on multiple streams are untrustworthy. | |
std::vector< std::unique_ptr < gmx::Gpu3dFft > > | fftSetup |
Vector of FFT setups. | |
gmx::EnumerationArray < PmeStage, GpuRegionTimer > | timingEvents |
All the timers one might use. | |
std::set< PmeStage > | activeTimers |
Indices of timingEvents actually used. | |
int | localRealGridSize [DIM] |
Local FFT real-space grid data dimensions. | |
int | localRealGridSizePadded [DIM] |
Local real-space grid dimensions (padded). | |
DeviceBuffer< float > | d_fftRealGrid [2] |
Real grid used in the FFT. If a single PME rank is used, it is the same handle as realGrid. | |
int | forcesSize = 0 |
The kernelParams.atoms.forces float element count (actual) | |
int | forcesSizeAlloc = 0 |
The kernelParams.atoms.forces float element count (reserved) | |
int | gridlineIndicesSize = 0 |
The kernelParams.atoms.gridlineIndices int element count (actual) | |
int | gridlineIndicesSizeAlloc = 0 |
The kernelParams.atoms.gridlineIndices int element count (reserved) | |
int | splineCountActive = 0 |
Number of used splines (padded to a full warp). | |
int | splineDataSize = 0 |
The float element count of both kernelParams.atoms.theta and kernelParams.atoms.dtheta (actual) | |
int | splineDataSizeAlloc = 0 |
The float element count of both kernelParams.atoms.theta and kernelParams.atoms.dtheta (reserved) | |
int | coefficientsSize [2] = { 0, 0 } |
The kernelParams.atoms.coefficients float element count (actual) | |
int | coefficientsCapacity [2] = { 0, 0 } |
The kernelParams.atoms.coefficients float element count (reserved) | |
int | splineValuesSize [2] = { 0, 0 } |
The kernelParams.grid.splineValuesArray float element count (actual) | |
int | splineValuesCapacity [2] = { 0, 0 } |
The kernelParams.grid.splineValuesArray float element count (reserved) | |
int | realGridSize [2] = { 0, 0 } |
The kernelParams.grid.realGrid float element count (actual) | |
int | realGridCapacity [2] = { 0, 0 } |
The kernelParams.grid.realGrid float element count (reserved) | |
int | complexGridSize [2] = { 0, 0 } |
The kernelParams.grid.fourierGrid float (not float2!) element count (actual) | |
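Many of the attributes above come in pairs: an "actual" element count and a "reserved" one (for example forcesSize and forcesSizeAlloc, or realGridSize and realGridCapacity). The following is a minimal host-only sketch of how such a pair is typically used to avoid reallocating on every resize. It is an illustration, not the GROMACS implementation: SketchBuffer, reserveSketch, and the 20% over-allocation margin are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical host-side stand-in for a device buffer; not a GROMACS type.
struct SketchBuffer
{
    std::unique_ptr<float[]> data;
};

// Grows the buffer only when the requested count exceeds the reserved count.
// Returns true if a reallocation happened. Old contents are not preserved.
bool reserveSketch(SketchBuffer* buffer, int requestedCount, int* size, int* capacity)
{
    assert(requestedCount >= 0);
    bool reallocated = false;
    if (requestedCount > *capacity)
    {
        // Assumed margin: over-allocate by 20% to reduce future reallocations.
        const int newCapacity = requestedCount + requestedCount / 5;
        buffer->data = std::make_unique<float[]>(static_cast<std::size_t>(newCapacity));
        *capacity    = newCapacity;
        reallocated  = true;
    }
    // The actual count always tracks the request; the reserved count only grows.
    *size = requestedCount;
    return reallocated;
}
```

With this pattern, a request that fits within the reserved count only updates the actual count, so small fluctuations in the element count do not trigger new allocations.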
PmeGpuSpecific::PmeGpuSpecific (const DeviceContext &deviceContext, const DeviceStream &pmeStream) | inline |
Constructor.
[in] | deviceContext | GPU device context |
[in] | pmeStream | GPU PME stream. |
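Because deviceContext_ and pmeStream_ are declared as const references, the constructor binds them to the caller's objects rather than copying them. A minimal sketch of that pattern follows; SketchContext, SketchStream, and SketchGpuSpecific are hypothetical placeholders, not the GROMACS classes.

```cpp
#include <cassert>

// Hypothetical placeholder types; the real DeviceContext and DeviceStream are GROMACS classes.
struct SketchContext
{
    int id;
};
struct SketchStream
{
    int id;
};

struct SketchGpuSpecific
{
    // Binds the reference members in the initializer list; no copies are made.
    SketchGpuSpecific(const SketchContext& context, const SketchStream& stream) :
        context_(context), stream_(stream)
    {
    }
    // Reference members: the referenced objects must outlive this struct.
    const SketchContext& context_;
    const SketchStream&  stream_;
};
```

One consequence of this design is that the owner of the context and stream has to keep them alive for as long as the structure holding the references exists.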
GpuEventSynchronizer PmeGpuSpecific::pmeGridsReadyForSpread |
Triggered after the end-of-step tasks in the PME stream are complete.
Required only for GPU PME pipelining, where spread kernels are launched in separate streams.
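The ordering that such an event enforces can be illustrated with a host-only toy model: work in a separate spread stream waits until the PME stream has finished its end-of-step tasks and marked the event. Everything below (SketchEvent, SketchStream, runSketch, pipelineOrderSketch) is hypothetical and single-threaded; it models in-order streams and a cross-stream wait, and does not use the GpuEventSynchronizer API or any GPU runtime.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <vector>

// Hypothetical event: a flag that one stream sets and another stream waits on.
struct SketchEvent
{
    bool marked = false;
};

// An operation returns true once it has run, or false if it must keep waiting.
using SketchOp = std::function<bool()>;

// Hypothetical in-order stream: only the head operation may execute.
struct SketchStream
{
    std::deque<SketchOp> ops;
};

// Round-robin over the streams until no stream can make progress.
void runSketch(const std::vector<SketchStream*>& streams)
{
    bool progress = true;
    while (progress)
    {
        progress = false;
        for (SketchStream* s : streams)
        {
            if (!s->ops.empty() && s->ops.front()())
            {
                s->ops.pop_front();
                progress = true;
            }
        }
    }
}

// Returns the order in which the two logged tasks executed.
std::vector<std::string> pipelineOrderSketch()
{
    std::vector<std::string> log;
    SketchEvent              gridsReady;
    SketchStream             pmeStream;
    SketchStream             spreadStream;
    // The spread stream first waits on the event, then runs its kernel.
    spreadStream.ops.push_back([&] { return gridsReady.marked; });
    spreadStream.ops.push_back([&] { log.push_back("spread"); return true; });
    // The PME stream runs its end-of-step tasks, then marks the event.
    pmeStream.ops.push_back([&] { log.push_back("end-of-step"); return true; });
    pmeStream.ops.push_back([&] { gridsReady.marked = true; return true; });
    // The spread stream is polled first, but cannot start before the event is marked.
    runSketch({ &spreadStream, &pmeStream });
    return log;
}
```

In this model the spread task always runs after the end-of-step task, even though its stream is polled first, which is the dependency the event is there to express.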