GROMACS 2026.0-dev-20251106-2ba968f
#include <gromacs/ewald/pme_gpu_types_host_impl.h>
PmeGpuSpecific: the main PME CUDA/OpenCL-specific host data structure, included in the PME GPU structure via the archSpecific pointer.
Public Member Functions

  PmeGpuSpecific(const DeviceContext& deviceContext, const DeviceStream& pmeStream)
      Constructor.
Public Attributes

  const DeviceContext& deviceContext_
      A handle to the device context.

  const DeviceStream& pmeStream_
      The GPU stream where everything related to PME happens.

  GpuEventSynchronizer pmeForcesReady
      Triggered after the PME force calculations have completed.

  GpuEventSynchronizer syncSpreadGridD2H
      Triggered after the grid has been copied to the host (after the spreading stage).

  GpuEventSynchronizer pmeGridsReadyForSpread
      Triggered after the end-of-step tasks in the PME stream are complete.

  bool performOutOfPlaceFFT = false
      Whether the complex and real grids for cu/clFFT are distinct. Currently false.

  bool useTiming = false
      Whether GPU timing events are enabled. False by default; can be enabled by setting the environment variable GMX_ENABLE_GPU_TIMING. Note: not reliable when multiple GPU tasks run concurrently on the same device context, as CUDA events on multiple streams are untrustworthy.
  std::vector<std::unique_ptr<gmx::Gpu3dFft>> fftSetup
      Vector of FFT setups.

  gmx::EnumerationArray<PmeStage, GpuRegionTimer> timingEvents
      All the timers one might use.

  std::set<PmeStage> activeTimers
      Indices of timingEvents actually used.

  int localRealGridSize[DIM]
      Local FFT real-space grid data dimensions.

  int localRealGridSizePadded[DIM]
      Local real-space grid dimensions (padded).

  DeviceBuffer<float> d_fftRealGrid[2]
      Real grid used in the FFT. If a single PME rank is used, this is the same handle as realGrid.

  int forcesSize = 0
      The kernelParams.atoms.forces float element count (actual).

  int forcesSizeAlloc = 0
      The kernelParams.atoms.forces float element count (reserved).

  int gridlineIndicesSize = 0
      The kernelParams.atoms.gridlineIndices int element count (actual).

  int gridlineIndicesSizeAlloc = 0
      The kernelParams.atoms.gridlineIndices int element count (reserved).

  int splineCountActive = 0
      Number of used splines (padded to a full warp).

  int splineDataSize = 0
      The kernelParams.atoms.theta and kernelParams.atoms.dtheta float element count (actual).

  int splineDataSizeAlloc = 0
      The kernelParams.atoms.theta and kernelParams.atoms.dtheta float element count (reserved).

  int coefficientsSize[2] = { 0, 0 }
      The kernelParams.atoms.coefficients float element count (actual).

  int coefficientsCapacity[2] = { 0, 0 }
      The kernelParams.atoms.coefficients float element count (reserved).

  int splineValuesSize[2] = { 0, 0 }
      The kernelParams.grid.splineValuesArray float element count (actual).

  int splineValuesCapacity[2] = { 0, 0 }
      The kernelParams.grid.splineValuesArray float element count (reserved).

  int realGridSize[2] = { 0, 0 }
      The kernelParams.grid.realGrid float element count (actual).

  int realGridCapacity[2] = { 0, 0 }
      The kernelParams.grid.realGrid float element count (reserved).

  int complexGridSize[2] = { 0, 0 }
      The kernelParams.grid.fourierGrid float (not float2!) element count (actual).
|
PmeGpuSpecific() — Constructor [inline]

Parameters:
  [in]  deviceContext  GPU device context.
  [in]  pmeStream      GPU PME stream.
GpuEventSynchronizer PmeGpuSpecific::pmeGridsReadyForSpread

Triggered after the end-of-step tasks in the PME stream are complete.
Required only for GPU PME pipelining, when spread kernels are launched in separate streams.