Description

The main PME CUDA/OpenCL-specific host data structure, included in the PME GPU structure by the archSpecific pointer.

Public Member Functions
	PmeGpuSpecific (const DeviceContext &deviceContext, const DeviceStream &pmeStream)
	Constructor.

Public Attributes
const DeviceContext &	deviceContext_
	A handle to the device context.

const DeviceStream &	pmeStream_
	The GPU stream where everything related to the PME happens.

GpuEventSynchronizer	pmeForcesReady
	Triggered after the PME force calculations have been completed.

GpuEventSynchronizer	syncSpreadGridD2H
	Triggered after the grid has been copied to the host (after the spreading stage).

GpuEventSynchronizer	pmeGridsReadyForSpread
	Triggered after the end-of-step tasks in the PME stream are complete.

bool	performOutOfPlaceFFT = false
	Whether the complex and real grids for cuFFT/clFFT are separate buffers (out-of-place FFT) or the same buffer (in-place FFT). Currently false.

bool	useTiming = false
	Whether GPU timing events are enabled. False by default; can be enabled by setting the environment variable GMX_ENABLE_GPU_TIMING. Note: timings will not be reliable when multiple GPU tasks are running concurrently on the same device context, as CUDA events on multiple streams are untrustworthy.

std::vector< std::unique_ptr < gmx::Gpu3dFft > >	fftSetup
	Vector of FFT setups.

gmx::EnumerationArray < PmeStage, GpuRegionTimer >	timingEvents
	All the timers one might use.

std::set< PmeStage >	activeTimers
	Indices of timingEvents actually used.

int	localRealGridSize [DIM]
	Local FFT real-space grid data dimensions.

int	localRealGridSizePadded [DIM]
	Local real-space grid dimensions (padded).

DeviceBuffer< float >	d_fftRealGrid [2]
	Real grid used in the FFT. If a single PME rank is used, it is the same handle as realGrid.

int	forcesSize = 0
	The kernelParams.atoms.forces float element count (actual)

int	forcesSizeAlloc = 0
	The kernelParams.atoms.forces float element count (reserved)

int	gridlineIndicesSize = 0
	The kernelParams.atoms.gridlineIndices int element count (actual)

int	gridlineIndicesSizeAlloc = 0
	The kernelParams.atoms.gridlineIndices int element count (reserved)

int	splineCountActive = 0
	Number of used splines (padded to a full warp).

int	splineDataSize = 0
	The float element count (actual) of both kernelParams.atoms.theta and kernelParams.atoms.dtheta

int	splineDataSizeAlloc = 0
	The float element count (reserved) of both kernelParams.atoms.theta and kernelParams.atoms.dtheta

int	coefficientsSize [2] = { 0, 0 }
	The kernelParams.atoms.coefficients float element count (actual)

int	coefficientsCapacity [2] = { 0, 0 }
	The kernelParams.atoms.coefficients float element count (reserved)

int	splineValuesSize [2] = { 0, 0 }
	The kernelParams.grid.splineValuesArray float element count (actual)

int	splineValuesCapacity [2] = { 0, 0 }
	The kernelParams.grid.splineValuesArray float element count (reserved)

int	realGridSize [2] = { 0, 0 }
	The kernelParams.grid.realGrid float element count (actual)

int	realGridCapacity [2] = { 0, 0 }
	The kernelParams.grid.realGrid float element count (reserved)

int	complexGridSize [2] = { 0, 0 }
	The kernelParams.grid.fourierGrid float (not float2!) element count (actual)
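
Many of the attributes above come in pairs: an "actual" element count and a "reserved" element count for the same buffer. The following is a minimal host-only sketch of how such a pair could be kept in sync when a buffer is resized. It is an illustration under stated assumptions, not the GROMACS implementation: HostStandInBuffer, reallocateIfNeeded, SizeTrackingExample and the 20% over-allocation margin are all hypothetical, and a std::vector stands in for a real DeviceBuffer< float >.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a device buffer; a real DeviceBuffer<float>
// would refer to GPU memory rather than a host-side std::vector.
struct HostStandInBuffer
{
    std::vector<float> storage;
};

// Grows the buffer only when the requested element count exceeds the
// reserved count. Returns true if a reallocation happened.
// Previous contents are not preserved on reallocation in this sketch.
static bool reallocateIfNeeded(HostStandInBuffer* buffer,
                               int*               currentSize,
                               int*               currentCapacity,
                               int                requestedSize)
{
    assert(requestedSize >= 0);
    bool reallocated = false;
    if (requestedSize > *currentCapacity)
    {
        // Assumed policy: reserve 20% more than requested to reduce
        // the number of future reallocations.
        const int newCapacity = requestedSize + requestedSize / 5;
        buffer->storage.assign(static_cast<std::size_t>(newCapacity), 0.0F);
        *currentCapacity = newCapacity;
        reallocated      = true;
    }
    // The actual count always follows the request, even without reallocation.
    *currentSize = requestedSize;
    return reallocated;
}

// Hypothetical holder mirroring the forcesSize / forcesSizeAlloc pair.
struct SizeTrackingExample
{
    HostStandInBuffer forces;
    int               forcesSize      = 0; // element count (actual)
    int               forcesSizeAlloc = 0; // element count (reserved)
};
```

Keeping the reserved count separate from the actual count lets a caller shrink or slightly grow the logical size between steps without touching the allocation, which is the usual motivation for this pattern.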

Constructor & Destructor Documentation

PmeGpuSpecific::PmeGpuSpecific	(	const DeviceContext &	deviceContext,
		const DeviceStream &	pmeStream
	)

inline

Constructor.

Parameters

[in]	deviceContext	GPU device context
[in]	pmeStream	GPU PME stream.
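
Because deviceContext_ and pmeStream_ are const reference members, the struct does not own the context or the stream: both must outlive it, and the struct cannot be copy-assigned. The sketch below illustrates only this C++ behaviour. DeviceContextStandIn, DeviceStreamStandIn, PmeGpuSpecificSketch and handlesAreNonOwning are hypothetical stand-ins, not the real GROMACS types.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the real DeviceContext and DeviceStream types.
struct DeviceContextStandIn
{
    int id = 0;
};
struct DeviceStreamStandIn
{
    int id = 0;
};

// Mirrors only the constructor and the two reference members documented above.
struct PmeGpuSpecificSketch
{
    PmeGpuSpecificSketch(const DeviceContextStandIn& deviceContext,
                         const DeviceStreamStandIn&  pmeStream) :
        deviceContext_(deviceContext), pmeStream_(pmeStream)
    {
    }

    const DeviceContextStandIn& deviceContext_; // non-owning handle
    const DeviceStreamStandIn&  pmeStream_;     // non-owning handle
};

// Reference members make the struct non-copy-assignable.
static_assert(!std::is_copy_assignable<PmeGpuSpecificSketch>::value,
              "reference members prevent copy assignment");

// Checks that the members alias the caller's objects rather than copies.
static bool handlesAreNonOwning()
{
    DeviceContextStandIn context; // must outlive 'specific'
    DeviceStreamStandIn  stream;  // must outlive 'specific'
    PmeGpuSpecificSketch specific(context, stream);
    assert(&specific.deviceContext_ == &context);
    return &specific.deviceContext_ == &context && &specific.pmeStream_ == &stream;
}
```

In practice this means the caller creates the context and stream first and destroys them last.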

Member Data Documentation

GpuEventSynchronizer PmeGpuSpecific::pmeGridsReadyForSpread

Triggered after the end-of-step tasks in the PME stream are complete.

Required only for GPU PME pipelining, when spread kernels are launched in separate streams.

The documentation for this struct was generated from the following file:

src/gromacs/ewald/pme_gpu_types_host_impl.h
