Gromacs 2024.3
PmeGpu Struct Reference
#include <gromacs/ewald/pme_gpu_types_host.h>
The main PME GPU host structure, included in the PME CPU structure by pointer.
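What "included by pointer" means can be pictured with a minimal sketch; the CPU-side type name below is hypothetical and stands in for the actual PME CPU structure, since this page only states the by-pointer relationship.

    // Minimal sketch only: the CPU-side type is illustrative; what this page
    // states is just that the PME CPU structure holds PmeGpu by pointer.
    struct PmeGpu; // the structure documented on this page

    struct PmeCpuSketch
    {
        // ... CPU-side PME fields ...
        PmeGpu* gpu = nullptr; // non-null only when PME offload to a GPU is active
    };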
Public Attributes

std::shared_ptr<PmeShared> common
    The information copied once per reinit from the CPU structure.

const PmeGpuProgram* programHandle_
    A handle to the program created by buildPmeGpuProgram().

std::unique_ptr<gmx::ClfftInitializer> initializedClfftLibrary_
    Handle that ensures the clFFT library has been initialized once per process.

PmeGpuSettings settings
    The settings.

PmeGpuStaging staging
    The host-side buffers. The device-side buffers are buried in kernelParams, but that will have to change.

int nAtomsAlloc
    Number of local atoms, padded to be divisible by c_pmeAtomDataAlignment (detailed below).
std::intmax_t maxGridWidthX
    Kernel scheduling grid width limit in X, derived from the device-info compute capability in CUDA. Declared as a very large integer so it remains useful in computations with type promotion and avoids overflows. OpenCL does not expose a readily available global work-size limit, so a large arbitrary constant is assigned instead. TODO: this should be in PmeGpuProgram(Impl). (A launch-grid sketch follows this list.)

int minParticleCountToRecalculateSplines = 23000
    Minimum particle count to prefer recalculating splines (detailed below).
std::shared_ptr<PmeGpuKernelParams> kernelParams
    A single structure encompassing all the PME data used on GPU. Its value is the only argument to all the PME GPU kernels (detailed below).

std::shared_ptr<PmeGpuSpecific> archSpecific
    The pointer to GPU-framework-specific host-side data, such as CUDA streams and events.

std::unique_ptr<PmeGpuHaloExchange> haloExchange
    The pointer to PME halo-exchange-specific host-side data.

bool useNvshmem = false

std::unique_ptr<PmeNvshmemHost> nvshmemParams
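For maxGridWidthX, here is a hedged sketch of how a launch-configuration helper could respect such a limit; the helper name and the spill-into-Y strategy are illustrative assumptions, not the actual GROMACS scheduling code. Keeping the limit as std::intmax_t means the comparison and division below never overflow even for large block counts.

    #include <cstdint>

    struct LaunchGridSketch
    {
        int x, y, z; // number of blocks in each dimension
    };

    // Split numBlocks so the X extent never exceeds maxGridWidthX; any excess
    // spills into Y, and surplus blocks are expected to exit early in the kernel.
    inline LaunchGridSketch makeLaunchGrid(std::intmax_t numBlocks, std::intmax_t maxGridWidthX)
    {
        if (numBlocks <= maxGridWidthX)
        {
            return { static_cast<int>(numBlocks), 1, 1 };
        }
        const std::intmax_t blocksY = (numBlocks + maxGridWidthX - 1) / maxGridWidthX;
        return { static_cast<int>(maxGridWidthX), static_cast<int>(blocksY), 1 };
    }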
Member Data Documentation

std::shared_ptr<PmeGpuKernelParams> PmeGpu::kernelParams
A single structure encompassing all the PME data used on GPU. Its value is the only argument to all the PME GPU kernels.
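A hedged sketch of this single-argument pattern follows; the member names and the kernel are illustrative, not the real PmeGpuKernelParams layout or an actual PME kernel.

    // Illustrative stand-in for PmeGpuKernelParams: all device pointers and
    // scalar parameters gathered into one trivially copyable aggregate.
    struct KernelParamsSketch
    {
        const float* d_coefficients; // device pointer, set up on the host
        float*       d_grid;         // device pointer, set up on the host
        int          nAtoms;         // actual atom count used by the kernels
    };

    // Each kernel takes the whole structure by value as its only argument, so
    // launches look the same for every PME kernel.
    __global__ void pmeKernelSketch(const KernelParamsSketch params)
    {
        const int atomIndex = blockIdx.x * blockDim.x + threadIdx.x;
        if (atomIndex < params.nAtoms)
        {
            // All device data is reached through params (real PME kernels
            // spread or gather charges here).
        }
    }

    // A launch then only marshals that one argument, e.g.
    //     pmeKernelSketch<<<numBlocks, threadsPerBlock>>>(paramsOnHost);

Passing everything through one aggregate keeps kernel signatures stable across back-ends: adding a field never changes how a kernel is launched.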
int PmeGpu::minParticleCountToRecalculateSplines = 23000
Minimum particle count to prefer recalculating splines.
The gather kernel can either recalculate the splines or load those saved during the spline (and spread) kernel. Recalculating is advantageous when there are enough particles; when doing so, it is best to use fewer threads per atom in the spline and spread kernels.
This feature is supported by CUDA and SYCL. Spread pipelining requires spline recalculation.
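A minimal sketch of the threshold check this member implies; the helper name and the simplified struct are hypothetical, and the real decision also depends on the GPU back-end and on spread pipelining, as noted above.

    // Hypothetical helper: recalculate splines in the gather kernel only when
    // the local particle count makes recalculation worthwhile.
    struct PmeGpuSketch
    {
        int nAtoms                               = 0;     // kernelParams.atoms.nAtoms
        int minParticleCountToRecalculateSplines = 23000; // threshold documented above
    };

    inline bool shouldRecalculateSplines(const PmeGpuSketch& pme)
    {
        // Below the threshold, reload the values saved by the spline/spread
        // kernel; above it, recalculating (with fewer threads per atom in
        // spread) is the better trade-off.
        return pme.nAtoms >= pme.minParticleCountToRecalculateSplines;
    }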
int PmeGpu::nAtomsAlloc
Number of local atoms, padded to be divisible by c_pmeAtomDataAlignment.
Used only as the basic size for almost all of the atom data allocations (spline parameter data is additionally aligned by PME_SPREADGATHER_PARTICLES_PER_WARP); kernelParams.atoms.nAtoms is the actual atom count to be used for most data copying.
TODO: memory allocation/padding properties should be handled by something like a container
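A hedged sketch of the padding rule described above; the alignment value is a placeholder, since only the rounding to a multiple of c_pmeAtomDataAlignment is stated on this page.

    // Placeholder value; the real c_pmeAtomDataAlignment is defined elsewhere
    // in the PME GPU code.
    constexpr int c_pmeAtomDataAlignmentSketch = 64;

    // nAtomsAlloc = nAtoms rounded up to the nearest multiple of the alignment.
    inline int computeAtomsAlloc(int nAtoms)
    {
        const int a = c_pmeAtomDataAlignmentSketch;
        return ((nAtoms + a - 1) / a) * a;
    }

For example, 1000 local atoms with an alignment of 64 would give nAtomsAlloc == 1024, while kernelParams.atoms.nAtoms stays 1000 for data copies.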