Gromacs  2024.3
NbnxmGpu Struct Reference

#include <gromacs/nbnxm/sycl/nbnxm_sycl_types.h>

Description

Main data structure for CUDA, SYCL, and OpenCL nonbonded force calculations.

Public Attributes

const DeviceContext * deviceContext_
 GPU device context.
 
bool bUseTwoStreams = false
 true if doing both local/non-local NB work on GPU
 
bool bNonLocalStreamDoneMarked = false
 true indicates that the nonlocal_done event was marked
 
NBAtomDataGpu * atdat = nullptr
 atom data
 
int * atomIndices = nullptr
 array of atom indices
 
int atomIndicesSize = 0
 size of atom indices
 
int atomIndicesSize_alloc = 0
 size of atom indices allocated in device buffer
 
int * cxy_na = nullptr
 x buf ops num of atoms
 
int ncxy_na = 0
 number of elements in cxy_na
 
int ncxy_na_alloc = 0
 number of elements allocated in device buffer
 
int * cxy_ind = nullptr
 x buf ops cell index mapping
 
int ncxy_ind = 0
 number of elements in cxy_ind
 
int ncxy_ind_alloc = 0
 number of elements allocated in device buffer
 
NBParamGpu * nbparam = nullptr
 parameters required for the non-bonded calc.
 
gmx::EnumerationArray
< Nbnxm::InteractionLocality,
Nbnxm::gpu_plist * > 
plist = { { nullptr } }
 pair-list data structures (local and non-local)
 
NBStagingData nbst
 staging area where fshift/energies get downloaded
 
gmx::EnumerationArray
< Nbnxm::InteractionLocality,
const DeviceStream * > 
deviceStreams
 local and non-local GPU streams
 
gmx::EnumerationArray
< Nbnxm::InteractionLocality,
bool > 
haveWork = { { false } }
 True if there is work for the current domain in the respective locality.
 
bool bDoTime = false
 True if event-based timing is enabled.
 
Nbnxm::GpuTimers * timers = nullptr
 CUDA event-based timers.
 
gmx_wallclock_gpu_nbnxn_t * timings = nullptr
 Timing data. TODO: deprecate this and query timers for accumulated data instead.
 
struct gmx_device_runtime_data_t * dev_rundata = nullptr
 OpenCL runtime data (context, kernels)
 
cl_kernel kernel_pruneonly [ePruneNR] = { nullptr }
 prune kernels, ePruneKind defines the kernel kinds
 
bool bPrefetchLjParam = false
 true if prefetching fg i-atom LJ parameters should be used in the kernels
 
DeviceBuffer< int > atomIndices
 array of atom indices
 
DeviceBuffer< int > cxy_na
 x buf ops num of atoms
 
DeviceBuffer< int > cxy_ind
 x buf ops cell index mapping
 
gmx::EnumerationArray
< Nbnxm::InteractionLocality,
bool > 
didPairlistH2D = { { false } }
 true when a pair-list transfer has been done at this step
 
gmx::EnumerationArray
< Nbnxm::InteractionLocality,
bool > 
didPrune = { { false } }
 true when we did pruning on this step
 
gmx::EnumerationArray
< Nbnxm::InteractionLocality,
bool > 
didRollingPrune = { { false } }
 true when we did rolling pruning (at the previous step)
 
GpuEventSynchronizer nonlocal_done
 Event triggered when the non-local non-bonded kernel is done (and the local transfer can proceed)
 
GpuEventSynchronizer misc_ops_and_local_H2D_done
 Event triggered when the tasks issued in the local stream that need to precede the non-local force or buffer operation calculations are done (e.g. f buffer 0-ing, local x/q H2D, buffer op initialization in the local stream that is also required by the non-local stream)
 
cl_kernel kernel_noener_noprune_ptr [Nbnxm::c_numElecTypes][Nbnxm::c_numVdwTypes] = { { nullptr } }
 
cl_kernel kernel_ener_noprune_ptr [Nbnxm::c_numElecTypes][Nbnxm::c_numVdwTypes] = { { nullptr } }
 
cl_kernel kernel_noener_prune_ptr [Nbnxm::c_numElecTypes][Nbnxm::c_numVdwTypes] = { { nullptr } }
 
cl_kernel kernel_ener_prune_ptr [Nbnxm::c_numElecTypes][Nbnxm::c_numVdwTypes] = { { nullptr } }
 
cl_kernel kernel_memset_f = nullptr
 
cl_kernel kernel_memset_f2 = nullptr
 
cl_kernel kernel_memset_f3 = nullptr
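
Several members above come in pairs of a current size and an allocated size (atomIndicesSize and atomIndicesSize_alloc, ncxy_na and ncxy_na_alloc, ncxy_ind and ncxy_ind_alloc). The following is a minimal host-only sketch of how such a pair is commonly used to avoid reallocating on every resize. It is not the GROMACS implementation: IndexBuffer, resizeIndexBuffer, and the 20% over-allocation margin are hypothetical, and a std::vector stands in for the device allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a device buffer plus its size bookkeeping.
struct IndexBuffer
{
    std::vector<int> storage;       // stands in for the device allocation
    int              size      = 0; // number of elements in use
    int              sizeAlloc = 0; // number of elements allocated
};

// Grow-only resize: reallocate only when the request exceeds the allocation.
// Returns true if a reallocation happened. Contents are not preserved.
bool resizeIndexBuffer(IndexBuffer* buf, int newSize)
{
    assert(buf != nullptr && newSize >= 0);
    bool reallocated = false;
    if (newSize > buf->sizeAlloc)
    {
        // Assumption: over-allocate by 20% so small growth does not reallocate.
        buf->sizeAlloc = newSize + newSize / 5;
        buf->storage.assign(static_cast<std::size_t>(buf->sizeAlloc), 0);
        reallocated = true;
    }
    buf->size = newSize;
    return reallocated;
}
```

With this scheme, a first request for 100 elements allocates 120, a later request for 110 reuses the allocation, and only a request above 120 allocates again.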
 

Member Data Documentation

bool NbnxmGpu::bDoTime = false

True if event-based timing is enabled. Always false for SYCL.

bool NbnxmGpu::bPrefetchLjParam = false

true if prefetching fg i-atom LJ parameters should be used in the kernels

The kernel_memset_f, kernel_memset_f2, and kernel_memset_f3 members are auxiliary kernels implementing memset-like functions.

struct gmx_device_runtime_data_t* NbnxmGpu::dev_rundata = nullptr

OpenCL runtime data (context, kernels)

Pointers to non-bonded kernel functions, organized similarly to the nb_kfunc_xxx arrays in nbnxn_ocl.cpp.
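
The four kernel pointer tables (kernel_noener_noprune_ptr, kernel_ener_noprune_ptr, kernel_noener_prune_ptr, kernel_ener_prune_ptr) are each indexed by electrostatics type and VdW type. The sketch below illustrates one way such tables could be consulted. It is an assumption-laden illustration, not the OpenCL code: the table dimensions, KernelTables, makeTables, and selectKernel are hypothetical, and strings stand in for cl_kernel handles.

```cpp
#include <cassert>
#include <string>

// Hypothetical table dimensions; the real ones are Nbnxm::c_numElecTypes
// and Nbnxm::c_numVdwTypes.
constexpr int c_numElecTypesSketch = 2;
constexpr int c_numVdwTypesSketch  = 3;

// A string stands in for a cl_kernel handle.
using KernelTable = std::string[c_numElecTypesSketch][c_numVdwTypesSketch];

struct KernelTables
{
    KernelTable noenerNoprune;
    KernelTable enerNoprune;
    KernelTable noenerPrune;
    KernelTable enerPrune;
};

// Fill one table with readable placeholder names.
void fillTable(KernelTable& table, const std::string& prefix)
{
    for (int e = 0; e < c_numElecTypesSketch; e++)
    {
        for (int v = 0; v < c_numVdwTypesSketch; v++)
        {
            table[e][v] = prefix + "[" + std::to_string(e) + "][" + std::to_string(v) + "]";
        }
    }
}

KernelTables makeTables()
{
    KernelTables t;
    fillTable(t.noenerNoprune, "noener_noprune");
    fillTable(t.enerNoprune, "ener_noprune");
    fillTable(t.noenerPrune, "noener_prune");
    fillTable(t.enerPrune, "ener_prune");
    return t;
}

// Pick the table from the energy/prune flags, then index by interaction types.
const std::string& selectKernel(const KernelTables& t, int elecType, int vdwType, bool computeEnergy, bool doPrune)
{
    assert(elecType >= 0 && elecType < c_numElecTypesSketch);
    assert(vdwType >= 0 && vdwType < c_numVdwTypesSketch);
    const KernelTable& table = computeEnergy ? (doPrune ? t.enerPrune : t.enerNoprune)
                                             : (doPrune ? t.noenerPrune : t.noenerNoprune);
    return table[elecType][vdwType];
}
```

Keeping one table per flavor means the energy and pruning decisions are made once per launch, and the per-system electrostatics and VdW choices reduce to a plain array lookup.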

const DeviceContext * NbnxmGpu::deviceContext_

GPU device context.

Todo:
Make it a constant reference once NbnxmGpu is a proper class.

gmx::EnumerationArray< Nbnxm::InteractionLocality, const DeviceStream * > NbnxmGpu::deviceStreams

Local and non-local GPU streams (queues).

gmx::EnumerationArray< Nbnxm::InteractionLocality, bool > NbnxmGpu::haveWork = { { false } }

True if there is work for the current domain in the respective locality, that is, if local/nonlocal GPU work, either bonded or nonbonded, has been scheduled.

This includes local/nonlocal GPU work, either bonded or nonbonded, scheduled to be executed in the current domain. As long as bonded work is not split up into local/nonlocal, if there is bonded GPU work, both flags will be true.
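
The rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not GROMACS code: Locality and EnumArray are simplified stand-ins for Nbnxm::InteractionLocality and gmx::EnumerationArray, and computeHaveWork with its three boolean inputs is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Simplified stand-in for Nbnxm::InteractionLocality.
enum class Locality : std::size_t
{
    Local    = 0,
    NonLocal = 1,
    Count    = 2
};

// Simplified stand-in for an array indexed by an enum class.
template<typename E, typename T>
struct EnumArray
{
    std::array<T, static_cast<std::size_t>(E::Count)> data{};

    T&       operator[](E e) { return data[static_cast<std::size_t>(e)]; }
    const T& operator[](E e) const { return data[static_cast<std::size_t>(e)]; }
};

// Bonded work is not split by locality, so it sets both flags.
EnumArray<Locality, bool> computeHaveWork(bool haveLocalNonbonded, bool haveNonLocalNonbonded, bool haveBonded)
{
    EnumArray<Locality, bool> haveWork;
    haveWork[Locality::Local]    = haveLocalNonbonded || haveBonded;
    haveWork[Locality::NonLocal] = haveNonLocalNonbonded || haveBonded;
    return haveWork;
}
```

With only local nonbonded work, just the local flag is set; as soon as any bonded GPU work exists, both flags are true regardless of the nonbonded inputs.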

NBStagingData NbnxmGpu::nbst

Staging area where fshift/energies get downloaded. Will be removed in SYCL.

GpuEventSynchronizer NbnxmGpu::nonlocal_done

Event triggered when the non-local non-bonded kernel is done (and the local transfer can proceed).

Events used for synchronization.
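
The ordering that nonlocal_done enforces can be illustrated with a host-only simulation. This is a sketch under stated assumptions and does not use the GROMACS GpuEventSynchronizer or any GPU API: MockEvent, MockStream, and runSchedule are hypothetical, each "stream" is an in-order task queue, and the two queues are advanced round-robin on a single thread.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <vector>

// Hypothetical host-only event: marked by one stream, waited on by another.
struct MockEvent
{
    bool marked = false;
};

// A task returns true when it ran, false when it must wait and be retried.
using Task = std::function<bool()>;

// In-order queue standing in for a GPU stream.
struct MockStream
{
    std::deque<Task> tasks;
};

// Runs a local and a non-local stream and returns the order of executed steps.
std::vector<std::string> runSchedule()
{
    std::vector<std::string> trace;
    MockEvent                nonlocalDone;
    MockStream               local;
    MockStream               nonLocal;

    local.tasks.push_back([&] { trace.push_back("local kernel"); return true; });
    // The local transfer may only proceed once the non-local kernel is done.
    local.tasks.push_back([&] {
        if (!nonlocalDone.marked)
        {
            return false;
        }
        trace.push_back("local transfer");
        return true;
    });

    nonLocal.tasks.push_back([&] { trace.push_back("nonlocal kernel"); return true; });
    nonLocal.tasks.push_back([&] {
        nonlocalDone.marked = true;
        trace.push_back("mark nonlocal_done");
        return true;
    });

    // Advance both streams round-robin; a blocked head task stalls only its own stream.
    while (!local.tasks.empty() || !nonLocal.tasks.empty())
    {
        for (MockStream* s : { &local, &nonLocal })
        {
            if (!s->tasks.empty() && s->tasks.front()())
            {
                s->tasks.pop_front();
            }
        }
    }
    return trace;
}
```

In this simulation the local stream stalls at its transfer task until the non-local stream has marked the event, so the transfer is always recorded after the mark.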

Nbnxm::GpuTimers * NbnxmGpu::timers = nullptr

Event-based timers for CUDA and OpenCL; dummy timers for SYCL.

gmx_wallclock_gpu_nbnxn_t * NbnxmGpu::timings = nullptr

Timing data. TODO: deprecate this and query timers for accumulated data instead.

Dummy timing data.


The documentation for this struct was generated from the following files: