Gromacs
2025.0-dev-20241009-5c23d5f
#include <gromacs/nbnxm/sycl/nbnxm_sycl_types.h>
Main data structure for CUDA nonbonded force calculations.
Main data structure for SYCL nonbonded force calculations.
Main data structure for OpenCL nonbonded force calculations.
Main data structure for HIP nonbonded force calculations.
Public Attributes | |
const DeviceContext * | deviceContext_ |
GPU device context. | |
bool | bUseTwoStreams = false |
true if doing both local/non-local NB work on GPU | |
bool | bNonLocalStreamDoneMarked = false |
true indicates that the nonlocal_done event was marked | |
NBAtomDataGpu * | atdat = nullptr |
atom data | |
int * | atomIndices = nullptr |
array of atom indices | |
int | atomIndicesSize = 0 |
size of atom indices | |
int | atomIndicesSize_alloc = 0 |
size of atom indices allocated in device buffer | |
int * | cxy_na = nullptr |
x buf ops num of atoms | |
int | ncxy_na = 0 |
number of elements in cxy_na | |
int | ncxy_na_alloc = 0 |
number of elements allocated in device buffer | |
int * | cxy_ind = nullptr |
x buf ops cell index mapping | |
int | ncxy_ind = 0 |
number of elements in cxy_ind | |
int | ncxy_ind_alloc = 0 |
number of elements allocated in device buffer | |
NBParamGpu * | nbparam = nullptr |
parameters required for the non-bonded calc. | |
EnumerationArray < InteractionLocality, std::unique_ptr< GpuPairlist > > | plist = { { nullptr } } |
pair-list data structures (local and non-local) | |
NBStagingData | nbst |
staging area where fshift/energies get downloaded | |
EnumerationArray < InteractionLocality, const DeviceStream * > | deviceStreams |
local and non-local GPU streams | |
EnumerationArray < InteractionLocality, bool > | haveWork = { { false } } |
True if there is work for the current domain in the respective locality. | |
bool | bDoTime = false |
True if event-based timing is enabled. | |
GpuTimers * | timers = nullptr |
CUDA event-based timers. | |
gmx_wallclock_gpu_nbnxn_t * | timings = nullptr |
Timing data. TODO: deprecate this and query timers for accumulated data instead. | |
EnumerationArray < InteractionLocality, bool > | didPairlistH2D = { { false } } |
true when a pair-list transfer has been done at this step | |
EnumerationArray < InteractionLocality, bool > | didPrune = { { false } } |
true when we did pruning on this step | |
EnumerationArray < InteractionLocality, bool > | didRollingPrune = { { false } } |
true when we did rolling pruning (at the previous step) | |
struct gmx_device_runtime_data_t * | dev_rundata = nullptr |
OpenCL runtime data (context, kernels) | |
cl_kernel | kernel_pruneonly [ePruneNR] = { nullptr } |
prune kernels; ePruneKind defines the kernel kinds | |
bool | bPrefetchLjParam = false |
true if prefetching fg i-atom LJ parameters should be used in the kernels | |
gmx::EnumerationArray < InteractionLocality, std::unique_ptr< GpuPairlist > > | plist = { nullptr } |
pair-list data structures (local and non-local) | |
DeviceBuffer< int > | atomIndices |
array of atom indices | |
DeviceBuffer< int > | cxy_na |
x buf ops num of atoms | |
DeviceBuffer< int > | cxy_ind |
x buf ops cell index mapping | |
gmx::EnumerationArray < InteractionLocality, const DeviceStream * > | deviceStreams |
local and non-local GPU queues | |
gmx::EnumerationArray < InteractionLocality, bool > | haveWork |
True if there has been local/nonlocal GPU work, either bonded or nonbonded, scheduled. | |
GpuEventSynchronizer | nonlocal_done |
Event triggered when the non-local non-bonded kernel is done (and the local transfer can proceed) | |
GpuEventSynchronizer | misc_ops_and_local_H2D_done |
Event triggered when the tasks issued in the local stream that need to precede the non-local force or buffer operation calculations are done (e.g. f buffer zeroing, local x/q H2D, buffer op initialization in the local stream that is also required by the non-local stream) | |
cl_kernel | kernel_noener_noprune_ptr [c_numElecTypes][c_numVdwTypes] = { { nullptr } } |
cl_kernel | kernel_ener_noprune_ptr [c_numElecTypes][c_numVdwTypes] = { { nullptr } } |
cl_kernel | kernel_noener_prune_ptr [c_numElecTypes][c_numVdwTypes] = { { nullptr } } |
cl_kernel | kernel_ener_prune_ptr [c_numElecTypes][c_numVdwTypes] = { { nullptr } } |
cl_kernel | kernel_memset_f = nullptr |
cl_kernel | kernel_memset_f2 = nullptr |
cl_kernel | kernel_memset_f3 = nullptr |
bool gmx::NbnxmGpu::bDoTime = false |
True if event-based timing is enabled. Always false for SYCL.
bool gmx::NbnxmGpu::bPrefetchLjParam = false |
true if prefetching fg i-atom LJ parameters should be used in the kernels
Auxiliary kernels implementing memset-like functions (kernel_memset_f, kernel_memset_f2, kernel_memset_f3).
struct gmx_device_runtime_data_t* gmx::NbnxmGpu::dev_rundata = nullptr |
OpenCL runtime data (context, kernels)
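Pointers to non-bonded kernel functions (kernel_noener_noprune_ptr, kernel_ener_noprune_ptr, kernel_noener_prune_ptr, kernel_ener_prune_ptr), organized similarly to the nb_kfunc_xxx arrays in nbnxn_ocl.cpp.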
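The four OpenCL kernel tables listed among the public attributes are each indexed by electrostatics type and VdW type, with one table per combination of energy output and pruning. The following is a minimal sketch of how such a lookup could work, not the GROMACS implementation: `KernelHandle`, `KernelTables`, `selectKernel`, and the two `...Sketch` size constants are hypothetical stand-ins for `cl_kernel`, the `NbnxmGpu` members, and `c_numElecTypes` / `c_numVdwTypes`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins: the real struct stores cl_kernel handles and takes
// its table dimensions from c_numElecTypes and c_numVdwTypes.
using KernelHandle = const char*;
constexpr std::size_t c_numElecTypesSketch = 2;
constexpr std::size_t c_numVdwTypesSketch  = 3;

using KernelTable =
        std::array<std::array<KernelHandle, c_numVdwTypesSketch>, c_numElecTypesSketch>;

// One table per (energy, prune) flavor, mirroring the four *_ptr members.
struct KernelTables
{
    KernelTable noenerNoprune{};
    KernelTable enerNoprune{};
    KernelTable noenerPrune{};
    KernelTable enerPrune{};
};

// Pick the table from the two flags, then index it by interaction types.
inline KernelHandle selectKernel(const KernelTables& tables,
                                 std::size_t         elecType,
                                 std::size_t         vdwType,
                                 bool                computeEnergies,
                                 bool                prune)
{
    assert(elecType < c_numElecTypesSketch && vdwType < c_numVdwTypesSketch);
    const KernelTable& table =
            computeEnergies ? (prune ? tables.enerPrune : tables.enerNoprune)
                            : (prune ? tables.noenerPrune : tables.noenerNoprune);
    return table[elecType][vdwType];
}
```

Entries that were never filled in stay `nullptr`, matching the `= { { nullptr } }` initializers shown for the real members.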
const DeviceContext * gmx::NbnxmGpu::deviceContext_ |
EnumerationArray< InteractionLocality, bool > gmx::NbnxmGpu::haveWork = { { false } } |
True if there is work for the current domain in the respective locality.
This includes local/nonlocal GPU work, either bonded or nonbonded, scheduled to be executed in the current domain. As long as bonded work is not split up into local/nonlocal, if there is bonded GPU work, both flags will be true.
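The rule described above (bonded work sets both flags as long as it is not split by locality) can be sketched as follows. This is an illustration under stated assumptions: `EnumArraySketch` is a hypothetical, simplified stand-in for `EnumerationArray`, the enum below only mimics `InteractionLocality`, and `computeHaveWork` with its three boolean inputs is invented for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in enum for illustration; Count is used only to size the array.
enum class InteractionLocality : int
{
    Local,
    NonLocal,
    Count
};

// Hypothetical, minimal enum-indexed array (not the GROMACS EnumerationArray).
template<typename Enum, typename T>
struct EnumArraySketch
{
    std::array<T, static_cast<std::size_t>(Enum::Count)> data{};

    T& operator[](Enum e)
    {
        assert(e != Enum::Count);
        return data[static_cast<std::size_t>(e)];
    }
    const T& operator[](Enum e) const
    {
        assert(e != Enum::Count);
        return data[static_cast<std::size_t>(e)];
    }
};

using WorkFlags = EnumArraySketch<InteractionLocality, bool>;

// Bonded work is not split into local/non-local here, so it marks both localities.
inline WorkFlags computeHaveWork(bool haveLocalNonbonded,
                                 bool haveNonLocalNonbonded,
                                 bool haveBonded)
{
    WorkFlags haveWork;
    haveWork[InteractionLocality::Local]    = haveLocalNonbonded || haveBonded;
    haveWork[InteractionLocality::NonLocal] = haveNonLocalNonbonded || haveBonded;
    return haveWork;
}
```

Indexing by the enum rather than by a raw integer keeps the local and non-local entries from being mixed up at call sites.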
NBStagingData gmx::NbnxmGpu::nbst |
staging area where fshift/energies get downloaded. Will be removed in SYCL.
GpuEventSynchronizer gmx::NbnxmGpu::nonlocal_done |
Events used for synchronization.
Event triggered when the non-local non-bonded kernel is done (and the local transfer can proceed)
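One plausible way for `nonlocal_done` to interact with the `bUseTwoStreams` and `bNonLocalStreamDoneMarked` flags is shown below: the event is marked only when a separate non-local stream is in use, and the local side waits only if the event was actually marked. This is a sketch under assumptions, not the GROMACS code path: `EventSketch` is a hypothetical counter-only stand-in for `GpuEventSynchronizer`, and both free functions are invented for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for an event synchronizer: it only counts calls and
// performs no real GPU synchronization.
struct EventSketch
{
    int marked = 0;
    int waited = 0;

    void markEvent() { ++marked; }
    void enqueueWaitEvent()
    {
        // Waiting on an event that was never marked would be a logic error.
        assert(marked > waited);
        ++waited;
    }
};

// Only the members relevant to this synchronization pattern.
struct NbSyncSketch
{
    bool        bUseTwoStreams            = false;
    bool        bNonLocalStreamDoneMarked = false;
    EventSketch nonlocal_done;
};

// Called after the non-local work has been enqueued.
inline void finishNonLocalLaunch(NbSyncSketch& nb)
{
    if (nb.bUseTwoStreams)
    {
        nb.nonlocal_done.markEvent();
        nb.bNonLocalStreamDoneMarked = true;
    }
}

// Called before the local transfer that depends on the non-local kernel.
inline void beforeLocalTransfer(NbSyncSketch& nb)
{
    if (nb.bNonLocalStreamDoneMarked)
    {
        nb.nonlocal_done.enqueueWaitEvent();
        nb.bNonLocalStreamDoneMarked = false; // consume the mark
    }
}
```

Resetting the flag after the wait means a step without non-local work does not wait on a stale event.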
GpuTimers * gmx::NbnxmGpu::timers = nullptr |
CUDA event-based timers.
Dummy timers.
OpenCL event-based timers.
HIP event-based timers.
gmx_wallclock_gpu_nbnxn_t * gmx::NbnxmGpu::timings = nullptr |
Timing data. TODO: deprecate this and query timers for accumulated data instead.
Dummy timing data.