Gromacs
2016.6
#include "gromacs/gpu_utils/gpu_macros.h"
#include "gromacs/math/vectypes.h"
#include "gromacs/mdlib/nbnxn_gpu_types.h"
#include "gromacs/utility/basedefinitions.h"
#include "gromacs/utility/real.h"
Declares the interface for GPU execution of the NBNXN module.
Functions
void nbnxn_gpu_launch_kernel (gmx_nbnxn_gpu_t *nb, const struct nbnxn_atomdata_t *nbdata, int flags, int iloc)
    Launch asynchronously the nonbonded force calculations.
void nbnxn_gpu_launch_cpyback (gmx_nbnxn_gpu_t *nb, const struct nbnxn_atomdata_t *nbatom, int flags, int aloc)
    Launch asynchronously the download of nonbonded forces from the GPU (and energies/shift forces if required).
void nbnxn_gpu_wait_for_gpu (gmx_nbnxn_gpu_t *nb, int flags, int aloc, real *e_lj, real *e_el, rvec *fshift)
    Wait for the asynchronously launched nonbonded calculations and data transfers to finish.
int nbnxn_gpu_pick_ewald_kernel_type (bool bTwinCut)
    Selects the Ewald kernel type, analytical or tabulated, single or twin cut-off.
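The three launch/wait functions above suggest a call order of launch kernel, launch download, then wait. The following is a minimal sketch of that pattern only, not GROMACS code: `FakeGpu` and the functions in namespace `sketch` are hypothetical stand-ins that mimic the asynchronous contract with a plain in-order work queue on the host, and the "force" computed is a toy value.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

namespace sketch {

// Hypothetical stand-in for the opaque GPU data structure (not a GROMACS type).
struct FakeGpu {
    std::vector<std::function<void()>> queue;  // in-order queue of pending work
    std::vector<float> x;                      // input coordinates
    std::vector<float> f_device;               // forces on the "device"
    std::vector<float> f_host;                 // forces after the download
};

// Mirrors the role of nbnxn_gpu_launch_kernel: enqueue the work and return at once.
void launch_kernel(FakeGpu& nb)
{
    nb.queue.push_back([&nb] {
        nb.f_device.resize(nb.x.size());
        for (std::size_t i = 0; i < nb.x.size(); ++i) {
            nb.f_device[i] = -2.0f * nb.x[i];  // toy force, not a real potential
        }
    });
}

// Mirrors the role of nbnxn_gpu_launch_cpyback: enqueue the download behind the kernel.
void launch_cpyback(FakeGpu& nb)
{
    nb.queue.push_back([&nb] { nb.f_host = nb.f_device; });
}

// Mirrors the role of nbnxn_gpu_wait_for_gpu: block until all queued work is done.
void wait_for_gpu(FakeGpu& nb)
{
    for (auto& op : nb.queue) {
        op();
    }
    nb.queue.clear();
}

} // namespace sketch
```

In this sketch the host buffer `f_host` stays empty after the two launch calls and is only filled once `wait_for_gpu` returns, which is the property a caller of an asynchronous interface of this kind has to respect before reading forces.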
void nbnxn_gpu_launch_kernel (gmx_nbnxn_ocl_t *nb, const struct nbnxn_atomdata_t *nbatom, int flags, int iloc)
Launch asynchronously the nonbonded force calculations.
This consists of the following (async) steps launched:
As we execute the nonbonded workload in separate queues, before launching the kernel we need to make sure that the following operations have completed:
These operations are issued in the local queue at the beginning of the step and therefore always complete before the local kernel launch. The non-local kernel is launched after the local one on the same device/context, so it is inherently scheduled after the operations in the local stream (including the above "misc_ops"). However, to keep the implementation future-proof, we use the misc_ops_done event to record the point in time when the above operations have finished, and synchronize with this event in the non-local stream.
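The event-based ordering described above can be illustrated with a small host-only simulation. This is a sketch under stated assumptions, not the OpenCL implementation: `Op`, `Queue`, `drain` and `run_step` are hypothetical names, each queue executes its operations strictly in order, and only the event name `misc_ops_done` is taken from the text.

```cpp
#include <cassert>
#include <deque>
#include <set>
#include <string>
#include <vector>

// One entry in a simulated in-order queue (hypothetical, for illustration only).
struct Op {
    enum Kind { Work, Record, Wait };
    Kind        kind;
    std::string name;  // label of the work item, or name of the event
};

using Queue = std::deque<Op>;

// Round-robin over the queues; a queue whose head is a Wait on an event that
// has not been recorded yet makes no progress in that round.
std::vector<std::string> drain(std::vector<Queue> queues)
{
    std::set<std::string>    signalled;  // events recorded so far
    std::vector<std::string> trace;      // order in which work items ran
    bool progressed = true;
    while (progressed) {
        progressed = false;
        for (Queue& q : queues) {
            if (q.empty()) {
                continue;
            }
            const Op op = q.front();
            if (op.kind == Op::Wait && signalled.count(op.name) == 0) {
                continue;  // blocked until the event is recorded
            }
            q.pop_front();
            if (op.kind == Op::Record) {
                signalled.insert(op.name);
            }
            if (op.kind == Op::Work) {
                trace.push_back(op.name);
            }
            progressed = true;
        }
    }
    return trace;
}

std::vector<std::string> run_step()
{
    Queue local = {{Op::Work, "local: misc_ops"},
                   {Op::Record, "misc_ops_done"},
                   {Op::Work, "local: kernel"}};
    Queue nonlocal = {{Op::Wait, "misc_ops_done"},
                      {Op::Work, "non-local: kernel"}};
    // The non-local queue is polled first to show that the wait holds it back.
    return drain({nonlocal, local});
}
```

Even though the non-local queue is polled first in `run_step`, its kernel cannot run before the local queue has recorded `misc_ops_done`. The ordering therefore holds regardless of how the two queues happen to be scheduled, which is the point of synchronizing on the event rather than relying on launch order.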