Gromacs  2024.4
pme_gpu_program_impl_ocl.cpp File Reference
#include "gmxpre.h"
#include "gromacs/gpu_utils/gmxopencl.h"
#include "gromacs/gpu_utils/ocl_compiler.h"
#include "gromacs/utility/stringutil.h"
#include "pme_gpu_constants.h"
#include "pme_gpu_internal.h"
#include "pme_gpu_program_impl.h"
#include "pme_gpu_types_host.h"
#include "pme_grid.h"

Description

Implements PmeGpuProgramImpl, which stores permanent PME GPU context-derived data, such as (compiled) kernel handles.

Author
Aleksei Iupinov <a.yupinov@gmail.com>

Functions

static void checkRequiredWarpSize (cl_kernel kernel, const char *kernelName, const DeviceInformation &deviceInfo)
 Ensure that spread/gather kernels have been compiled to a suitable warp size.

Function Documentation

static void checkRequiredWarpSize (cl_kernel kernel, const char *kernelName, const DeviceInformation &deviceInfo)

Ensure that spread/gather kernels have been compiled to a suitable warp size.

On Intel devices the execution width (warp size) is decided at compile time and can be smaller than order^2, the minimum that the spread/gather kernels currently require, so the compiled kernels have to be checked.

Because of the implementation constraints (one thread per atom, order = 4), order^2 threads (16 for order = 4) should be able to execute without needing synchronization.
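
The constraint above can be sketched as a plain comparison. This is a minimal illustration under stated assumptions, not the GROMACS implementation: the names c_pmeOrder, requiredWarpSize, warpSizeIsSufficient and describeWarpSizeFailure are hypothetical, and the kernel's execution width is passed in as a number. With a real cl_kernel, one plausible way to obtain that width is clGetKernelWorkGroupInfo with CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE, but whether this file does so is an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the interpolation order; the text above assumes order = 4.
constexpr int c_pmeOrder = 4;

// Minimum execution width: order^2 threads must run without synchronization.
constexpr std::size_t requiredWarpSize(int order)
{
    return static_cast<std::size_t>(order) * static_cast<std::size_t>(order);
}

// True when a kernel compiled to kernelWarpSize lanes meets the minimum.
constexpr bool warpSizeIsSufficient(std::size_t kernelWarpSize, int order)
{
    return kernelWarpSize >= requiredWarpSize(order);
}

// Builds a diagnostic for a kernel that fails the check (illustrative wording only).
std::string describeWarpSizeFailure(const char* kernelName, std::size_t kernelWarpSize, int order)
{
    return std::string("Kernel ") + kernelName + " was compiled to an execution width of "
           + std::to_string(kernelWarpSize) + ", but at least "
           + std::to_string(requiredWarpSize(order)) + " is required.";
}
```

A caller would evaluate warpSizeIsSufficient(width, c_pmeOrder) once for each spread and gather kernel after compilation and report the message from describeWarpSizeFailure if the check fails. Performing the check at program build time rather than at launch surfaces an unsuitable compile-time width before any kernel runs.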