Gromacs  2020.4
oclutils.h File Reference
#include <string>
#include "gromacs/gpu_utils/gmxopencl.h"
#include "gromacs/gpu_utils/gputraits_ocl.h"
#include "gromacs/utility/exceptions.h"
#include "gromacs/utility/gmxassert.h"

Description

Declare utility routines for OpenCL.

Author
Anca Hamuraru anca@streamcomputing.eu

Classes

struct  ocl_gpu_id_t
 OpenCL GPU device identifier.
 
struct  gmx_device_info_t
 OpenCL device information.
 
struct  gmx_device_runtime_data_t
 OpenCL GPU runtime data.
 

Enumerations

enum  ocl_vendor_id_t { OCL_VENDOR_NVIDIA = 0, OCL_VENDOR_AMD, OCL_VENDOR_INTEL, OCL_VENDOR_UNKNOWN }
 OpenCL vendor IDs.
 

Functions

int ocl_copy_D2H (void *h_dest, cl_mem d_src, size_t offset, size_t bytes, GpuApiCallBehavior transferKind, cl_command_queue command_queue, cl_event *copy_event)
 Launches synchronous or asynchronous device to host memory copy.
 
int ocl_copy_D2H_async (void *h_dest, cl_mem d_src, size_t offset, size_t bytes, cl_command_queue command_queue, cl_event *copy_event)
 Launches asynchronous device to host memory copy.
 
int ocl_copy_H2D (cl_mem d_dest, const void *h_src, size_t offset, size_t bytes, GpuApiCallBehavior transferKind, cl_command_queue command_queue, cl_event *copy_event)
 Launches synchronous or asynchronous host to device memory copy.
 
int ocl_copy_H2D_async (cl_mem d_dest, const void *h_src, size_t offset, size_t bytes, cl_command_queue command_queue, cl_event *copy_event)
 Launches asynchronous host to device memory copy.
 
int ocl_copy_H2D_sync (cl_mem d_dest, const void *h_src, size_t offset, size_t bytes, cl_command_queue command_queue)
 Launches synchronous host to device memory copy.
 
void pmalloc (void **h_ptr, size_t nbytes)
 Allocate host memory in malloc style.
 
void pfree (void *h_ptr)
 Free host memory in malloc style.
 
std::string ocl_get_error_string (cl_int error)
 Convert error code to diagnostic string.
 
static void gpuStreamSynchronize (cl_command_queue s)
 Calls clFinish() in the stream s.
 
void ensureReferenceCount (const cl_event &event, unsigned int refCount)
 A debug checker to track cl_events being released correctly.
 
static bool haveStreamTasksCompleted (cl_command_queue s)
 Pretend to synchronize an OpenCL stream (dummy implementation).
 
void prepareGpuKernelArgument (cl_kernel kernel, const KernelLaunchConfig &config, size_t argIndex)
 Sets up a single OpenCL kernel argument. This is the tail of the compile-time recursive function below, so it has to be seen by the compiler first. Because NB kernels might use dynamic local memory as the last argument, this function also sets that up, using sharedMemorySize from config.
 
template<typename CurrentArg , typename... RemainingArgs>
void prepareGpuKernelArgument (cl_kernel kernel, const KernelLaunchConfig &config, size_t argIndex, const CurrentArg *argPtr, const RemainingArgs *...otherArgsPtrs)
 Compile-time recursive function for setting up a single OpenCL kernel argument. This function uses one kernel argument pointer argPtr to call clSetKernelArg(), and calls itself on the next argument, eventually calling the tail function above.
 
template<typename... Args>
void * prepareGpuKernelArguments (cl_kernel kernel, const KernelLaunchConfig &config, const Args *...argsPtrs)
 A wrapper function for setting up all the OpenCL kernel arguments. Calls the recursive functions above.
 
void launchGpuKernel (cl_kernel kernel, const KernelLaunchConfig &config, CommandEvent *timingEvent, const char *kernelName, const void *)
 Launches the OpenCL kernel and handles the errors.
 

Function Documentation

static void gpuStreamSynchronize ( cl_command_queue  s)
inline static

Calls clFinish() in the stream s.

Parameters
[in]  s  stream to synchronize with
static bool haveStreamTasksCompleted ( cl_command_queue  s)
inline static

Pretend to synchronize an OpenCL stream (dummy implementation).

Parameters
[in]  s  queue to check
Returns
True if all tasks enqueued in the stream s (at the time of this call) have completed.
void launchGpuKernel ( cl_kernel  kernel,
const KernelLaunchConfig &  config,
CommandEvent *  timingEvent,
const char *  kernelName,
const void *   
)
inline

Launches the OpenCL kernel and handles the errors.

Parameters
[in]  kernel  Kernel function handle
[in]  config  Kernel configuration for launching
[in]  timingEvent  Timing event, fetched from GpuRegionTimer
[in]  kernelName  Human-readable kernel description, for error handling only
Exceptions
gmx::InternalError  on kernel launch failure
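
The error handling described above can be sketched as a status check that throws and names the kernel in the message. This is only an illustration, not the GROMACS implementation: SketchInternalError stands in for gmx::InternalError, sketchCheckLaunchStatus is a hypothetical helper, and the message wording is an assumption. The only OpenCL fact used is that CL_SUCCESS has the value 0.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for gmx::InternalError (assumption, not the real class).
struct SketchInternalError : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// CL_SUCCESS is defined as 0 by OpenCL; repeated here so the sketch
// compiles without the OpenCL headers.
constexpr int c_sketchSuccess = 0;

// Hypothetical helper: throws when an enqueue call reported a failure.
// kernelName is used only to build the error message.
inline void sketchCheckLaunchStatus(int status, const char* kernelName)
{
    assert(kernelName != nullptr);
    if (status != c_sketchSuccess)
    {
        throw SketchInternalError("Failed to launch kernel " + std::string(kernelName)
                                  + ", error code " + std::to_string(status));
    }
}
```

A caller would pass the status returned by its enqueue call together with the same kernelName it hands to launchGpuKernel().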
int ocl_copy_D2H ( void *  h_dest,
cl_mem  d_src,
size_t  offset,
size_t  bytes,
GpuApiCallBehavior  transferKind,
cl_command_queue  command_queue,
cl_event *  copy_event 
)

Launches synchronous or asynchronous device to host memory copy.

If copy_event is not nullptr, on return it will contain an event object identifying this particular device to host operation. The event can then be used to queue a wait for this operation or to query profiling information.

int ocl_copy_D2H_async ( void *  h_dest,
cl_mem  d_src,
size_t  offset,
size_t  bytes,
cl_command_queue  command_queue,
cl_event *  copy_event 
)

Launches asynchronous device to host memory copy.

If copy_event is not nullptr, on return it will contain an event object identifying this particular device to host operation. The event can then be used to queue a wait for this operation or to query profiling information.

int ocl_copy_H2D ( cl_mem  d_dest,
const void *  h_src,
size_t  offset,
size_t  bytes,
GpuApiCallBehavior  transferKind,
cl_command_queue  command_queue,
cl_event *  copy_event 
)

Launches synchronous or asynchronous host to device memory copy.

If copy_event is not nullptr, on return it will contain an event object identifying this particular host to device operation. The event can then be used to queue a wait for this operation or to query profiling information.
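
The difference between the two transfer kinds, and the optional copy_event, can be modelled without OpenCL. The sketch below is a toy model under stated assumptions, not the GROMACS code: SketchCallBehavior stands in for GpuApiCallBehavior, SketchQueue for cl_command_queue, SketchEvent for cl_event, and a std::vector<char> plays the role of the device buffer. Returning 0 for success is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-in for GpuApiCallBehavior.
enum class SketchCallBehavior { Sync, Async };

// Hypothetical stand-in for a command queue: holds copies not yet executed.
struct SketchQueue
{
    std::vector<std::function<void()>> pending;
    // Models clFinish(): runs everything enqueued so far, in order.
    void finish()
    {
        for (const auto& task : pending) { task(); }
        pending.clear();
    }
};

// Hypothetical stand-in for an event object.
struct SketchEvent { bool isSet = false; };

// Enqueues a host to device copy. A Sync transfer drains the queue before
// returning; an Async transfer leaves the copy pending.
inline int sketchCopyH2D(std::vector<char>& d_dest, const void* h_src,
                         std::size_t offset, std::size_t bytes,
                         SketchCallBehavior transferKind, SketchQueue& queue,
                         SketchEvent* copy_event)
{
    assert(offset + bytes <= d_dest.size());
    queue.pending.push_back([&d_dest, h_src, offset, bytes]() {
        std::memcpy(d_dest.data() + offset, h_src, bytes);
    });
    if (transferKind == SketchCallBehavior::Sync) { queue.finish(); }
    // The event is filled in only when the caller asked for one.
    if (copy_event != nullptr) { copy_event->isSet = true; }
    return 0; // assumed success value
}
```

In this model, the host buffer of an Async copy must stay valid until the queue has been drained, which is also the practical constraint for asynchronous transfers in general.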

int ocl_copy_H2D_async ( cl_mem  d_dest,
const void *  h_src,
size_t  offset,
size_t  bytes,
cl_command_queue  command_queue,
cl_event *  copy_event 
)

Launches asynchronous host to device memory copy.

If copy_event is not nullptr, on return it will contain an event object identifying this particular host to device operation. The event can further be used to queue a wait for this operation or to query profiling information.

void pfree ( void *  h_ptr)

Free host memory in malloc style.

Parameters
[in]  h_ptr  Buffer allocated with pmalloc that needs to be freed.
void pmalloc ( void **  h_ptr,
size_t  nbytes 
)

Allocate host memory in malloc style.

Todo:
This function should allocate page-locked memory to help reduce D2H and H2D transfer times, similar to pmalloc from pmalloc_cuda.cu.
Parameters
[in,out]  h_ptr  Pointer where to store the address of the newly allocated buffer.
[in]  nbytes  Size in bytes of the buffer to be allocated.
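
The calling convention of pmalloc() and pfree() (an out-pointer for allocation, a plain pointer for release) can be illustrated with a minimal sketch. The names sketchPmalloc, sketchPfree, and sketchRoundTrip are hypothetical, and using std::malloc() with a nullptr result for a zero-byte request is an assumption; like the function documented here, the sketch does not provide page-locked memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for pmalloc(): stores the address of the new buffer
// in *h_ptr, or nullptr when nbytes is zero (assumed behaviour).
inline void sketchPmalloc(void** h_ptr, std::size_t nbytes)
{
    assert(h_ptr != nullptr);
    *h_ptr = (nbytes == 0) ? nullptr : std::malloc(nbytes);
}

// Hypothetical stand-in for pfree(): releases a buffer from sketchPmalloc().
inline void sketchPfree(void* h_ptr)
{
    std::free(h_ptr); // std::free(nullptr) is a no-op
}

// Usage: allocate, fill, check the last byte, release.
// Returns false when no buffer was allocated.
inline bool sketchRoundTrip(std::size_t nbytes)
{
    void* buffer = nullptr;
    sketchPmalloc(&buffer, nbytes);
    if (buffer == nullptr) { return false; }
    std::memset(buffer, 0x5A, nbytes);
    const bool ok = static_cast<unsigned char*>(buffer)[nbytes - 1] == 0x5A;
    sketchPfree(buffer);
    return ok;
}
```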
void prepareGpuKernelArgument ( cl_kernel  kernel,
const KernelLaunchConfig &  config,
size_t  argIndex 
)
inline

Sets up a single OpenCL kernel argument. This is the tail of the compile-time recursive function below, so it has to be seen by the compiler first. Because NB kernels might use dynamic local memory as the last argument, this function also sets that up, using sharedMemorySize from config.

Parameters
[in]  kernel  Kernel function handle
[in]  config  Kernel configuration for launching
[in]  argIndex  Index of the current argument
template<typename CurrentArg , typename... RemainingArgs>
void prepareGpuKernelArgument ( cl_kernel  kernel,
const KernelLaunchConfig &  config,
size_t  argIndex,
const CurrentArg *  argPtr,
const RemainingArgs *...  otherArgsPtrs 
)

Compile-time recursive function for setting up a single OpenCL kernel argument. This function uses one kernel argument pointer argPtr to call clSetKernelArg(), and calls itself on the next argument, eventually calling the tail function above.

Template Parameters
CurrentArg  Type of the current argument
RemainingArgs  Types of remaining arguments after the current one
Parameters
[in]  kernel  Kernel function handle
[in]  config  Kernel configuration for launching
[in]  argIndex  Index of the current argument
[in]  argPtr  Pointer to the current argument
[in]  otherArgsPtrs  Pack of pointers to arguments remaining to process after the current one
template<typename... Args>
void* prepareGpuKernelArguments ( cl_kernel  kernel,
const KernelLaunchConfig &  config,
const Args *...  argsPtrs 
)

A wrapper function for setting up all the OpenCL kernel arguments. Calls the recursive functions above.

Template Parameters
Args  Types of all the kernel arguments
Parameters
[in]  kernel  Kernel function handle
[in]  config  Kernel configuration for launching
[in]  argsPtrs  Pointers to all the kernel arguments
Returns
A handle for the prepared parameter pack, to be passed to launchGpuKernel() as the last argument. Currently always nullptr for OpenCL, which manages the kernel/argument association itself.
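
The interplay of the tail overload, the recursive template, and the wrapper can be sketched without OpenCL. Everything prefixed with Sketch or sketch below is hypothetical: SketchKernel records calls instead of invoking clSetKernelArg(), and SketchLaunchConfig keeps only the sharedMemorySize field mentioned above. The sketch assumes that dynamic local memory is set as the last argument with a size and a null value, which is how clSetKernelArg() treats __local arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for KernelLaunchConfig; only one field is modelled.
struct SketchLaunchConfig { std::size_t sharedMemorySize = 0; };

// Hypothetical stand-in for cl_kernel that records what would be set.
struct SketchKernel
{
    struct Arg { std::size_t index; std::size_t size; const void* value; };
    std::vector<Arg> args;
};

// Stand-in for clSetKernelArg(): records the call instead of forwarding it.
inline void sketchSetKernelArg(SketchKernel& kernel, std::size_t index,
                               std::size_t size, const void* value)
{
    kernel.args.push_back({ index, size, value });
}

// Tail of the recursion, declared first so the template below can call it.
// Appends the dynamic local memory argument when one is requested.
inline void sketchPrepareArgument(SketchKernel& kernel, const SketchLaunchConfig& config,
                                  std::size_t argIndex)
{
    if (config.sharedMemorySize > 0)
    {
        sketchSetKernelArg(kernel, argIndex, config.sharedMemorySize, nullptr);
    }
}

// Recursive step: sets one argument, then recurses on the rest of the pack.
template<typename CurrentArg, typename... RemainingArgs>
void sketchPrepareArgument(SketchKernel& kernel, const SketchLaunchConfig& config,
                           std::size_t argIndex, const CurrentArg* argPtr,
                           const RemainingArgs*... otherArgsPtrs)
{
    sketchSetKernelArg(kernel, argIndex, sizeof(CurrentArg), argPtr);
    sketchPrepareArgument(kernel, config, argIndex + 1, otherArgsPtrs...);
}

// Wrapper: starts the recursion at index 0 and returns a null handle.
template<typename... Args>
void* sketchPrepareArguments(SketchKernel& kernel, const SketchLaunchConfig& config,
                             const Args*... argsPtrs)
{
    sketchPrepareArgument(kernel, config, std::size_t{ 0 }, argsPtrs...);
    return nullptr;
}
```

Calling sketchPrepareArguments(kernel, config, &a, &b) with a nonzero sharedMemorySize records three entries: a at index 0, b at index 1, and the local memory size at index 2.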