Gromacs  2025.0-dev-20241011-013a99c
GpuRegionTimerImpl Class Reference

#include <gromacs/gpu_utils/gpuregiontimer_ocl.h>

Description

The OpenCL implementation of GPU code region timing. With OpenCL, each API call that is to be timed needs its own cl_event handle, and the timings are accumulated afterwards. To avoid overhead on the API calls, cl_event timings are queried and accumulated only at the end of a time step, not after each call. This implementation therefore does not reuse a single cl_event for multiple calls; instead, it maintains an array of cl_events for use within any single code region. The array has a fixed size, currently 10, chosen to be small yet large enough for the number of cl_events that might contribute to one timer region.
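The deferred accumulation described above can be illustrated with a minimal sketch. It is not the GROMACS implementation: `FakeEvent`, `SketchRegionTimer`, `c_maxEvents` and `demoTimeStep` are hypothetical names, and the timestamps are filled in by hand so that the example runs without an OpenCL device. In real OpenCL code the start and end times of a profiled event would instead be read with `clGetEventProfilingInfo` (`CL_PROFILING_COMMAND_START` / `CL_PROFILING_COMMAND_END`), which reports nanoseconds.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a profiled cl_event: device timestamps in nanoseconds.
struct FakeEvent
{
    unsigned long long startNs = 0;
    unsigned long long endNs   = 0;
};

// Simplified sketch of a region timer that defers all event queries.
class SketchRegionTimer
{
public:
    // Hands out the next unused slot; the caller attaches it to exactly one API call.
    FakeEvent* fetchNextEvent()
    {
        assert(currentEvent_ < events_.size() && "too many events in one region");
        return &events_[currentEvent_++];
    }

    // Sums the durations of all events used so far, converts to ms, then resets.
    double getLastRangeTime()
    {
        double milliseconds = 0.0;
        for (std::size_t i = 0; i < currentEvent_; i++)
        {
            milliseconds += static_cast<double>(events_[i].endNs - events_[i].startNs) * 1e-6;
        }
        reset();
        return milliseconds;
    }

    // Clears all slots so the array can be reused for the next region.
    void reset()
    {
        events_.fill(FakeEvent{});
        currentEvent_ = 0;
    }

private:
    static constexpr std::size_t c_maxEvents = 10; // capacity quoted in the description
    std::array<FakeEvent, c_maxEvents> events_{};
    std::size_t currentEvent_ = 0;
};

// Two pretend API calls in one region, each with its own event.
double demoTimeStep()
{
    SketchRegionTimer timer;
    FakeEvent* copyEvent = timer.fetchNextEvent();
    copyEvent->startNs   = 1000000;
    copyEvent->endNs     = 3000000; // 2 ms
    FakeEvent* kernelEvent = timer.fetchNextEvent();
    kernelEvent->startNs   = 3000000;
    kernelEvent->endNs     = 4500000; // 1.5 ms
    // Nothing was queried until now; the sum is taken once, at the end of the step.
    return timer.getLastRangeTime();
}
```

The point of the design is that no per-call query or synchronization is needed: the events are only collected during the region and inspected once the work has finished.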

Public Member Functions

 GpuRegionTimerImpl (const GpuRegionTimerImpl &)=delete
 No copying.
 
GpuRegionTimerImpl & operator= (GpuRegionTimerImpl &&)=delete
 No assignment.
 
 GpuRegionTimerImpl (GpuRegionTimerImpl &&)=delete
 Moving is disabled but can be considered in the future if needed.
 
void openTimingRegion (const DeviceStream &)
 Should be called before the region start.
 
void closeTimingRegion (const DeviceStream &)
 Should be called after the region end.
 
double getLastRangeTime ()
 Returns the last measured region timespan (in milliseconds) and calls reset().
 
void reset ()
 Resets the internal state, releasing the used cl_events.
 
CommandEvent * fetchNextEvent ()
 Returns a new raw timing event for passing into individual GPU API calls within the region if the API requires it (e.g. on OpenCL).
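
The intended call order of the members listed above can be sketched with a mock class. Everything here is an assumption made for illustration: `MockStream`, `MockEvent`, `MockRegionTimer` and `timeOneRegion` are hypothetical, the open/closed state checks are this sketch's own, and the reported time is a placeholder (0.5 ms per event) rather than a measurement. Only the member names and the deleted copy and move operations mirror the documented interface.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical placeholders; the real DeviceStream and CommandEvent are GROMACS types.
struct MockStream
{
};
using MockEvent = int;

class MockRegionTimer
{
public:
    MockRegionTimer() = default;
    // Same deleted operations as in the member list above.
    MockRegionTimer(const MockRegionTimer&) = delete;
    MockRegionTimer& operator=(MockRegionTimer&&) = delete;
    MockRegionTimer(MockRegionTimer&&) = delete;

    void openTimingRegion(const MockStream&)
    {
        assert(!open_);
        open_ = true;
    }
    void closeTimingRegion(const MockStream&)
    {
        assert(open_);
        open_ = false;
    }
    // One event per timed API call; this sketch assumes the region must be open.
    MockEvent* fetchNextEvent()
    {
        assert(open_);
        eventsUsed_++;
        return &scratch_;
    }
    // Placeholder timing: pretends each event took 0.5 ms, then resets.
    double getLastRangeTime()
    {
        assert(!open_);
        const double milliseconds = 0.5 * eventsUsed_;
        reset();
        return milliseconds;
    }
    void reset() { eventsUsed_ = 0; }

private:
    bool      open_       = false;
    int       eventsUsed_ = 0;
    MockEvent scratch_    = 0;
};

// The deleted members make the timer neither copyable nor movable.
static_assert(!std::is_copy_constructible<MockRegionTimer>::value, "no copying");
static_assert(!std::is_move_constructible<MockRegionTimer>::value, "no moving");
static_assert(!std::is_move_assignable<MockRegionTimer>::value, "no assignment");

double timeOneRegion()
{
    MockStream      stream;
    MockRegionTimer timer;
    timer.openTimingRegion(stream);           // before the region start
    MockEvent* first  = timer.fetchNextEvent(); // would be passed to the first API call
    MockEvent* second = timer.fetchNextEvent(); // would be passed to the second API call
    (void)first;
    (void)second;
    timer.closeTimingRegion(stream);          // after the region end
    return timer.getLastRangeTime();          // also resets the timer
}
```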
 

The documentation for this class was generated from the following file: