Implements the DeviceBuffer type and routines for SYCL. Should only be included directly by the main DeviceBuffer file devicebuffer.h. TODO: the intent is for DeviceBuffer to become a class.
- Author
- Artem Zhmurov zhmurov@gmail.com
- Erik Lindahl erik.lindahl@gmail.com
- Andrey Alekseenko al42and@gmail.com
template<typename T >
static gmx_unused bool | checkDeviceBuffer (const DeviceBuffer< T > &buffer, int gmx_unused requiredSize)
| Check the validity of the device buffer.
template<typename ValueType >
void | allocateDeviceBuffer (DeviceBuffer< ValueType > *buffer, size_t numValues, const DeviceContext &deviceContext)
| Allocates a device-side buffer. It is currently the caller's responsibility to call it only on buffers that are not yet allocated.
template<typename ValueType >
void | freeDeviceBuffer (DeviceBuffer< ValueType > *buffer)
| Frees a device-side buffer. This does not reset the separately stored size/capacity integers, because it is planned to become the destructor of DeviceBuffer once that is a proper class; no calls on the buffer should be made afterwards.
template<typename ValueType >
void | copyToDeviceBuffer (DeviceBuffer< ValueType > *buffer, const ValueType *hostBuffer, size_t startingOffset, size_t numValues, const DeviceStream &deviceStream, GpuApiCallBehavior transferKind, CommandEvent *gmx_unused timingEvent)
| Performs the host-to-device data copy, synchronously or asynchronously on request.
template<typename ValueType >
void | copyFromDeviceBuffer (ValueType *hostBuffer, DeviceBuffer< ValueType > *buffer, size_t startingOffset, size_t numValues, const DeviceStream &deviceStream, GpuApiCallBehavior transferKind, CommandEvent *gmx_unused timingEvent)
| Performs the device-to-host data copy, synchronously or asynchronously on request.
template<typename ValueType >
void | copyBetweenDeviceBuffers (DeviceBuffer< ValueType > *destinationDeviceBuffer, DeviceBuffer< ValueType > *sourceDeviceBuffer, size_t numValues, const DeviceStream &deviceStream, GpuApiCallBehavior transferKind, CommandEvent *gmx_unused timingEvent)
| Performs the device-to-device data copy, synchronously or asynchronously on request.
template<typename ValueType >
void | clearDeviceBufferAsync (DeviceBuffer< ValueType > *buffer, size_t startingOffset, size_t numValues, const DeviceStream &deviceStream)
| Clears the device buffer asynchronously.
template<typename ValueType >
void | initParamLookupTable (DeviceBuffer< ValueType > *deviceBuffer, DeviceTexture *, const ValueType *hostBuffer, int numValues, const DeviceContext &deviceContext, const DeviceStream &deviceStream)
| Create a texture object for an array of type ValueType.
template<typename ValueType >
void | destroyParamLookupTable (DeviceBuffer< ValueType > *deviceBuffer, DeviceTexture *)
| Release the underlying device allocations.
template<typename ValueType >
ValueType * | asMpiPointer (DeviceBuffer< ValueType > &buffer)
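template<typename ValueType >
void copyFromDeviceBuffer (ValueType *hostBuffer, DeviceBuffer< ValueType > *buffer, size_t startingOffset, size_t numValues, const DeviceStream &deviceStream, GpuApiCallBehavior transferKind, CommandEvent *gmx_unused timingEvent)
Performs the device-to-host data copy, synchronously or asynchronously on request.
Unlike in CUDA and OpenCL, a synchronous call does not guarantee that all previously submitted operations are complete, only those that are required for buffer consistency.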
- Template Parameters
ValueType | Raw value type of the buffer. |
- Parameters
[in,out] | hostBuffer | Pointer to the raw host-side memory, also typed ValueType. |
[in] | buffer | Pointer to the device-side buffer. |
[in] | startingOffset | Offset (in values) at the device-side buffer to copy from. |
[in] | numValues | Number of values to copy. |
[in] | deviceStream | GPU stream to perform asynchronous copy in. |
[in] | transferKind | Copy type: synchronous or asynchronous. |
[out] | timingEvent | A pointer to the D2H copy timing event to be filled in. Ignored in SYCL. |
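The offset arithmetic described by these parameters can be illustrated with a minimal host-only sketch. It is not the SYCL implementation: `MockDeviceBuffer` and `mockCopyFromDeviceBuffer` are hypothetical stand-ins introduced here, a `std::vector` plays the role of device memory, and the stream, transfer kind and timing event arguments are omitted because they have no meaning without a device.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-only stand-in for DeviceBuffer<ValueType>.
// The real type wraps a device allocation; a std::vector is used here
// only so that the offset arithmetic can be run without a device.
template<typename ValueType>
struct MockDeviceBuffer
{
    std::vector<ValueType> storage;
};

// Hypothetical helper mirroring the first four arguments of copyFromDeviceBuffer.
template<typename ValueType>
void mockCopyFromDeviceBuffer(ValueType*                         hostBuffer,
                              const MockDeviceBuffer<ValueType>* buffer,
                              std::size_t                        startingOffset,
                              std::size_t                        numValues)
{
    // startingOffset and numValues count values of ValueType, not bytes.
    assert(startingOffset + numValues <= buffer->storage.size());
    const auto first = buffer->storage.begin() + static_cast<std::ptrdiff_t>(startingOffset);
    std::copy(first, first + static_cast<std::ptrdiff_t>(numValues), hostBuffer);
}
```

With a mock buffer holding 10, 11, 12, 13, 14, 15, a call with `startingOffset = 2` and `numValues = 3` fills the host array with 12, 13, 14.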
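template<typename ValueType >
void copyToDeviceBuffer (DeviceBuffer< ValueType > *buffer, const ValueType *hostBuffer, size_t startingOffset, size_t numValues, const DeviceStream &deviceStream, GpuApiCallBehavior transferKind, CommandEvent *gmx_unused timingEvent)
Performs the host-to-device data copy, synchronously or asynchronously on request.
Unlike in CUDA and OpenCL, a synchronous call does not guarantee that all previously submitted operations are complete, only those that are required for buffer consistency.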
- Template Parameters
ValueType | Raw value type of the buffer. |
- Parameters
[in,out] | buffer | Pointer to the device-side buffer. |
[in] | hostBuffer | Pointer to the raw host-side memory, also typed ValueType. |
[in] | startingOffset | Offset (in values) at the device-side buffer to copy into. |
[in] | numValues | Number of values to copy. |
[in] | deviceStream | GPU stream to perform asynchronous copy in. |
[in] | transferKind | Copy type: synchronous or asynchronous. |
[out] | timingEvent | A pointer to the H2D copy timing event to be filled in. Ignored in SYCL. |
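As a rough illustration of how `startingOffset` and `numValues` select the destination range, here is a host-only sketch. It does not use SYCL: `MockDeviceBuffer` and `mockCopyToDeviceBuffer` are hypothetical names introduced for this example, a `std::vector` stands in for device memory, and the stream, transfer kind and timing event arguments are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-only stand-in for DeviceBuffer<ValueType>;
// the real type refers to memory on the device.
template<typename ValueType>
struct MockDeviceBuffer
{
    std::vector<ValueType> storage;
};

// Hypothetical helper mirroring the first four arguments of copyToDeviceBuffer.
template<typename ValueType>
void mockCopyToDeviceBuffer(MockDeviceBuffer<ValueType>* buffer,
                            const ValueType*             hostBuffer,
                            std::size_t                  startingOffset,
                            std::size_t                  numValues)
{
    // The offset is expressed in values of ValueType, not in bytes,
    // and the destination range must fit inside the allocation.
    assert(startingOffset + numValues <= buffer->storage.size());
    std::copy(hostBuffer,
              hostBuffer + numValues,
              buffer->storage.begin() + static_cast<std::ptrdiff_t>(startingOffset));
}
```

Copying three host values into an eight-element mock buffer at `startingOffset = 2` overwrites elements 2 through 4 and leaves the rest untouched.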