Gromacs  2024.4
device_management_common.cpp File Reference
#include "gmxpre.h"
#include <algorithm>
#include "gromacs/hardware/device_management.h"
#include "gromacs/utility/arrayref.h"
#include "gromacs/utility/exceptions.h"
#include "gromacs/utility/fatalerror.h"
#include "device_information.h"

Description

Defines the implementations of device management functions that are common for CPU, CUDA and OpenCL.

Author
Anca Hamuraru anca@streamcomputing.eu
Dimitrios Karkoulis dimitris.karkoulis@gmail.com
Teemu Virolainen teemu@streamcomputing.eu
Mark Abraham mark.j.abraham@gmail.com
Szilárd Páll pall.szilard@gmail.com
Artem Zhmurov zhmurov@gmail.com

Functions

bool canPerformDeviceDetection (std::string *errorMessage)
 Return whether GPUs can be detected.

bool isDeviceDetectionEnabled ()
 Return whether GPU detection is enabled.

DeviceVendor getDeviceVendor (const char *vendorName)
 Returns a DeviceVendor value corresponding to the input OpenCL vendor name.

int getDeviceComputeUnitFactor (const DeviceInformation &deviceInfo)
 Get the factor to divide the number of compute units by.

std::vector< std::reference_wrapper< DeviceInformation > > getCompatibleDevices (const std::vector< std::unique_ptr< DeviceInformation >> &deviceInfoList)
 Return a container of device-information handles that are compatible.

std::vector< int > getCompatibleDeviceIds (gmx::ArrayRef< const std::unique_ptr< DeviceInformation >> deviceInfoList)
 Return a container of the IDs of the compatible GPUs.

bool deviceIdIsCompatible (gmx::ArrayRef< const std::unique_ptr< DeviceInformation >> deviceInfoList, const int deviceId)
 Return whether deviceId is found in deviceInfoList and is compatible.

gmx::GpuAwareMpiStatus getMinimalSupportedGpuAwareMpiStatus (gmx::ArrayRef< const std::unique_ptr< DeviceInformation >> deviceInfoList)
 Return whether all compatible devices in deviceInfoList support GPU-aware MPI.

std::string getDeviceCompatibilityDescription (const gmx::ArrayRef< const std::unique_ptr< DeviceInformation >> deviceInfoList, int deviceId)
 Return a string describing how compatible the GPU with given deviceId is.

void serializeDeviceInformations (const std::vector< std::unique_ptr< DeviceInformation >> &deviceInfoList, gmx::ISerializer *serializer)
 Serialization of information on devices for MPI broadcasting.

std::vector< std::unique_ptr< DeviceInformation > > deserializeDeviceInformations (gmx::ISerializer *serializer)
 Deserialization of information on devices after MPI broadcasting.
 

Function Documentation

bool canPerformDeviceDetection ( std::string *  errorMessage)

Return whether GPUs can be detected.

Returns true when this is a build of GROMACS configured to support GPU usage, GPU detection is not disabled by the GMX_DISABLE_GPU_DETECTION environment variable, and a valid device driver, ICD, and/or runtime was detected. Does not throw.

Parameters
[out] errorMessage  When returning false on a build configured with GPU support and non-nullptr was passed, the string contains a descriptive message about why GPUs cannot be detected.

std::vector<std::unique_ptr<DeviceInformation> > deserializeDeviceInformations ( gmx::ISerializer *  serializer)

Deserialization of information on devices after MPI broadcasting.

Parameters
[in] serializer  Serializing object.
Returns
Deserialized vector of device information.

bool deviceIdIsCompatible ( gmx::ArrayRef< const std::unique_ptr< DeviceInformation >>  deviceInfoList, int  deviceId )

Return whether deviceId is found in deviceInfoList and is compatible.

This function filters the result of the detection for compatible GPUs, based on the previously run compatibility tests.

Parameters
[in] deviceInfoList  Information on available devices.
[in] deviceId  The device ID to find in the list.
Exceptions
RangeError  If deviceId does not match the ID of any device in deviceInfoList.
Returns
Whether deviceId is compatible.

std::vector<int> getCompatibleDeviceIds ( gmx::ArrayRef< const std::unique_ptr< DeviceInformation >>  deviceInfoList)

Return a container of the IDs of the compatible GPUs.

This function filters the result of the detection for compatible GPUs, based on the previously run compatibility tests.

Parameters
[in] deviceInfoList  Information on available devices.
Returns
Vector of compatible GPU ids.

std::vector<std::reference_wrapper<DeviceInformation> > getCompatibleDevices ( const std::vector< std::unique_ptr< DeviceInformation >> &  deviceInfoList)

Return a container of device-information handles that are compatible.

This function filters the result of the detection for compatible GPUs, based on the previously run compatibility tests.

Parameters
[in] deviceInfoList  Information on available devices.
Returns
Vector of DeviceInformations on GPUs recorded as compatible.
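
The filtering described for deviceIdIsCompatible, getCompatibleDeviceIds and getCompatibleDevices can be illustrated with a minimal, self-contained sketch. It does not use the GROMACS headers: SketchDeviceInfo, SketchDeviceStatus and all function names below are hypothetical stand-ins, and std::out_of_range stands in for the RangeError named above. The actual implementation may differ.

```cpp
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for illustration only; the real types are
// declared in device_information.h and carry more fields.
enum class SketchDeviceStatus
{
    Compatible,
    Incompatible
};

struct SketchDeviceInfo
{
    int                id;
    SketchDeviceStatus status;
};

using SketchDeviceList = std::vector<std::unique_ptr<SketchDeviceInfo>>;

// Collect the IDs of all devices whose recorded status is Compatible.
std::vector<int> compatibleDeviceIdsSketch(const SketchDeviceList& deviceInfoList)
{
    std::vector<int> ids;
    for (const auto& deviceInfo : deviceInfoList)
    {
        if (deviceInfo->status == SketchDeviceStatus::Compatible)
        {
            ids.push_back(deviceInfo->id);
        }
    }
    return ids;
}

// Find deviceId in the list and report whether it is compatible.
// Throws std::out_of_range (assumed substitute for RangeError) when
// no device in the list has that ID.
bool deviceIdIsCompatibleSketch(const SketchDeviceList& deviceInfoList, int deviceId)
{
    for (const auto& deviceInfo : deviceInfoList)
    {
        if (deviceInfo->id == deviceId)
        {
            return deviceInfo->status == SketchDeviceStatus::Compatible;
        }
    }
    throw std::out_of_range("deviceId not found in deviceInfoList");
}

// Build a small made-up list: devices 0 and 2 compatible, device 1 not.
SketchDeviceList makeExampleDeviceList()
{
    SketchDeviceList list;
    list.push_back(std::make_unique<SketchDeviceInfo>(
            SketchDeviceInfo{ 0, SketchDeviceStatus::Compatible }));
    list.push_back(std::make_unique<SketchDeviceInfo>(
            SketchDeviceInfo{ 1, SketchDeviceStatus::Incompatible }));
    list.push_back(std::make_unique<SketchDeviceInfo>(
            SketchDeviceInfo{ 2, SketchDeviceStatus::Compatible }));
    return list;
}
```

In this sketch an incompatible device yields false, while an unknown ID is treated as a caller error and throws, which mirrors the distinction between the return value and the exception documented above.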

std::string getDeviceCompatibilityDescription ( gmx::ArrayRef< const std::unique_ptr< DeviceInformation >>  deviceInfoList, int  deviceId )

Return a string describing how compatible the GPU with given deviceId is.

Parameters
[in] deviceInfoList  Information on available devices.
[in] deviceId  Index of the device to check.
Returns
A string describing the compatibility status, useful for error messages.

int getDeviceComputeUnitFactor ( const DeviceInformation &  deviceInfo)

Get the factor to divide the number of compute units by.

OpenCL and SYCL can report the number of Compute Units (CUs) a device has; see CL_DEVICE_MAX_COMPUTE_UNITS and info::device::max_compute_units. However, "CU" is only vaguely defined by the standards, and the same API call returns different quantities on hardware from different vendors.

On NVIDIA, it is the number of SMs.

On AMD, it is the number of Compute Units, which are similar to CUDA's SMs, except on RDNA, where the number of Dual Compute Units is returned (https://stackoverflow.com/a/63976796/929437).

On Intel, it is the number of EUs (XVEs), which are similar to CUDA cores. The concept similar to a CUDA SM is called a sub-slice (Xe Core, XC), and it contains 16 EUs (Gen9-Gen11, Xe).

This function uses the CUDA SM as the reference. To get the number of SM-like units on a device, divide the value reported by CL_DEVICE_MAX_COMPUTE_UNITS or info::device::max_compute_units by the value returned by this function.

Todo:
Handle AMD RDNA?
Parameters
[in] deviceInfo  Device information.
Returns
The number of CUs in a single SM-like entity.
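
The division described above can be sketched as follows. This is an illustration only: SketchVendor and both functions are hypothetical names, and the per-vendor factors (1 for NVIDIA and AMD, 16 for Intel) are read off the description above and are not taken from the GROMACS source.

```cpp
#include <stdexcept>

// Hypothetical vendor tag for this sketch; not the GROMACS DeviceVendor enum.
enum class SketchVendor
{
    Nvidia,
    Amd,
    Intel
};

// Number of reported compute units per SM-like unit, as assumed from the
// description above. AMD RDNA (Dual Compute Units) is not handled here.
int computeUnitFactorSketch(SketchVendor vendor)
{
    switch (vendor)
    {
        case SketchVendor::Nvidia: return 1;  // reported CUs are SMs
        case SketchVendor::Amd: return 1;     // reported CUs are SM-like
        case SketchVendor::Intel: return 16;  // 16 EUs per sub-slice, per the text
    }
    throw std::invalid_argument("unknown vendor");
}

// Convert the raw compute-unit count reported by the API into SM-like units.
int smLikeUnitCountSketch(int reportedComputeUnits, SketchVendor vendor)
{
    return reportedComputeUnits / computeUnitFactorSketch(vendor);
}
```

Under these assumed factors, an Intel device reporting 96 compute units maps to 96 / 16 = 6 SM-like units, while an NVIDIA device reporting 80 compute units maps to 80.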

DeviceVendor getDeviceVendor ( const char *  vendorName)

Returns a DeviceVendor value corresponding to the input OpenCL vendor name.

Returns
DeviceVendor value for the input vendor name

gmx::GpuAwareMpiStatus getMinimalSupportedGpuAwareMpiStatus ( gmx::ArrayRef< const std::unique_ptr< DeviceInformation >>  deviceInfoList)

Return whether all compatible devices in deviceInfoList support GPU-aware MPI.

Returns
Whether all compatible devices in the list support GPU-aware MPI (both full support and forced support count).
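
One way to read "minimal supported status" is as the lowest support level found among the compatible devices. The sketch below illustrates that reading under stated assumptions: the enum, its ordering (NotSupported < Forced < Supported), the SketchDevice struct and the function name are all hypothetical, and the result for a list without compatible devices is a choice made for this sketch, not documented behaviour.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in; the ordering of the enumerators is an assumption.
enum class SketchGpuAwareMpiStatus
{
    NotSupported = 0,
    Forced       = 1,
    Supported    = 2
};

// Hypothetical reduced device record for this sketch.
struct SketchDevice
{
    bool                    isCompatible;
    SketchGpuAwareMpiStatus gpuAwareMpiStatus;
};

// Return the lowest GPU-aware MPI status among the compatible devices.
// Incompatible devices are skipped. With no compatible device the sketch
// returns Supported, simply because that is the starting value.
SketchGpuAwareMpiStatus minimalGpuAwareMpiStatusSketch(const std::vector<SketchDevice>& devices)
{
    SketchGpuAwareMpiStatus minimalStatus = SketchGpuAwareMpiStatus::Supported;
    for (const SketchDevice& device : devices)
    {
        if (device.isCompatible)
        {
            minimalStatus = std::min(minimalStatus, device.gpuAwareMpiStatus);
        }
    }
    return minimalStatus;
}
```

With this reading, a caller treats any result other than NotSupported as "all compatible devices support GPU-aware MPI", which matches the note above that both full and forced support count.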

bool isDeviceDetectionEnabled ( )

Return whether GPU detection is enabled.

Returns true when this is a build of GROMACS configured to support GPU usage and GPU detection is not disabled by the GMX_DISABLE_GPU_DETECTION environment variable.

Does not throw.

void serializeDeviceInformations ( const std::vector< std::unique_ptr< DeviceInformation >> &  deviceInfoList, gmx::ISerializer *  serializer )

Serialization of information on devices for MPI broadcasting.

Parameters
[in] deviceInfoList  The vector of device information to serialize.
[in] serializer  Serializing object.