Gromacs
2025-dev-20240910-a7e2421
#include "gmxpre.h"
#include "config.h"
#include <array>
#include <vector>
#include "gromacs/gpu_utils/ocl_compiler.h"
#include "gromacs/gpu_utils/oclraii.h"
#include "gromacs/gpu_utils/oclutils.h"
#include "gromacs/hardware/device_management.h"
#include "gromacs/utility/fatalerror.h"
#include "gromacs/utility/mpiinfo.h"
#include "gromacs/utility/smalloc.h"
#include "gromacs/utility/stringutil.h"
#include "device_information.h"
Defines the OpenCL implementations of the device management.
Functions
void | warnWhenDeviceNotTargeted (const gmx::MDLogger &, const DeviceInformation &) |
Warn to the logger when the detected device was not one of the targets selected at configure time for compilation.
static bool | gmx::runningOnCompatibleOSForAmd () |
Return true if executing on compatible OS for AMD OpenCL.
static bool | gmx::runningOnCompatibleHWForNvidia (const DeviceInformation &deviceInfo) |
Return true if executing on compatible GPU for NVIDIA OpenCL.
static FixedCapacityVector < int, 10 > | gmx::fillSupportedSubGroupSizes (const cl_device_id devId, const DeviceVendor deviceVendor) |
Return the list of sub-group sizes supported by the device.
static bool | gmx::runningOnCompatibleHWForAmd (const DeviceInformation &deviceInfo) |
Return true if executing on compatible GPU for AMD OpenCL.
static DeviceStatus | gmx::isDeviceFunctional (const DeviceInformation &deviceInfo) |
Checks that device deviceInfo is compatible with GROMACS.
std::string | gmx::makeOpenClInternalErrorString (const char *message, cl_int status) |
Make an error string following an OpenCL API call.
static bool | gmx::isDeviceFunctional (const DeviceInformation &deviceInfo, std::string *errorMessage) |
Checks that device deviceInfo is sane (i.e. can run a kernel).
static DeviceStatus | gmx::checkGpu (size_t deviceId, const DeviceInformation &deviceInfo) |
Check whether the ocl_gpu_device is suitable for use by mdrun.
bool | isDeviceDetectionFunctional (std::string *errorMessage) |
Return whether GPU detection is functioning correctly.
std::vector< std::unique_ptr < DeviceInformation > > | findDevices () |
Find all GPUs in the system.
void | setActiveDevice (const DeviceInformation &deviceInfo) |
Set the active GPU.
void | releaseDevice () |
Releases the GPU device used by the active context at the time of calling.
std::string | getDeviceInformationString (const DeviceInformation &deviceInfo) |
Formats and returns a device information string for a given GPU.
std::vector<std::unique_ptr<DeviceInformation> > findDevices | ( | ) |
Find all GPUs in the system.
Will detect every GPU supported by the device driver in use. Must only be called if canPerformDeviceDetection() has returned true. This routine also checks the compatibility of each device and fills the deviceInfo array with the required information on each device: ID, device properties, status.
Note that this function leaves the GPU runtime API error state clean; at the moment this is implemented in the CUDA flavor. This invalidates any existing CUDA streams, memory allocated on the GPU, etc.
InternalError | if a GPU API returns an unexpected failure (because the call to canDetectGpus() should always prevent this occurring) |
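The contract described above (one entry per detected device, each carrying an ID, properties, and a status) can be illustrated with a minimal sketch. The `DeviceStatus` and `DeviceInformation` types below are simplified stand-ins, and `findDevicesStub()` and `compatibleDeviceIds()` are hypothetical helpers invented for this example; none of them are the real GROMACS declarations.

```cpp
#include <memory>
#include <string>
#include <vector>

// Simplified stand-ins for illustration only, NOT the GROMACS types.
enum class DeviceStatus
{
    Compatible,
    Incompatible
};

struct DeviceInformation
{
    int          id;     // index of the device in the detected list
    std::string  name;   // human-readable device name
    DeviceStatus status; // result of the compatibility check
};

// Hypothetical stub that mimics the documented shape of findDevices():
// it returns one owned entry per detected device.
std::vector<std::unique_ptr<DeviceInformation>> findDevicesStub()
{
    std::vector<std::unique_ptr<DeviceInformation>> devices;
    devices.push_back(std::make_unique<DeviceInformation>(
            DeviceInformation{ 0, "Example GPU A", DeviceStatus::Compatible }));
    devices.push_back(std::make_unique<DeviceInformation>(
            DeviceInformation{ 1, "Example GPU B", DeviceStatus::Incompatible }));
    return devices;
}

// Collect the IDs of the devices whose status marks them as usable.
std::vector<int> compatibleDeviceIds(const std::vector<std::unique_ptr<DeviceInformation>>& devices)
{
    std::vector<int> ids;
    for (const auto& device : devices)
    {
        if (device->status == DeviceStatus::Compatible)
        {
            ids.push_back(device->id);
        }
    }
    return ids;
}
```

In real code the caller would first confirm that detection is possible, as the precondition above requires, and only then filter the returned list by status.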
std::string getDeviceInformationString | ( | const DeviceInformation & | deviceInfo | ) |
Formats and returns a device information string for a given GPU.
Given an index directly into the array of available GPUs, returns a formatted info string for the respective GPU which includes ID, name, compute capability, and detection status.
[in] | deviceInfo | Information on the device to be described. |
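As a rough illustration of what such a formatter does, the sketch below builds a one-line description from an ID, a name, and a detection status. The struct, the `describeDevice()` name, and the output layout are assumptions made for this example; the actual string produced by getDeviceInformationString() is not specified here and may differ.

```cpp
#include <sstream>
#include <string>

// Simplified stand-ins for illustration only, NOT the GROMACS types.
enum class DeviceStatus
{
    Compatible,
    Incompatible
};

struct DeviceInformation
{
    int          id;
    std::string  name;
    DeviceStatus status;
};

// Hypothetical formatter: the field order and labels are invented here.
std::string describeDevice(const DeviceInformation& deviceInfo)
{
    const bool         isCompatible = (deviceInfo.status == DeviceStatus::Compatible);
    std::ostringstream out;
    out << "#" << deviceInfo.id << ": name: " << deviceInfo.name
        << ", stat: " << (isCompatible ? "compatible" : "incompatible");
    return out.str();
}
```

Taking the device by const reference, as the documented signature does, keeps the formatter free of side effects on the device list.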
bool isDeviceDetectionFunctional | ( | std::string * | errorMessage | ) |
Return whether GPU detection is functioning correctly.
Returns true when this is a build of GROMACS configured to support GPU usage, and a valid device driver, ICD, and/or runtime was detected.
This function is not intended to be called from build configurations that do not support GPUs, and there will be no descriptive message in that case.
[out] | errorMessage | When returning false on a build configured with GPU support and non-nullptr was passed, the string contains a descriptive message about why GPUs cannot be detected. |
Does not throw.
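The optional out-parameter convention described for errorMessage can be sketched as follows. `detectionFunctionalStub()` and its `driverFound` argument are hypothetical; the flag stands in for the real driver, ICD, and runtime probing, and the message text is invented for the example.

```cpp
#include <string>

// Hypothetical stand-in: `driverFound` replaces the real probing logic.
bool detectionFunctionalStub(bool driverFound, std::string* errorMessage)
{
    if (driverFound)
    {
        return true;
    }
    // Only write the message when the caller asked for one.
    if (errorMessage != nullptr)
    {
        *errorMessage = "No valid device driver, ICD, or runtime was detected.";
    }
    return false;
}
```

A caller that only needs the yes/no answer can pass `nullptr`, while a caller that wants to report the reason passes the address of a local `std::string`.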
void releaseDevice | ( | ) |
Releases the GPU device used by the active context at the time of calling.
With CUDA, the device is reset and therefore all data uploaded to the GPU is lost. This must only be called when none of this data is required anymore, because subsequent attempts to free memory associated with the context will otherwise fail. Calls gmx_warning upon errors.
With other GPU SDKs, does nothing.
Should only be called after setActiveDevice was called.
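One way to make the required ordering (activate first, release last) hard to get wrong is a small RAII guard. This is only a sketch of that idea, not something the file is stated to provide: `setActiveDeviceStub()`, `releaseDeviceStub()`, `g_deviceActive`, and `ActiveDeviceGuard` are all hypothetical names that merely track a flag.

```cpp
#include <cassert>

namespace sketch
{

// Tracks whether a device is currently active (stand-in for real GPU state).
bool g_deviceActive = false;

// Hypothetical stand-in for setActiveDevice().
void setActiveDeviceStub()
{
    g_deviceActive = true;
}

// Hypothetical stand-in for releaseDevice(); checks the documented precondition.
void releaseDeviceStub()
{
    assert(g_deviceActive && "release must follow activation");
    g_deviceActive = false;
}

// Activates on construction and releases on destruction, so the release
// cannot happen before activation or be forgotten on early return.
class ActiveDeviceGuard
{
public:
    ActiveDeviceGuard() { setActiveDeviceStub(); }
    ~ActiveDeviceGuard() { releaseDeviceStub(); }
    ActiveDeviceGuard(const ActiveDeviceGuard&) = delete;
    ActiveDeviceGuard& operator=(const ActiveDeviceGuard&) = delete;
};

} // namespace sketch
```

With such a guard, any GPU data owned inside the guarded scope would need to be freed before the scope ends, which matches the warning above about data being lost on release.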
void setActiveDevice | ( | const DeviceInformation & | deviceInfo | ) |
Set the active GPU.
This makes the device described by the passed device information the active one. This is essential in CUDA, where the device buffers and kernel launches are not connected to the device context. In OpenCL, it checks the device vendor and makes vendor-specific performance adjustments.
[in] | deviceInfo | Information on the device to be set. |
Issues a fatal error for any critical errors that occur during initialization.
void warnWhenDeviceNotTargeted | ( | const gmx::MDLogger & | mdlog, const DeviceInformation & | deviceInfo | ) |
Warn to the logger when the detected device was not one of the targets selected at configure time for compilation.
[in] | mdlog | Logger |
[in] | deviceInfo | The device to potentially warn about |