Gromacs
2026.0-dev-20241204-d69d709
#include "gmxpre.h"
#include <hip/hip_profile.h>
#include "gromacs/gpu_utils/hiputils.h"
#include "gromacs/hardware/device_information.h"
#include "gromacs/utility/logger.h"
#include "gromacs/utility/stringutil.h"
#include "gpu_utils.h"
Define functions for detection and initialization for HIP devices.
Functions | |
bool | isHostMemoryPinned (const void *h_ptr) |
Tells whether the host buffer was pinned for non-blocking transfers. Only implemented for CUDA. | |
void | startGpuProfiler () |
Starts the GPU profiler if mdrun is being profiled. | |
void | stopGpuProfiler () |
Stops the CUDA profiler if mdrun is being profiled. | |
void | resetGpuProfiler () |
Resets the GPU profiler if mdrun is being profiled. | |
static void | peerAccessCheckStat (const hipError_t stat, const int gpuA, const int gpuB, const gmx::MDLogger &mdlog, const char *hipCallName) |
Check and act on status returned from peer access HIP call. | |
void | setupGpuDevicePeerAccess (gmx::ArrayRef< const int > gpuIdsToUse, const gmx::MDLogger &mdlog) |
Enable peer access between GPUs where supported. | |
void | checkPendingDeviceErrorBetweenSteps () |
Check for API errors to avoid propagating these across e.g. MD steps. | |
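static void peerAccessCheckStat | ( | const hipError_t | stat, |
const int | gpuA, | ||
const int | gpuB, | ||
const gmx::MDLogger & | mdlog, | ||
const char * | hipCallName | ||
) |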
Check and act on status returned from peer access HIP call.
If the status is "hipSuccess", we continue. If it is "hipErrorPeerAccessAlreadyEnabled", peer access has already been enabled, so we ignore it. If it is "hipErrorInvalidDevice", the run is trying to access an invalid GPU, so we throw an error. If it is "hipErrorInvalidValue", there is a problem with the arguments to the HIP call, and we throw an error. These cover all expected statuses; if any other status is returned, we issue a warning and continue.
[in] | stat | HIP call return status |
[in] | gpuA | ID for GPU initiating peer access call |
[in] | gpuB | ID for remote GPU |
[in] | mdlog | Logger object |
[in] | hipCallName | name of HIP peer access call |
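The status handling described above can be sketched as a small dispatch function. This is only an illustration and not the GROMACS implementation: to stay compilable without the HIP SDK it uses a stand-in enum (PeerStatusSketch) in place of hipError_t, returns a warning string instead of writing to a gmx::MDLogger, and throws standard exceptions instead of GROMACS error types. The names PeerStatusSketch and handlePeerAccessStatusSketch are hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Stand-in for hipError_t (assumption, so the sketch builds without HIP).
// The enumerators mirror the HIP status names mentioned in the documentation.
enum class PeerStatusSketch
{
    Success,
    PeerAccessAlreadyEnabled,
    InvalidDevice,
    InvalidValue,
    Other
};

// Hypothetical helper: returns an empty string when there is nothing to report,
// a warning message for unexpected statuses, and throws for the two error cases.
std::string handlePeerAccessStatusSketch(PeerStatusSketch   stat,
                                         int                gpuA,
                                         int                gpuB,
                                         const std::string& hipCallName)
{
    const std::string pair =
            "GPU " + std::to_string(gpuA) + " -> GPU " + std::to_string(gpuB);
    switch (stat)
    {
        case PeerStatusSketch::Success:
        case PeerStatusSketch::PeerAccessAlreadyEnabled:
            // Nothing to do: the call succeeded or access was enabled earlier.
            return {};
        case PeerStatusSketch::InvalidDevice:
            throw std::runtime_error(hipCallName + ": invalid device (" + pair + ")");
        case PeerStatusSketch::InvalidValue:
            throw std::invalid_argument(hipCallName + ": invalid argument (" + pair + ")");
        default:
            // Unexpected status: report it and let the caller continue.
            return hipCallName + ": unexpected status (" + pair + "), continuing";
    }
}
```

A caller would log any non-empty return value as a warning and carry on, while the two thrown exceptions stop the run.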
void resetGpuProfiler | ( | ) |
Resets the GPU profiler if mdrun is being profiled.
When a profiler run is in progress (based on the presence of the NVPROF_ID env. var.), the profiler data is reset in order to eliminate the data collected from the preceding part of the run.
This function should typically be called at the mdrun counter reset time.
Note that this is implemented only for the CUDA API.
void setupGpuDevicePeerAccess | ( | gmx::ArrayRef< const int > | gpuIdsToUse, |
const gmx::MDLogger & | mdlog | ||
) |
Enable peer access between GPUs where supported.
[in] | gpuIdsToUse | List of GPU IDs in use |
[in] | mdlog | Logger object |
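One plausible shape for "enable peer access where supported" is a loop over ordered pairs of the GPU IDs in use. The sketch below assumes that shape and is not taken from the GROMACS source. The HIP runtime does provide hipDeviceCanAccessPeer and hipDeviceEnablePeerAccess for the query and the enable step, but here they are replaced by injected callbacks so the example runs without a GPU. The function name enablePeerAccessSketch and both callback parameters are hypothetical.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch: visit every ordered pair (gpuA, gpuB) with gpuA != gpuB,
// ask whether gpuA can access gpuB, and enable access only when it can.
// Returns the pairs for which access was enabled.
std::vector<std::pair<int, int>> enablePeerAccessSketch(
        const std::vector<int>&               gpuIdsToUse,
        const std::function<bool(int, int)>&  canAccessPeer, // stand-in for the capability query
        const std::function<void(int, int)>&  enablePeer)    // stand-in for the enable call
{
    std::vector<std::pair<int, int>> enabled;
    for (int gpuA : gpuIdsToUse)
    {
        for (int gpuB : gpuIdsToUse)
        {
            if (gpuA == gpuB)
            {
                continue; // a device does not need peer access to itself
            }
            if (canAccessPeer(gpuA, gpuB))
            {
                enablePeer(gpuA, gpuB);
                enabled.emplace_back(gpuA, gpuB);
            }
        }
    }
    return enabled;
}
```

In a real HIP build the status returned by the enable call would be passed to a check such as peerAccessCheckStat rather than ignored.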
void startGpuProfiler | ( | ) |
Starts the GPU profiler if mdrun is being profiled.
When a profiler run is in progress (based on the presence of the NVPROF_ID env. var.), the profiler is started to begin collecting data during the rest of the run (or until stopGpuProfiler is called).
Note that this is implemented only for the CUDA API.
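The profiler functions are described as acting only when a profiler run is detected through the NVPROF_ID environment variable. A minimal sketch of such a check, assuming it is a plain presence test, is shown below. The helper name isProfilerRunDetectedSketch is hypothetical and does not come from the GROMACS source.

```cpp
#include <cstdlib>

// Hypothetical helper: reports whether the NVPROF_ID environment variable is
// present. Only presence is tested; the value itself is ignored.
bool isProfilerRunDetectedSketch()
{
    return std::getenv("NVPROF_ID") != nullptr;
}
```

Start, stop, and reset calls would each be guarded by a check of this kind, so they become no-ops in ordinary runs.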
void stopGpuProfiler | ( | ) |
Stops the CUDA profiler if mdrun is being profiled.
This function can be called at cleanup when it is desirable to stop subsequent API calls from being traced/profiled, e.g. before uninitialization.
Note that this is implemented only for the CUDA API.