Gromacs
2025.0-dev-20241014-f673b97
#include "gmxpre.h"
#include "gpu_3dfft_sycl_rocfft.h"
#include <vector>
#include "gromacs/gpu_utils/device_stream.h"
#include "gromacs/gpu_utils/devicebuffer.h"
#include "gromacs/utility/enumerationhelpers.h"
#include "gromacs/utility/exceptions.h"
#include "gromacs/utility/gmxassert.h"
#include "rocfft_common_utils.h"
Implements GPU 3D FFT routines for hipSYCL via rocFFT.
To call the vendor FFT APIs on the same DeviceStream as other operations, the hipSYCL backend uses a vendor extension called "custom operations" (see hipSYCL doc/enqueue-custom-operation.md). This effectively enqueues an asynchronous host-side lambda into the same queue. The body of the lambda unpacks the runtime data structures to obtain the native handles and calls the native FFT APIs.
hipSYCL queues operate at a higher level of abstraction than HIP streams: the runtime distributes work across HIP streams to balance load. It is possible to set the HIP stream in rocfft_execution_info, but there is no guarantee that a subsequent queue item will run on the same stream, so we currently do not attempt to set the stream.
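The pattern described above can be illustrated with a toy model in standard C++. This is only a sketch of the idea, not the hipSYCL or rocFFT API: ToyQueue, ToyInteropHandle, ToyNativeStream and runToyPipeline are hypothetical names invented for this example, and alternating between two stream ids is an assumed stand-in for the runtime's load balancing. The sketch shows a host-side lambda enqueued between two kernels in the same in-order queue, and shows that the native stream it sees is chosen by the runtime at execution time.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a native backend stream (e.g. a HIP stream).
struct ToyNativeStream
{
    int id;
};

// Hypothetical stand-in for the handle a runtime passes to a custom operation.
class ToyInteropHandle
{
public:
    explicit ToyInteropHandle(ToyNativeStream stream) : stream_(stream) {}
    ToyNativeStream nativeStream() const { return stream_; }

private:
    ToyNativeStream stream_;
};

// Toy in-order queue: items run in submission order when wait() is called.
class ToyQueue
{
public:
    using HostOperation = std::function<void(const ToyInteropHandle&)>;

    // Enqueue an ordinary "kernel"; here it only records its name.
    void enqueueKernel(std::string name)
    {
        pending_.push([this, name = std::move(name)](const ToyInteropHandle&) {
            log_.push_back("kernel " + name);
        });
    }

    // Enqueue a host-side lambda that receives the runtime's native handles.
    void enqueueCustomOperation(HostOperation op) { pending_.push(std::move(op)); }

    // Drain the queue. The "runtime" picks a native stream per item
    // (assumption: it simply alternates between two streams).
    void wait()
    {
        while (!pending_.empty())
        {
            const ToyInteropHandle handle(ToyNativeStream{ nextStream_ });
            nextStream_ = (nextStream_ + 1) % 2;
            pending_.front()(handle);
            pending_.pop();
        }
    }

    void record(std::string entry) { log_.push_back(std::move(entry)); }
    const std::vector<std::string>& log() const { return log_; }

private:
    std::queue<HostOperation> pending_;
    std::vector<std::string>  log_;
    int                       nextStream_ = 0;
};

// Submit kernel, custom operation, kernel; return the execution log.
std::vector<std::string> runToyPipeline()
{
    ToyQueue queue;
    queue.enqueueKernel("spread");
    queue.enqueueCustomOperation([&queue](const ToyInteropHandle& handle) {
        // Unpack the native handle; real code would call the vendor FFT here.
        queue.record("fft on native stream " + std::to_string(handle.nativeStream().id));
    });
    queue.enqueueKernel("solve");
    queue.wait();
    return queue.log();
}
```

In this model the lambda keeps its position between the two kernels, but the stream id it observes depends on what the queue handed out to earlier items. That is why pinning a stream inside the lambda would not by itself guarantee ordering with respect to later queue items.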
Classes | |
class | gmx::Gpu3dFft::ImplSyclRocfft::Impl |
Impl class. | |
Functions | |
RocfftPlan | gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::makePlan (const std::string &descriptiveString, rocfft_transform_type transformType, const PlanSetupData &inputPlanSetupData, const PlanSetupData &outputPlanSetupData, ArrayRef< const size_t > rocfftRealGridSize, const DeviceStream &pmeStream) |
Prepare plans for the forward and reverse transformation. | |
gmx::makePlan ("complex-to-real", rocfft_transform_type_real_inverse, PlanSetupData{rocfft_array_type_hermitian_interleaved, makeComplexStrides(complexGridSizePadded), computeTotalSize(complexGridSizePadded)}, PlanSetupData{rocfft_array_type_real, makeRealStrides(realGridSizePadded), computeTotalSize(realGridSizePadded)}, std::vector< size_t >{size_t(realGridSize[ZZ]), size_t(realGridSize[YY]), size_t(realGridSize[XX])}, pmeStream) | |
gmx::GMX_RELEASE_ASSERT (performOutOfPlaceFFT,"Only out-of-place FFT is implemented in hipSYCL") | |
gmx::GMX_RELEASE_ASSERT (allocateRealGrid==false,"Grids need to be pre-allocated") | |
gmx::GMX_RELEASE_ASSERT (gridSizesInXForEachRank.size()==1 &&gridSizesInYForEachRank.size()==1,"FFT decomposition not implemented with the SYCL rocFFT backend") | |
Variables | |
gmx::pmeStream | |