Gromacs
2022-beta2
#include "gmxpre.h"
#include "gpu_3dfft_sycl_rocfft.h"
#include "gromacs/utility/enumerationhelpers.h"
#include "gromacs/utility/exceptions.h"
#include <vector>
#include "gromacs/gpu_utils/device_stream.h"
#include "gromacs/gpu_utils/devicebuffer.h"
#include "gromacs/utility/gmxassert.h"
#include "rocfft.h"
Implements GPU 3D FFT routines for hipSYCL via rocFFT.
With hipSYCL, the vendor FFT APIs are called on the same DeviceStream as other operations by using a vendor extension called "custom operations" (see hipSYCL doc/enqueue-custom-operation.md). A custom operation effectively enqueues an asynchronous host-side lambda into the same queue. The body of the lambda unpacks the runtime data structures to get the native handles and calls the native FFT APIs.
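The ordering idea behind a custom operation can be illustrated with a plain C++ analogy. This is only a sketch and does not use the hipSYCL API: InOrderQueueSketch, RuntimeBufferSketch, vendorTransformSketch and runPipelineSketch are hypothetical names invented for illustration, and the "vendor call" is a placeholder that doubles each value rather than a real FFT. It shows a host-side lambda placed in the same ordered queue as other work, unpacking a wrapper object to its native pointer before calling a function that only accepts native pointers.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for a runtime-managed buffer that hides a native pointer.
struct RuntimeBufferSketch
{
    std::vector<float> storage;
    float*             nativeHandle() { return storage.data(); }
};

// Hypothetical stand-in for a vendor library call that only accepts native pointers.
// It doubles every element; a real implementation would call the FFT library here.
inline void vendorTransformSketch(float* data, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        data[i] *= 2.0F;
    }
}

// Toy in-order queue: items run in submission order when the queue is drained.
class InOrderQueueSketch
{
public:
    void submit(std::function<void()> op) { ops_.push(std::move(op)); }
    void wait()
    {
        while (!ops_.empty())
        {
            ops_.front()();
            ops_.pop();
        }
    }

private:
    std::queue<std::function<void()>> ops_;
};

// Enqueue the vendor call as a host-side lambda between two ordinary operations.
inline void runPipelineSketch(InOrderQueueSketch& queue, RuntimeBufferSketch& grid)
{
    // Ordinary operation before the transform: fill the grid with ones.
    queue.submit([&grid] {
        for (float& v : grid.storage)
        {
            v = 1.0F;
        }
    });
    // "Custom operation": unpack the native handle, then call the vendor function.
    queue.submit([&grid] { vendorTransformSketch(grid.nativeHandle(), grid.storage.size()); });
    // Ordinary operation after the transform: add one to every element.
    queue.submit([&grid] {
        for (float& v : grid.storage)
        {
            v += 1.0F;
        }
    });
    queue.wait();
}
```

Because all three items share one ordered queue, the placeholder vendor call sees the data written by the first item, and the third item sees the transformed data.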
For a 3D FFT, rocFFT requires a working buffer, which it allocates itself if none is provided. That allocation might be slow enough to be worth optimizing. The working buffer could instead be provided in advance by calling rocfft_plan_get_work_buffer_size, allocating a buffer of that size that persists for as long as the plan is in use, and then passing it to rocfft_execution_info_set_work_buffer in a custom operation. See Issue #4153.
hipSYCL queues operate at a higher level of abstraction than HIP streams, with the runtime distributing work across the streams to balance load. It is possible to set the HIP stream in rocfft_execution_info, but there is then no guarantee that a subsequent queue item will run on the same stream, so we currently do not attempt to set the stream.
Classes | |
class | gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::RocfftInitializer |
Provides RAII-style initialization of rocFFT library. More... | |
struct | gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::RocfftPlan |
All the persistent data for planning and executing a 3D FFT. More... | |
struct | gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::PlanSetupData |
Helper struct to reduce repetitive code setting up a 3D FFT plan. More... | |
class | gmx::Gpu3dFft::ImplSyclRocfft::Impl |
Impl class. More... | |
Enumerations | |
enum | gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::FftDirection : int { RealToComplex, ComplexToReal, Count } |
Model the kinds of 3D FFT implemented. | |
Functions | |
void | gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::handleFftError (rocfft_status result, const std::string &msg) |
Helper for consistent error handling. | |
void | gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::handleFftError (rocfft_status result, const std::string &direction, const std::string &msg) |
Helper for consistent error handling. | |
std::array< size_t, DIM > | gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::makeRealStrides (ivec realGridSizePadded) |
Compute the stride through the real 1D array. | |
std::array< size_t, DIM > | gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::makeComplexStrides (ivec complexGridSizePadded) |
Compute the stride through the complex 1D array. | |
size_t | gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::computeTotalSize (ivec gridSize) |
Compute total grid size. | |
RocfftPlan | gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::makePlan (const std::string &descriptiveString, rocfft_transform_type transformType, const PlanSetupData &inputPlanSetupData, const PlanSetupData &outputPlanSetupData, ArrayRef< const size_t > rocfftRealGridSize, const DeviceStream &pmeStream) |
Prepare plans for the forward and reverse transformation. More... | |
gmx::makePlan ("complex-to-real", rocfft_transform_type_real_inverse, PlanSetupData{rocfft_array_type_hermitian_interleaved, makeComplexStrides(complexGridSizePadded), computeTotalSize(complexGridSizePadded)}, PlanSetupData{rocfft_array_type_real, makeRealStrides(realGridSizePadded), computeTotalSize(realGridSizePadded)}, std::vector< size_t >{size_t(realGridSize[ZZ]), size_t(realGridSize[YY]), size_t(realGridSize[XX])}, pmeStream) | |
realGrid_ * | gmx::realGrid ()), complexGrid_(*complexGrid->buffer_.get()),{GMX_RELEASE_ASSERT(performOutOfPlaceFFT,"Only out-of-place FFT is implemented in hipSYCL" |
gmx::GMX_RELEASE_ASSERT (allocateGrids==false,"Grids need to be pre-allocated") | |
gmx::GMX_RELEASE_ASSERT (gridSizesInXForEachRank.size()==1 &&gridSizesInYForEachRank.size()==1,"FFT decomposition not implemented with SYCL backend") | |
Variables | |
const std::array< const char *, rocfft_status_invalid_work_buffer+1 > | gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::c_rocfftErrorStrings |
Strings that match enum rocfft_status_e in rocfft.h. More... | |
gmx::pmeStream | |
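The stride and total-size helpers listed under Functions (makeRealStrides, makeComplexStrides, computeTotalSize) can be sketched as follows. This is a minimal illustration, not the GROMACS source. It assumes that Z is the fastest-varying dimension, which is consistent with the {ZZ, YY, XX} length order passed to makePlan above. IVecSketch and the function names ending in Sketch are hypothetical stand-ins for the GROMACS types and helpers.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-ins for the GROMACS dimension indices and ivec type.
constexpr int c_xx  = 0;
constexpr int c_yy  = 1;
constexpr int c_zz  = 2;
constexpr int c_dim = 3;
using IVecSketch    = std::array<int, c_dim>;

// Strides in elements through a flattened, padded 3D grid, fastest dimension
// first: stepping in Z moves 1 element, in Y moves nz elements, and in X moves
// nz * ny elements. Assumes Z is the fastest-varying dimension.
inline std::array<std::size_t, c_dim> makeStridesSketch(const IVecSketch& paddedSize)
{
    const std::size_t nz = std::size_t(paddedSize[c_zz]);
    const std::size_t ny = std::size_t(paddedSize[c_yy]);
    return { 1, nz, nz * ny };
}

// Total number of elements in a grid of the given (padded) size.
inline std::size_t computeTotalSizeSketch(const IVecSketch& gridSize)
{
    return std::size_t(gridSize[c_xx]) * std::size_t(gridSize[c_yy]) * std::size_t(gridSize[c_zz]);
}
```

For a padded grid of 4 x 5 x 8 points in (X, Y, Z), this sketch gives strides of 1, 8 and 40 elements and a total size of 160 elements. The same arithmetic applies to the real and the complex grid; only the padded sizes differ.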