Gromacs  2024.2
gpu_3dfft_sycl_rocfft.cpp File Reference
#include "gmxpre.h"
#include "gpu_3dfft_sycl_rocfft.h"
#include <vector>
#include "gromacs/gpu_utils/device_stream.h"
#include "gromacs/gpu_utils/devicebuffer.h"
#include "gromacs/utility/enumerationhelpers.h"
#include "gromacs/utility/exceptions.h"
#include "gromacs/utility/gmxassert.h"
#include "rocfft.h"

Description

Implements GPU 3D FFT routines for hipSYCL via rocFFT.

Author
Andrey Alekseenko <al42and@gmail.com>
Mark Abraham <mark.j.abraham@gmail.com>

For hipSYCL, in order to call the vendor FFT APIs using the same DeviceStream as other operations, a vendor extension called "custom operations" is used (see hipSYCL doc/enqueue-custom-operation.md). This effectively enqueues an asynchronous host-side lambda into the same queue. The body of the lambda unpacks the runtime data structures to get the native handles and calls the native FFT APIs.

hipSYCL queues operate at a higher level of abstraction than HIP streams, with the runtime distributing work to the latter to balance load. It is possible to set the HIP stream in rocfft_execution_info, but then there is no guarantee that a subsequent queue item will run using the same stream, so we currently do not attempt to set the stream.

Classes

class  gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::RocfftInitializer
 Provides RAII-style initialization of rocFFT library. More...
 
struct  gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::RocfftPlan
 All the persistent data for planning and executing a 3D FFT. More...
 
struct  gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::PlanSetupData
 Helper struct to reduce repetitive code setting up a 3D FFT plan. More...
 
class  gmx::Gpu3dFft::ImplSyclRocfft::Impl
 Impl class. More...
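
The RAII-style initialization mentioned for RocfftInitializer can be illustrated with a minimal sketch. It assumes the usual pattern of pairing a library setup call in the constructor with a cleanup call in the destructor (rocFFT provides rocfft_setup() and rocfft_cleanup() for this purpose). The names fake::setup, fake::cleanup, fake::g_live and LibraryInitializer are hypothetical stand-ins so the example compiles without rocFFT; this is not the GROMACS implementation.

```cpp
#include <stdexcept>

// Hypothetical stand-ins for rocfft_setup() / rocfft_cleanup(), so that the
// sketch builds without rocFFT. g_live counts active initializations.
namespace fake
{
int g_live = 0;
int setup()
{
    ++g_live;
    return 0; // 0 plays the role of a "success" status
}
int cleanup()
{
    --g_live;
    return 0;
}
} // namespace fake

// RAII wrapper: the library is set up while an object of this type is alive
// and cleaned up when it is destroyed.
class LibraryInitializer
{
public:
    LibraryInitializer()
    {
        if (fake::setup() != 0)
        {
            throw std::runtime_error("FFT library setup failed");
        }
    }
    ~LibraryInitializer() { fake::cleanup(); }
    // Copying would lead to cleanup being called twice, so forbid it.
    LibraryInitializer(const LibraryInitializer&) = delete;
    LibraryInitializer& operator=(const LibraryInitializer&) = delete;
};
```

Holding such an object as a member of the FFT implementation class ties the library lifetime to the lifetime of the plans that depend on it.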
 

Enumerations

enum  gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::FftDirection : int { RealToComplex, ComplexToReal, Count }
 Model the kinds of 3D FFT implemented.
 

Functions

void gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::handleFftError (rocfft_status result, const std::string &msg)
 Helper for consistent error handling.
 
void gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::handleFftError (rocfft_status result, const std::string &direction, const std::string &msg)
 Helper for consistent error handling.
 
std::array< size_t, DIM > gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::makeRealStrides (ivec realGridSizePadded)
 Compute the stride through the real 1D array.
 
std::array< size_t, DIM > gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::makeComplexStrides (ivec complexGridSizePadded)
 Compute the stride through the complex 1D array.
 
size_t gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::computeTotalSize (ivec gridSize)
 Compute total grid size.
 
RocfftPlan gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::makePlan (const std::string &descriptiveString, rocfft_transform_type transformType, const PlanSetupData &inputPlanSetupData, const PlanSetupData &outputPlanSetupData, ArrayRef< const size_t > rocfftRealGridSize, const DeviceStream &pmeStream)
 Prepare plans for the forward and reverse transformation. More...
 
 gmx::makePlan ("complex-to-real", rocfft_transform_type_real_inverse, PlanSetupData{rocfft_array_type_hermitian_interleaved, makeComplexStrides(complexGridSizePadded), computeTotalSize(complexGridSizePadded)}, PlanSetupData{rocfft_array_type_real, makeRealStrides(realGridSizePadded), computeTotalSize(realGridSizePadded)}, std::vector< size_t >{size_t(realGridSize[ZZ]), size_t(realGridSize[YY]), size_t(realGridSize[XX])}, pmeStream)
 
 gmx::GMX_RELEASE_ASSERT (performOutOfPlaceFFT,"Only out-of-place FFT is implemented in hipSYCL")
 
 gmx::GMX_RELEASE_ASSERT (allocateRealGrid==false,"Grids need to be pre-allocated")
 
 gmx::GMX_RELEASE_ASSERT (gridSizesInXForEachRank.size()==1 &&gridSizesInYForEachRank.size()==1,"FFT decomposition not implemented with the SYCL rocFFT backend")
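
As an illustration of what the stride and total-size helpers listed above compute, here is a minimal sketch. It assumes the padded grid is stored as a flat array in XYZ order with Z varying fastest, which is consistent with the makePlan call passing the lengths as {ZZ, YY, XX}. GridSize, sketchStrides and sketchTotalSize are hypothetical names, with GridSize standing in for the GROMACS ivec type; the actual helpers may differ.

```cpp
#include <array>
#include <cstddef>

// Dimension indices, mirroring the GROMACS XX/YY/ZZ convention.
constexpr int XX = 0, YY = 1, ZZ = 2;

// Hypothetical stand-in for ivec.
using GridSize = std::array<int, 3>;

// Strides (in elements) through the flat array, listed fastest dimension
// first: Z is contiguous, a step in Y skips one Z row, and a step in X
// skips one full YZ plane.
std::array<std::size_t, 3> sketchStrides(const GridSize& padded)
{
    const std::size_t nz = static_cast<std::size_t>(padded[ZZ]);
    const std::size_t ny = static_cast<std::size_t>(padded[YY]);
    return { 1, nz, nz * ny };
}

// Total number of elements in the (padded) grid.
std::size_t sketchTotalSize(const GridSize& grid)
{
    return static_cast<std::size_t>(grid[XX]) * static_cast<std::size_t>(grid[YY])
           * static_cast<std::size_t>(grid[ZZ]);
}
```

For a padded grid of 4 x 5 x 6 cells under these assumptions, the strides are {1, 6, 30} and the total size is 120 elements.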
 

Variables

const std::array< const char *, rocfft_status_invalid_work_buffer+1 > gmx::anonymous_namespace{gpu_3dfft_sycl_rocfft.cpp}::c_rocfftErrorStrings
 Strings that match enum rocfft_status_e in rocfft.h. More...
 
 gmx::pmeStream
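
The relationship between the error-string table and the handleFftError helpers can be sketched as follows. This is a simplified model: Status is a hypothetical three-value stand-in for rocfft_status, the strings are invented for illustration, and std::runtime_error stands in for whatever exception type GROMACS actually throws. Sizing the array by the last enumerator plus one, as c_rocfftErrorStrings does, lets a status value be used directly as an index.

```cpp
#include <array>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for rocfft_status with only three values.
enum Status : int
{
    Success           = 0,
    Failure           = 1,
    InvalidWorkBuffer = 2
};

// One string per enumerator; the size is tied to the last enumerator so the
// table and the enum are less likely to drift apart unnoticed.
const std::array<const char*, InvalidWorkBuffer + 1> c_errorStrings = {
    "success", "failure", "invalid work buffer"
};

// Throws with a descriptive message unless the call succeeded.
void handleError(Status result, const std::string& msg)
{
    if (result != Success)
    {
        throw std::runtime_error(msg + ": " + c_errorStrings[result]);
    }
}
```

Wrapping every library call in such a helper keeps the error reporting consistent across plan creation and execution.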