Gromacs  2025-dev-20240614-602a366
#include "gromacs/gpu_utils/devicebuffer_datatype.h"
[Include dependency graph for pme_gpu_grid.h]
[Graph of files that directly or indirectly include this file]

Description

Implements PME halo exchange and PME-FFT grid conversion functions.

Author
Gaurav Garg <gaugarg@nvidia.com>

Functions

void pmeGpuGridHaloExchange (const PmeGpu *pmeGpu, gmx_wallcycle *wcycle)
 Grid halo exchange after PME spread. Todo: the current implementation transfers the halo region from/to immediate neighbours only and expects that overlapSize <= local grid width; implement exchange with multiple neighbours to remove this limitation. Todo: the current implementation synchronizes pmeStream to make sure data is ready on the GPU after spread; consider using events for this synchronization.
 
void pmeGpuGridHaloExchangeReverse (const PmeGpu *pmeGpu, gmx_wallcycle *wcycle)
 Grid reverse halo exchange before PME gather. Todo: the current implementation transfers the halo region from/to immediate neighbours only and expects that overlapSize <= local grid width; implement exchange with multiple neighbours to remove this limitation. Todo: the current implementation synchronizes pmeStream to make sure data is ready on the GPU after the FFT-to-PME grid conversion; consider using events for this synchronization.
 
template<bool pmetofft>
void convertPmeGridToFftGrid (const PmeGpu *pmeGpu, float *h_fftRealGrid, gmx_parallel_3dfft *fftSetup, int gridIndex)
 Copy the PME grid with its overlap region to the host FFT grid, and vice versa. Used in mixed-mode PME decomposition.
 
template<bool pmetofft>
void convertPmeGridToFftGrid (const PmeGpu *pmeGpu, DeviceBuffer< float > *d_fftRealGrid, int gridIndex)
 Copy the PME grid with its overlap region to the device FFT grid, and vice versa. Used in full-GPU PME decomposition.
 
template void convertPmeGridToFftGrid< true > (const PmeGpu *, float *, gmx_parallel_3dfft *, int)
 
template void convertPmeGridToFftGrid< false > (const PmeGpu *, float *, gmx_parallel_3dfft *, int)
 
template void convertPmeGridToFftGrid< true > (const PmeGpu *, DeviceBuffer< float > *, int)
 
template void convertPmeGridToFftGrid< false > (const PmeGpu *, DeviceBuffer< float > *, int)
 

Function Documentation

template<bool pmetofft>
void convertPmeGridToFftGrid ( const PmeGpu *  pmeGpu,
float *  h_fftRealGrid,
gmx_parallel_3dfft *  fftSetup,
int  gridIndex 
)

Copy the PME grid with its overlap region to the host FFT grid, and vice versa. Used in mixed-mode PME decomposition.

Parameters
    [in]  pmeGpu         The PME GPU structure.
    [in]  h_fftRealGrid  FFT grid on host.
    [in]  fftSetup       Host FFT setup structure.
    [in]  gridIndex      Index of the grid to be converted.

Template Parameters
    pmeToFft  Boolean selecting the direction: conversion from the PME grid to the FFT grid, or the reverse (spelled pmetofft in the signature).

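The direction-selecting template parameter can be illustrated with a small host-only sketch. This is not the GROMACS implementation: the `Grid3D` type, the function name `convertGridSketch`, and the assumption that the PME grid is simply a larger row-major array whose leading region matches the FFT grid are all hypothetical and chosen only to show the pattern of one templated routine serving both copy directions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not a GROMACS type): dense row-major 3D grid.
struct Grid3D
{
    int                size[3];
    std::vector<float> data;

    Grid3D(int nx, int ny, int nz) :
        size{ nx, ny, nz }, data(static_cast<std::size_t>(nx) * ny * nz, 0.0F)
    {
    }

    std::size_t index(int x, int y, int z) const
    {
        return (static_cast<std::size_t>(x) * size[1] + y) * size[2] + z;
    }
};

// Copies the region shared by both grids. Assumption of this sketch: the PME
// grid is at least as large as the FFT grid in every dimension, and the
// extra cells are the overlap region, which is left untouched.
template<bool pmeToFft>
void convertGridSketch(Grid3D& pmeGrid, Grid3D& fftGrid)
{
    for (int d = 0; d < 3; d++)
    {
        assert(pmeGrid.size[d] >= fftGrid.size[d]);
    }
    for (int x = 0; x < fftGrid.size[0]; x++)
    {
        for (int y = 0; y < fftGrid.size[1]; y++)
        {
            for (int z = 0; z < fftGrid.size[2]; z++)
            {
                float& pmeValue = pmeGrid.data[pmeGrid.index(x, y, z)];
                float& fftValue = fftGrid.data[fftGrid.index(x, y, z)];
                if (pmeToFft)
                {
                    fftValue = pmeValue; // PME grid -> FFT grid
                }
                else
                {
                    pmeValue = fftValue; // FFT grid -> PME grid
                }
            }
        }
    }
}
```

Calling `convertGridSketch<true>(pme, fft)` fills the FFT-sized grid from the PME grid, and `convertGridSketch<false>(pme, fft)` writes it back. Because the direction is a compile-time constant, the branch in the inner loop can be resolved by the compiler, which is one plausible reason for using a template parameter rather than a runtime flag.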
template<bool pmetofft>
void convertPmeGridToFftGrid ( const PmeGpu *  pmeGpu,
DeviceBuffer< float > *  d_fftRealGrid,
int  gridIndex 
)

Copy the PME grid with its overlap region to the device FFT grid, and vice versa. Used in full-GPU PME decomposition.

Parameters
    [in]  pmeGpu         The PME GPU structure.
    [in]  d_fftRealGrid  FFT grid on device.
    [in]  gridIndex      Index of the grid to be converted.

Template Parameters
    pmeToFft  Boolean selecting the direction: conversion from the PME grid to the FFT grid, or the reverse (spelled pmetofft in the signature).

void pmeGpuGridHaloExchange ( const PmeGpu *  pmeGpu,
gmx_wallcycle *  wcycle 
)

Grid halo exchange after PME spread.

Todo
    The current implementation transfers the halo region from/to immediate neighbours only, and expects that overlapSize <= local grid width. Implement exchange with multiple neighbours to remove this limitation.
Todo
    The current implementation synchronizes pmeStream to make sure data is ready on the GPU after spread. Consider using events for this synchronization.

Parameters
    [in]  pmeGpu  The PME GPU structure.
    [in]  wcycle  The wallclock counter.

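To make the immediate-neighbour restriction concrete, here is a minimal single-process sketch of a forward halo exchange in one dimension. It rests on several assumptions that are not taken from the GROMACS source: ranks are modelled as entries of a `std::vector`, each rank's grid stores `localWidth` interior cells followed by `overlapSize` halo cells on the upper side only, the ranks are periodic, and spread contributions in the halo are summed into the next rank's interior. The function name `haloExchangeForwardSketch` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each inner vector is one rank's 1D grid: localWidth interior cells followed
// by overlapSize halo cells (hypothetical layout chosen for this sketch).
void haloExchangeForwardSketch(std::vector<std::vector<float>>& rankGrids,
                               int                              localWidth,
                               int                              overlapSize)
{
    // Only the immediate neighbour receives data, so the halo has to fit
    // into that neighbour's interior.
    assert(overlapSize <= localWidth);

    const std::size_t numRanks = rankGrids.size();
    for (std::size_t rank = 0; rank < numRanks; rank++)
    {
        const std::vector<float>& sender   = rankGrids[rank];
        std::vector<float>&       receiver = rankGrids[(rank + 1) % numRanks];
        for (int i = 0; i < overlapSize; i++)
        {
            // Values spread into the sender's halo are accumulated into the
            // first interior cells of the neighbouring rank.
            receiver[i] += sender[localWidth + i];
        }
    }
}
```

If `overlapSize` exceeded `localWidth`, part of the halo would belong to the neighbour's neighbour, which is the multi-neighbour case the Todo above describes as not yet handled.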
void pmeGpuGridHaloExchangeReverse ( const PmeGpu *  pmeGpu,
gmx_wallcycle *  wcycle 
)

Grid reverse halo exchange before PME gather.

Todo
    The current implementation transfers the halo region from/to immediate neighbours only, and expects that overlapSize <= local grid width. Implement exchange with multiple neighbours to remove this limitation.
Todo
    The current implementation synchronizes pmeStream to make sure data is ready on the GPU after the FFT-to-PME grid conversion. Consider using events for this synchronization.

Parameters
    [in]  pmeGpu  The PME GPU structure.
    [in]  wcycle  The wallclock counter.
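The reverse direction can be sketched in the same simplified setting. Again, this is an illustration under assumptions rather than the GROMACS code: ranks are entries of a `std::vector`, each grid holds `localWidth` interior cells followed by `overlapSize` halo cells on the upper side, the ranks are periodic, and the reverse exchange is modelled as a plain copy of the neighbour's interior values into the local halo so that a subsequent gather can read them. The function name `haloExchangeReverseSketch` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each inner vector is one rank's 1D grid: localWidth interior cells followed
// by overlapSize halo cells (hypothetical layout chosen for this sketch).
void haloExchangeReverseSketch(std::vector<std::vector<float>>& rankGrids,
                               int                              localWidth,
                               int                              overlapSize)
{
    // The halo is filled from a single neighbour, so it cannot be wider
    // than that neighbour's interior.
    assert(overlapSize <= localWidth);

    const std::size_t numRanks = rankGrids.size();
    for (std::size_t rank = 0; rank < numRanks; rank++)
    {
        const std::vector<float>& owner    = rankGrids[(rank + 1) % numRanks];
        std::vector<float>&       receiver = rankGrids[rank];
        for (int i = 0; i < overlapSize; i++)
        {
            // Interior values owned by the neighbour overwrite the local
            // halo; unlike the forward exchange, nothing is accumulated.
            receiver[localWidth + i] = owner[i];
        }
    }
}
```

The contrast with the forward exchange is the point of the sketch: after spread the halo carries partial sums that must be reduced onto the owning rank, while before gather the halo only needs a read-only copy of data the neighbour already owns.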