Implements PME halo exchange and PME-FFT grid conversion functions.
- Author
- Gaurav Garg gaugarg@nvidia.com
|
void | pmeGpuGridHaloExchange (const PmeGpu *pmeGpu, gmx_wallcycle *wcycle) |
| Grid halo exchange after PME spread. TODO: The current implementation transfers the halo region only to and from immediate neighbours and expects overlapSize <= local grid width; implement exchange with multiple neighbours to remove this limitation. TODO: The current implementation synchronizes pmeStream to make sure data is ready on the GPU after spread; consider using events for this synchronization.
|
|
void | pmeGpuGridHaloExchangeReverse (const PmeGpu *pmeGpu, gmx_wallcycle *wcycle) |
| Reverse grid halo exchange before PME gather. TODO: The current implementation transfers the halo region only to and from immediate neighbours and expects overlapSize <= local grid width; implement exchange with multiple neighbours to remove this limitation. TODO: The current implementation synchronizes pmeStream to make sure data is ready on the GPU after FFT to PME grid conversion; consider using events for this synchronization.
|
|
template<bool pmetofft> |
void | convertPmeGridToFftGrid (const PmeGpu *pmeGpu, float *h_fftRealGrid, gmx_parallel_3dfft *fftSetup, int gridIndex) |
| Copy the PME grid with its overlap region to the host FFT grid and vice versa. Used in mixed mode PME decomposition.
|
|
template<bool pmetofft> |
void | convertPmeGridToFftGrid (const PmeGpu *pmeGpu, DeviceBuffer< float > *d_fftRealGrid, int gridIndex) |
| Copy the PME grid with its overlap region to the device FFT grid and vice versa. Used in full GPU PME decomposition.
|
|
template void | convertPmeGridToFftGrid< true > (const PmeGpu *, float *, gmx_parallel_3dfft *, int) |
|
template void | convertPmeGridToFftGrid< false > (const PmeGpu *, float *, gmx_parallel_3dfft *, int) |
|
template void | convertPmeGridToFftGrid< true > (const PmeGpu *, DeviceBuffer< float > *, int) |
|
template void | convertPmeGridToFftGrid< false > (const PmeGpu *, DeviceBuffer< float > *, int) |
|
template<bool pmetofft>
void convertPmeGridToFftGrid (const PmeGpu *pmeGpu, float *h_fftRealGrid, gmx_parallel_3dfft *fftSetup, int gridIndex)
Copy the PME grid with its overlap region to the host FFT grid and vice versa. Used in mixed mode PME decomposition.
- Parameters
-
[in] | pmeGpu | The PME GPU structure. |
[in] | h_fftRealGrid | FFT grid on host |
[in] | fftSetup | Host FFT setup structure |
[in] | gridIndex | Grid index which is to be converted |
- Template Parameters
-
pmeToFft | Boolean selecting the conversion direction: true converts from the PME grid to the FFT grid, false converts from the FFT grid back to the PME grid |
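
The compile-time direction switch and the explicit instantiations listed above can be illustrated with a small host-only sketch. It is not the GROMACS implementation: `convertGridSketch` is a hypothetical name, the grids are reduced to one dimension, and the layout (interior cells followed by `overlap` halo cells in the PME grid, interior cells only in the FFT grid) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1D stand-in (assumed layout, not the GROMACS one):
// pmeGrid holds the interior cells followed by `overlap` halo cells,
// fftGrid holds the interior cells only.
template<bool pmeToFft>
void convertGridSketch(std::vector<float>& pmeGrid, std::vector<float>& fftGrid, std::size_t overlap)
{
    assert(pmeGrid.size() == fftGrid.size() + overlap);
    for (std::size_t i = 0; i < fftGrid.size(); ++i)
    {
        if (pmeToFft)
        {
            // Forward direction: copy the interior, dropping the overlap cells.
            fftGrid[i] = pmeGrid[i];
        }
        else
        {
            // Reverse direction: refill the interior, leaving the overlap cells untouched.
            pmeGrid[i] = fftGrid[i];
        }
    }
}

// Explicit instantiations of both directions, mirroring the pattern in the member list.
template void convertGridSketch<true>(std::vector<float>&, std::vector<float>&, std::size_t);
template void convertGridSketch<false>(std::vector<float>&, std::vector<float>&, std::size_t);
```

Because the direction is a template argument, the branch is resolved at compile time and each instantiation contains only one copy direction. Calling `convertGridSketch<true>(pme, fft, 2)` on a five-cell PME grid and a three-cell FFT grid copies the three interior cells and ignores the two halo cells.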
template<bool pmetofft>
void convertPmeGridToFftGrid (const PmeGpu *pmeGpu, DeviceBuffer< float > *d_fftRealGrid, int gridIndex)
Copy the PME grid with its overlap region to the device FFT grid and vice versa. Used in full GPU PME decomposition.
- Parameters
-
[in] | pmeGpu | The PME GPU structure. |
[in] | d_fftRealGrid | FFT grid on device |
[in] | gridIndex | Grid index which is to be converted |
- Template Parameters
-
pmeToFft | Boolean selecting the conversion direction: true converts from the PME grid to the FFT grid, false converts from the FFT grid back to the PME grid |
void pmeGpuGridHaloExchange (const PmeGpu *pmeGpu, gmx_wallcycle *wcycle)
Grid halo exchange after PME spread. TODO: The current implementation transfers the halo region only to and from immediate neighbours and expects overlapSize <= local grid width; implement exchange with multiple neighbours to remove this limitation. TODO: The current implementation synchronizes pmeStream to make sure data is ready on the GPU after spread; consider using events for this synchronization.
- Parameters
-
[in] | pmeGpu | The PME GPU structure. |
[in] | wcycle | The wallclock counter. |
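
The immediate-neighbour restriction and the overlapSize <= local grid width requirement can be made concrete with a host-only sketch. It is a simplification under stated assumptions, not the GROMACS code: ranks are simulated as vectors in a 1D ring, there is no MPI or GPU stream, the halo is assumed to sit on one side only, and `LocalGrid` and `haloExchangeForwardSketch` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout: each simulated rank owns `width` interior cells
// followed by `overlap` halo cells that spill into the next rank's domain.
using LocalGrid = std::vector<float>;

void haloExchangeForwardSketch(std::vector<LocalGrid>& ranks, std::size_t width, std::size_t overlap)
{
    // With only immediate neighbours, the halo must fit inside one neighbour's interior.
    assert(overlap <= width);
    const std::size_t numRanks = ranks.size();

    // Pack every halo first, as if filling send buffers before communication starts.
    std::vector<std::vector<float>> sendBuffers(numRanks);
    for (std::size_t r = 0; r < numRanks; ++r)
    {
        assert(ranks[r].size() == width + overlap);
        sendBuffers[r].assign(ranks[r].begin() + static_cast<std::ptrdiff_t>(width), ranks[r].end());
    }

    // Accumulate each packed halo into the leading interior cells of the next rank in the ring.
    for (std::size_t r = 0; r < numRanks; ++r)
    {
        LocalGrid& next = ranks[(r + 1) % numRanks];
        for (std::size_t i = 0; i < overlap; ++i)
        {
            next[i] += sendBuffers[r][i];
        }
    }
}
```

If `overlap` exceeded `width`, part of a halo would belong to a rank two steps away, which is the case the TODO about multiple neighbours refers to.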
void pmeGpuGridHaloExchangeReverse (const PmeGpu *pmeGpu, gmx_wallcycle *wcycle)
Reverse grid halo exchange before PME gather. TODO: The current implementation transfers the halo region only to and from immediate neighbours and expects overlapSize <= local grid width; implement exchange with multiple neighbours to remove this limitation. TODO: The current implementation synchronizes pmeStream to make sure data is ready on the GPU after FFT to PME grid conversion; consider using events for this synchronization.
- Parameters
-
[in] | pmeGpu | The PME GPU structure. |
[in] | wcycle | The wallclock counter. |
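
A reverse exchange can be sketched in the same simplified setting. The assumption here is that each rank refills its halo cells with copies of the neighbour's interior values so that gather can read them locally. As before, this is a 1D host-only illustration with simulated ranks, and `LocalGrid` and `haloExchangeReverseSketch` are hypothetical names rather than GROMACS API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout: `width` interior cells followed by `overlap` halo cells per simulated rank.
using LocalGrid = std::vector<float>;

void haloExchangeReverseSketch(std::vector<LocalGrid>& ranks, std::size_t width, std::size_t overlap)
{
    // Same immediate-neighbour limitation as in the forward direction.
    assert(overlap <= width);
    const std::size_t numRanks = ranks.size();
    for (std::size_t r = 0; r < numRanks; ++r)
    {
        assert(ranks[r].size() == width + overlap);
        const LocalGrid& next = ranks[(r + 1) % numRanks];
        // Copy (not accumulate) the neighbour's leading interior cells into this rank's halo.
        // Reads touch interiors only and writes touch halos only, so no staging buffer is needed.
        for (std::size_t i = 0; i < overlap; ++i)
        {
            ranks[r][width + i] = next[i];
        }
    }
}
```

The forward direction sums contributions while this direction overwrites, which is why the two exchanges are documented as separate functions.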