Implements PME halo exchange and PME-FFT grid conversion functions.
- Author
- Gaurav Garg gauga.nosp@m.rg@n.nosp@m.vidia.nosp@m..com
|
| void | pmeGpuGridHaloExchange (const PmeGpu *pmeGpu, gmx_wallcycle *wcycle) |
| | Grid Halo exchange after PME spread ToDo: Current implementation transfers halo region from/to only immediate neighbours And, expects that overlapSize <= local grid width. Implement exchange with multiple neighbors to remove this limitation ToDo: Current implementation synchronizes pmeStream to make sure data is ready on GPU after spread. Consider using events for this synchnozation. More...
|
| |
| void | pmeGpuGridHaloExchangeReverse (const PmeGpu *pmeGpu, gmx_wallcycle *wcycle) |
| | Grid reverse Halo exchange before PME gather ToDo: Current implementation transfers halo region from/to only immediate neighbours And, expects that overlapSize <= local grid width. Implement exchange with multiple neighbors to remove this limitation ToDo: Current implementation synchronizes pmeStream to make sure data is ready on GPU after FFT to PME grid conversion. Consider using events for this synchnozation. More...
|
| |
| template<bool pmetofft> |
| void | convertPmeGridToFftGrid (const PmeGpu *pmeGpu, float *h_fftRealGrid, gmx_parallel_3dfft *fftSetup, int gridIndex) |
| | Copy PME Grid with overlap region to host FFT grid and vice-versa. Used in mixed mode PME decomposition. More...
|
| |
| template<bool pmetofft> |
| void | convertPmeGridToFftGrid (const PmeGpu *pmeGpu, DeviceBuffer< float > *d_fftRealGrid, int gridIndex) |
| | Copy PME Grid with overlap region to device FFT grid and vice-versa. Used in full GPU PME decomposition. More...
|
| |
|
template void | convertPmeGridToFftGrid< true > (const PmeGpu *, float *, gmx_parallel_3dfft *, int) |
| |
|
template void | convertPmeGridToFftGrid< false > (const PmeGpu *, float *, gmx_parallel_3dfft *, int) |
| |
|
template void | convertPmeGridToFftGrid< true > (const PmeGpu *, DeviceBuffer< float > *, int) |
| |
|
template void | convertPmeGridToFftGrid< false > (const PmeGpu *, DeviceBuffer< float > *, int) |
| |
template<bool pmetofft>
| void convertPmeGridToFftGrid |
( |
const PmeGpu * |
pmeGpu, |
|
|
float * |
h_fftRealGrid, |
|
|
gmx_parallel_3dfft * |
fftSetup, |
|
|
int |
gridIndex |
|
) |
| |
Copy PME Grid with overlap region to host FFT grid and vice-versa. Used in mixed mode PME decomposition.
- Parameters
-
| [in] | pmeGpu | The PME GPU structure. |
| [in] | h_fftRealGrid | FFT grid on host |
| [in] | fftSetup | Host FFT setup structure |
| [in] | gridIndex | Grid index which is to be converted |
- Template Parameters
-
| pmeToFft | A boolean which tells if this is conversion from PME grid to FFT grid or reverse |
template<bool pmetofft>
| void convertPmeGridToFftGrid |
( |
const PmeGpu * |
pmeGpu, |
|
|
DeviceBuffer< float > * |
d_fftRealGrid, |
|
|
int |
gridIndex |
|
) |
| |
Copy PME Grid with overlap region to device FFT grid and vice-versa. Used in full GPU PME decomposition.
- Parameters
-
| [in] | pmeGpu | The PME GPU structure. |
| [in] | d_fftRealGrid | FFT grid on device |
| [in] | gridIndex | Grid index which is to be converted |
- Template Parameters
-
| pmeToFft | A boolean which tells if this is conversion from PME grid to FFT grid or reverse |
| void pmeGpuGridHaloExchange |
( |
const PmeGpu * |
pmeGpu, |
|
|
gmx_wallcycle * |
wcycle |
|
) |
| |
Grid Halo exchange after PME spread ToDo: Current implementation transfers halo region from/to only immediate neighbours And, expects that overlapSize <= local grid width. Implement exchange with multiple neighbors to remove this limitation ToDo: Current implementation synchronizes pmeStream to make sure data is ready on GPU after spread. Consider using events for this synchnozation.
- Parameters
-
| [in] | pmeGpu | The PME GPU structure. |
| [in] | wcycle | The wallclock counter. |
| void pmeGpuGridHaloExchangeReverse |
( |
const PmeGpu * |
pmeGpu, |
|
|
gmx_wallcycle * |
wcycle |
|
) |
| |
Grid reverse Halo exchange before PME gather ToDo: Current implementation transfers halo region from/to only immediate neighbours And, expects that overlapSize <= local grid width. Implement exchange with multiple neighbors to remove this limitation ToDo: Current implementation synchronizes pmeStream to make sure data is ready on GPU after FFT to PME grid conversion. Consider using events for this synchnozation.
- Parameters
-
| [in] | pmeGpu | The PME GPU structure. |
| [in] | wcycle | The wallclock counter. |