#include "config.h"
#include "gromacs/mdtypes/interaction_const.h"
#include "gromacs/mdtypes/locality.h"
#include "gromacs/utility/enumerationhelpers.h"
#include "nbnxm.h"
#include "pairlist.h"
Implements common internal types for different NBNXN GPU implementations.
- Author
- Szilárd Páll <pall.szilard@gmail.com>
#define GMX_NBNXN_PRUNE_KERNEL_JPACKED_CONCURRENCY 4
Macro defining the default for the prune kernel's jPacked processing concurrency.
The GMX_NBNXN_PRUNE_KERNEL_JPACKED_CONCURRENCY macro can be overridden at compile time; its default value is 4.
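A compile-time override of this kind is commonly written with an `#ifndef` guard, so that a value passed on the compiler command line takes precedence over the default. The following is a minimal sketch of that pattern, not the header's actual contents; the constant name `c_examplePruneConcurrency` is hypothetical and used only for illustration.

```cpp
// Keep a value supplied at build time (for example
// -DGMX_NBNXN_PRUNE_KERNEL_JPACKED_CONCURRENCY=8); otherwise fall back to 4.
#ifndef GMX_NBNXN_PRUNE_KERNEL_JPACKED_CONCURRENCY
#    define GMX_NBNXN_PRUNE_KERNEL_JPACKED_CONCURRENCY 4
#endif

// Hypothetical typed constant wrapping the macro (name is an assumption).
constexpr int c_examplePruneConcurrency = GMX_NBNXN_PRUNE_KERNEL_JPACKED_CONCURRENCY;

// Reject nonsensical overrides at compile time.
static_assert(c_examplePruneConcurrency > 0,
              "The jPacked processing concurrency must be positive");
```

Wrapping the macro in a `constexpr` constant gives the rest of the code a typed value while still allowing the build system to change it.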
static constexpr int c_sciHistogramSize = 8192
Number of separate bins used when sorting the plist on the GPU.
Ideally this number would be increased for very large system sizes (the CPU version of sorting uses 2 x avg(num cjPacked)), but as sorting has negligible impact for very large system sizes, we use a constant here for simplicity. On H100, sorting begins to have a negligible effect for system sizes greater than ~400k atoms.
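To illustrate what a fixed bin count means for sorting, here is a CPU-side sketch of a histogram (counting) sort with a constant number of bins. It is not the GPU kernel itself. The assumptions are: entries are keyed by an integer count (such as a number of cjPacked entries), the order is largest first, and counts beyond the bin range share the last bin. The function name `sortIndicesByCountDescending` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Value taken from the documentation above.
constexpr int c_sciHistogramSize = 8192;

// Hypothetical helper: returns the indices of `counts`, ordered from the
// largest to the smallest count, using a counting sort over a fixed number
// of bins. Counts at or above the bin limit share the last bin.
std::vector<int> sortIndicesByCountDescending(const std::vector<int>& counts)
{
    // Pass 1: build the histogram of clamped counts.
    std::vector<int> histogram(c_sciHistogramSize, 0);
    for (int count : counts)
    {
        assert(count >= 0);
        histogram[std::min(count, c_sciHistogramSize - 1)]++;
    }

    // Pass 2: exclusive prefix sum from the highest bin downwards, which
    // gives the first output slot belonging to each bin.
    std::vector<int> binStart(c_sciHistogramSize, 0);
    int              running = 0;
    for (int bin = c_sciHistogramSize - 1; bin >= 0; bin--)
    {
        binStart[bin] = running;
        running += histogram[bin];
    }

    // Pass 3: scatter every index into the next free slot of its bin.
    std::vector<int> sorted(counts.size());
    for (std::size_t i = 0; i < counts.size(); i++)
    {
        const int bin           = std::min(counts[i], c_sciHistogramSize - 1);
        sorted[binStart[bin]++] = static_cast<int>(i);
    }
    return sorted;
}
```

With a constant bin count the cost of the histogram and prefix-sum passes does not grow with the range of keys, at the price of coarser ordering among entries that land in the same (last) bin.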
static constexpr int c_sciSortingThreadsPerBlock = 256
Number of threads per block used by the GPU sorting kernel.
TODO: this is a reasonable default, but the number has not been tuned.
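A threads-per-block constant is usually paired with a rounded-up division to obtain the number of blocks to launch. The sketch below assumes one thread per list entry, which this documentation does not state; the helper name `numSortingBlocks` is hypothetical.

```cpp
// Value taken from the documentation above.
constexpr int c_sciSortingThreadsPerBlock = 256;

// Hypothetical helper: number of blocks needed when each thread handles one
// element (an assumption), rounding up so that a partially filled last block
// is still launched.
constexpr int numSortingBlocks(int numElements)
{
    return (numElements + c_sciSortingThreadsPerBlock - 1) / c_sciSortingThreadsPerBlock;
}

static_assert(numSortingBlocks(256) == 1, "a full block needs exactly one block");
static_assert(numSortingBlocks(257) == 2, "one extra element needs a second block");
```

Tuning the constant, as the TODO suggests, would change only the divisor here; the rounding logic stays the same.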