Gromacs  2025-dev-20240913-b871546
#include <memory>
#include "gromacs/mdtypes/commrec.h"
[Include dependency graph for dlbtiming.h]

Description

This file declares functions for timing the load imbalance due to domain decomposition.

Author
Berk Hess <hess@kth.se>

Classes

class  DDBalanceRegionHandler
 Manager for starting and stopping the dynamic load balancing region.
 
class  BalanceRegion
 Object that describes a DLB balancing region.
 

Enumerations

enum  DdAllowBalanceRegionReopen { DdAllowBalanceRegionReopen::no, DdAllowBalanceRegionReopen::yes }
 Tells if we should open the balancing region.
 
enum  DdBalanceRegionWaitedForGpu { DdBalanceRegionWaitedForGpu::no, DdBalanceRegionWaitedForGpu::yes }
 Tells if we had to wait for a GPU to finish computation.
 

Functions

void ddReopenBalanceRegionCpu (const gmx_domdec_t *dd)
 Re-open the already opened load balance timing region.
 
void dd_force_flop_start (struct gmx_domdec_t *dd, t_nrnb *nrnb)
 Start the force flop count.
 
void dd_force_flop_stop (struct gmx_domdec_t *dd, t_nrnb *nrnb)
 Stop the force flop count.
 
void clear_dd_cycle_counts (gmx_domdec_t *dd)
 Clear the cycle counts used for tuning.
 

Enumeration Type Documentation

enum DdAllowBalanceRegionReopen

Tells if we should open the balancing region.

Enumerator
no 

Do not allow opening an already open region.

yes 

Allow opening an already open region.

enum DdBalanceRegionWaitedForGpu

Tells if we had to wait for a GPU to finish computation.

Enumerator
no 

The GPU finished computation before the CPU needed the result.

yes 

We had to wait for the GPU to finish computation.

Function Documentation

void ddReopenBalanceRegionCpu (const gmx_domdec_t *dd)

Re-open the already opened load balance timing region.

This function should be called after every MPI communication that occurs in the main MD loop. Note that the current setup assumes that all MPI communication acts like a global barrier. If some ranks don't participate in communication, or if some ranks communicate faster with neighbors than others, the obtained timings might not accurately reflect the computation time.

Parameters
[in,out]  dd  The domain decomposition struct
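
The effect of re-opening a timing region after communication can be illustrated with a small, self-contained sketch. It does not use the GROMACS API: ToyClock, ToyBalanceRegion and timedCyclesForToyStep are hypothetical names introduced only for this illustration, and the sketch assumes that re-opening simply restarts the timer of a region that is already open. Under that assumption, time spent waiting in barrier-like communication is not attributed to the computation that follows it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a cycle counter; real code would read a hardware counter.
struct ToyClock
{
    std::int64_t now = 0;
    void         advance(std::int64_t cycles) { now += cycles; }
};

// Hypothetical timing region; NOT the GROMACS BalanceRegion class.
class ToyBalanceRegion
{
public:
    explicit ToyBalanceRegion(const ToyClock& clock) : clock_(clock) {}

    // Open a closed region and record the start time.
    void open()
    {
        assert(!isOpen_);
        isOpen_ = true;
        start_  = clock_.now;
    }

    // Assumption: re-opening restarts the timer of an open region
    // and does nothing when the region is closed.
    void reopen()
    {
        if (isOpen_)
        {
            start_ = clock_.now;
        }
    }

    // Close the region and return the cycles counted since the last (re)open.
    std::int64_t close()
    {
        assert(isOpen_);
        isOpen_ = false;
        return clock_.now - start_;
    }

private:
    const ToyClock& clock_;
    bool            isOpen_ = false;
    std::int64_t    start_  = 0;
};

// One toy MD step: compute, wait in communication, compute again.
std::int64_t timedCyclesForToyStep(bool reopenAfterCommunication)
{
    ToyClock         clock;
    ToyBalanceRegion region(clock);

    region.open();
    clock.advance(100); // local computation before communication
    clock.advance(40);  // time spent in barrier-like communication
    if (reopenAfterCommunication)
    {
        region.reopen(); // restart timing from the synchronization point
    }
    clock.advance(60); // local computation after communication
    return region.close();
}
```

In this toy model, timedCyclesForToyStep(true) counts only the 60 cycles after the communication, whereas timedCyclesForToyStep(false) counts all 200 cycles, including the 40 cycles of waiting. If the communication did not synchronize all ranks, restarting the timer at that point would no longer mark a common starting time, which is the caveat stated in the description above.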