Gromacs  2025-dev-20241002-88a4191
gmx::HardwareTopology Class Reference

#include <gromacs/hardware/hardwaretopology.h>

Description

Information about packages, cores, processing units, numa, caches.

This class is the main GROMACS interface to provide information about the hardware of the system we are running on. Internally, it uses either hwloc for full or almost-full information, or a fallback implementation that relies on Linux sysinfo.

You should always use this class to query the hardware layout in user code. You cannot rely on any particular information being present; always check with the supportLevel() method before trying to access it.

Classes

struct  Cache
 Information about a single cache level.
 
struct  Core
 Information about a single core in a package.
 
struct  Device
 Information about a single PCI device.
 
struct  LogicalProcessor
 Information about package, core and processing unit for a logical processor.
 
struct  Machine
 Hardware topology information about the entire machine.
 
struct  Numa
 Information about all numa nodes.
 
struct  NumaNode
 Information about each numa node in system.
 
struct  Package
 Information about a single package in the system.
 
struct  ProcessingUnit
 Information about a single processing unit (in a core).
 

Public Types

enum  SupportLevel {
  SupportLevel::None, SupportLevel::LogicalProcessorCount, SupportLevel::Basic, SupportLevel::Full,
  SupportLevel::FullWithDevices
}
 Amount of topology information present (incremental).
 

Public Member Functions

 HardwareTopology (int logicalProcessorCount)
 Creates mock topology with given number of logical cores.
 
 HardwareTopology (const std::map< int, std::array< int, 3 >> &logicalProcessorIdMap, const std::string &filesystemRoot)
 Creates mock topology based on APIC (or similar) CPU indices.
 
 HardwareTopology (const std::string &filesystemRoot, const std::vector< int > &allowedProcessors)
 Creates mock topology by parsing mock Linux sys/fs path.
 
SupportLevel supportLevel () const
 Check what topology information is available and valid.
 
bool isThisSystem () const
 Return true if we actually detected hardware.
 
const Machine & machine () const
 Return the machine topology tree.
 
float cpuLimit () const
 Practical max cpu load, as limited by the OS.
 
int maxThreads () const
 Recommended max number of active threads.
 

Static Public Member Functions

static HardwareTopology detect ()
 Detects the hardware topology.
 

Member Enumeration Documentation

enum gmx::HardwareTopology::SupportLevel

Amount of topology information present (incremental)

For the LogicalProcessorCount alternative, the value of maxThreads() primarily reflects the allowed cpuLimit on the machine, so that we do not overload it. If no limit could be found, it reflects only the CPUs on which we are allowed to run. Failing that too, it corresponds to the total number of logical CPUs.
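
The fallback order described above can be illustrated with a small standalone sketch. The function recommendedThreads and its parameters are hypothetical and are not part of GROMACS; in particular, rounding the cpu limit up and returning at least 1 are assumptions made for this illustration, not documented behaviour of maxThreads().

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of GROMACS: picks a thread count using the
// fallback order described above. Non-positive arguments mean "not detected".
int recommendedThreads(float cpuLimit, int allowedProcessorCount, int totalProcessorCount)
{
    if (cpuLimit > 0.0F)
    {
        // Assumption: round a fractional limit up, and never go below 1.
        return std::max(1, static_cast<int>(std::ceil(cpuLimit)));
    }
    if (allowedProcessorCount > 0)
    {
        // No cpu limit known: use the processors we are allowed to run on.
        return allowedProcessorCount;
    }
    // Last resort: total logical processor count, or 0 if even that is unknown.
    return std::max(0, totalProcessorCount);
}
```

With a limit of 2.5 on a 64-processor machine this sketch recommends 3 threads; without a limit it falls back to the allowed or total processor count.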

Enumerator
None 

No hardware information whatsoever. Sorry.

LogicalProcessorCount 

Only total processor count.

Basic 

Package, core and processing unit for allowed processors.

Full 

Cache, memory and numa node info.

FullWithDevices 

Information about devices on the PCI bus.

Constructor & Destructor Documentation

gmx::HardwareTopology::HardwareTopology ( int  logicalProcessorCount)
explicit

Creates mock topology with given number of logical cores.

The support level will be None if the argument is 0 or smaller, otherwise LogicalProcessorCount.

Intended for testing of code that uses the hardware topology.

gmx::HardwareTopology::HardwareTopology ( const std::map< int, std::array< int, 3 >> &  logicalProcessorIdMap,
const std::string &  filesystemRoot 
)
explicit

Creates mock topology based on APIC (or similar) CPU indices.

This routine assembles a fake hardware topology based on a map containing indices that describe each logical processor on which we are allowed to run, and a path to a (fake) filesystem optionally containing cgroup information to detect the recommended maximum load.

Intended for testing of code that uses the hardware topology.

Parameters
logicalProcessorIdMap  Each key in this map is the (OS-provided) index of a logical processor on which we are allowed to run. The value is an array with three integers describing (1) the packageId in the machine, (2) the coreId in the package, and (3) the hardware thread or "processing unit" id in the core. Note that these indices do NOT have to be ranks, only unique. In fact, it is common e.g. on x86 that the low-level core indices are based on connectivity, so the core ids in a 12-core CPU might be 0-5 and 8-13. For x86 systems, you can extract an initializer list for this parameter by running the standalone version of our cpuinfo tool with the hidden/debug "-topology" option.
filesystemRoot  Path to (fake) filesystem where we attempt to parse the allowed cpu load. All file paths mentioned are relative to the provided root. We first check whether cgroups2 or cgroups1 is mounted by reading /etc/mtab. The cgroup of the process is parsed from /proc/self/cgroup, and we also always add the blank top-level cgroup ("/") as a fallback if the specific cgroup is not found in the next step. For this cgroup, we parse the mount path specified in /etc/mtab. For cgroups2, the cpu limit is specified in cpu.max, but note that this file can be absent if no limit is set. This file contains two numbers, where the first is the quota of this process and the second is the period, both usually specified in microseconds. Note that the quota can be larger than the period if we allow loads above 1.0. For cgroups1, we first locate the cgroups subgroup, but then instead look in the subdirectory "cpu,cpuacct" and get the quota from cpu.cfs_quota_us and the period from cpu.cfs_period_us. The tests directory in the hardware module contains a simple script that can capture these files from a Linux system.
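
To make the layout of the map parameter concrete, the following standalone sketch builds a value of the same type as logicalProcessorIdMap. The alias LogicalProcessorIdMap, the helpers makeExampleMap and countCores, and the processor layout itself are made up for illustration and do not come from GROMACS.

```cpp
#include <array>
#include <cassert>
#include <map>
#include <set>
#include <utility>

// Key: OS index of a logical processor we are allowed to run on.
// Value: { packageId, coreId, processingUnitId }.
using LogicalProcessorIdMap = std::map<int, std::array<int, 3>>;

// Made-up layout: one package, two cores with non-contiguous ids (0 and 8),
// and two hardware threads per core.
LogicalProcessorIdMap makeExampleMap()
{
    return { { 0, { 0, 0, 0 } }, { 1, { 0, 0, 1 } }, { 2, { 0, 8, 0 } }, { 3, { 0, 8, 1 } } };
}

// Counts distinct (package, core) pairs. The ids only need to be unique,
// so they are compared as opaque values rather than treated as ranks.
int countCores(const LogicalProcessorIdMap& idMap)
{
    std::set<std::pair<int, int>> cores;
    for (const auto& entry : idMap)
    {
        const std::array<int, 3>& ids = entry.second;
        cores.insert(std::make_pair(ids[0], ids[1]));
    }
    return static_cast<int>(cores.size());
}
```

A map like this could presumably be passed as the first constructor argument in a test, together with a path to a captured filesystem tree.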
gmx::HardwareTopology::HardwareTopology ( const std::string &  filesystemRoot,
const std::vector< int > &  allowedProcessors 
)
explicit

Creates mock topology by parsing mock Linux sys/fs path.

Create mock hardware topology by attempting to parse processor information from mock Linux sys/fs path.

Intended for testing of code that uses the hardware topology.

Parameters
filesystemRoot  Path to (fake) filesystem where we will first find all logical cpus from /sys/devices/system/cpu/possible, after which the topology indices for processor XX are read from the directory /sys/devices/system/cpu/cpuXX/topology. The package id is read from the file physical_package_id, the core id in the package from core_id, and then we assume the hardware thread/processing unit ids are assigned in the enumeration order of the logical processors. After this, we also look for cpu load limits specified with cgroups, as described in the other constructor above. The tests directory in the hardware module contains a simple script that can capture these files from a Linux system.
allowedProcessors  Vector containing the logical (OS) processor indices that should be retained in the topology, mocking the logical processors that are enabled in our cpu mask.
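
As an illustration of the first parsing step, the sketch below expands a Linux cpu list such as the one found in /sys/devices/system/cpu/possible (for example "0-3,8-11") and then keeps only the allowed processors. Both helpers, parseCpuList and retainAllowed, are hypothetical names for this example; the sketch assumes well-formed input and is not the parser GROMACS uses.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: expands a cpu list like "0-3,8-11" into individual
// logical processor indices. Assumes well-formed input.
std::vector<int> parseCpuList(const std::string& text)
{
    std::vector<int>   cpus;
    std::istringstream stream(text);
    std::string        token;
    while (std::getline(stream, token, ','))
    {
        if (token.empty())
        {
            continue;
        }
        const std::string::size_type dash  = token.find('-');
        const int                    first = std::stoi(token.substr(0, dash));
        // A token without a dash is a single index rather than a range.
        const int last = (dash == std::string::npos) ? first : std::stoi(token.substr(dash + 1));
        for (int cpu = first; cpu <= last; ++cpu)
        {
            cpus.push_back(cpu);
        }
    }
    return cpus;
}

// Hypothetical helper: keeps only the processors listed in 'allowed',
// mimicking the effect of a cpu mask.
std::vector<int> retainAllowed(const std::vector<int>& cpus, const std::vector<int>& allowed)
{
    std::vector<int> kept;
    for (int cpu : cpus)
    {
        if (std::find(allowed.begin(), allowed.end(), cpu) != allowed.end())
        {
            kept.push_back(cpu);
        }
    }
    return kept;
}
```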

Member Function Documentation

float gmx::HardwareTopology::cpuLimit ( ) const
inline

Practical max cpu load, as limited by the OS.

In some cases, in particular when running in containers, the total number of logical processors on which we are allowed to execute can be quite large, while there is a relatively low limit on the amount of CPU time the process can consume. In this case it will be much better to limit ourselves based on the amount of CPU we can use to improve scaling and avoid extra I/O.

You can always call this routine, but if sufficient support is not available, it may just return 0.0.
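
The relation between a cgroups2 cpu.max file (quota followed by period, as described in the constructor documentation) and a load limit can be sketched as follows. The function cpuLimitFromCpuMax is hypothetical and not part of GROMACS. Treating a quota of "max" as "no limit" follows the Linux cgroup v2 documentation rather than this page, and returning 0.0 in that case is an assumption chosen to match the fallback value mentioned above.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts the contents of a cgroups2 cpu.max file
// ("<quota> <period>") into a load limit, e.g. "200000 100000" -> 2.0.
// Returns 0.0 when no limit can be determined.
float cpuLimitFromCpuMax(const std::string& contents)
{
    std::istringstream stream(contents);
    std::string        quotaText;
    double             period = 0.0;
    if (!(stream >> quotaText >> period) || period <= 0.0)
    {
        return 0.0F; // missing or malformed fields
    }
    if (quotaText == "max")
    {
        return 0.0F; // assumption: report "no quota set" as 0.0
    }
    try
    {
        const double quota = std::stod(quotaText);
        return quota > 0.0 ? static_cast<float>(quota / period) : 0.0F;
    }
    catch (const std::exception&)
    {
        return 0.0F; // quota field was not a number
    }
}
```

Note that the result can exceed 1.0, since the quota may be larger than the period.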

bool gmx::HardwareTopology::isThisSystem ( ) const
inline

Return true if we actually detected hardware.

Returns
This method normally returns true, namely when we actually ran the hardware detection as part of this process to construct the object. It returns false when the object was constructed by reading a cached XML file, or when it was generated from synthetic data.
const Machine& gmx::HardwareTopology::machine ( ) const
inline

Return the machine topology tree.

You can always call this routine, but be aware that some or all contents will not be valid unless supportLevel() returns a sufficient level.

While data that is not valid has been initialized to special values, you should not rely on those but query the supportLevel() method before accessing it.

int gmx::HardwareTopology::maxThreads ( ) const
inline

Recommended max number of active threads.

This method provides a recommendation for how many active cpu-consuming threads to start while taking cpuLimit into account. It is not a hard limit on threads, and since it is based on limits on the cpu load there could be cases where it is wise to start additional threads e.g. for handling I/O.

Returns
Recommended number of threads to start, or 0 if it could not be detected.
SupportLevel gmx::HardwareTopology::supportLevel ( ) const
inline

Check what topology information is available and valid.

The amount of hardware topology information that can be detected depends on both the hardware and whether GROMACS was linked with the external hwloc library.

  • If supportLevel is SupportLevel::None, we simply don't have any valid information - not even the number of logical processors.
  • If we have at least SupportLevel::LogicalProcessorCount, we have detected the number of logical CPUs in the system, but we might not have any information whether we are allowed to run on all of them or not. To decide how many threads to start, always use the maxThreads() method, since we might have been able to detect that we are only allowed to run on a subset, e.g. when executing in a container environment where the machine has many cores, but the cpu limit for our container is low.
  • If supportLevel is at least SupportLevel::Basic, we have valid topology information about packages, cores and processing units, both in the tree-like structure starting with machine().packages and in the linear logicalProcessors vector. The map osIdToPuId is also valid.
  • If supportLevel is at least SupportLevel::Full, the cache and ccNUMA information is also valid.
  • Finally, for SupportLevel::FullWithDevices, the device structure is valid in addition to everything else.
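
Because the levels are incremental, "at least" checks can be written as ordered comparisons. The sketch below uses a stand-in enum that mirrors the documented enumerator order so that it compiles on its own; it is not the GROMACS declaration, and the helper functions are hypothetical names for this example.

```cpp
#include <cassert>

// Stand-in mirroring the enumerator order documented for
// gmx::HardwareTopology::SupportLevel; not the GROMACS declaration itself.
enum class SupportLevel
{
    None,
    LogicalProcessorCount,
    Basic,
    Full,
    FullWithDevices
};

// Packages, cores and processing units are valid from Basic upwards.
bool hasPackageCoreInfo(SupportLevel level)
{
    return level >= SupportLevel::Basic;
}

// Cache and ccNUMA information is valid from Full upwards.
bool hasCacheAndNumaInfo(SupportLevel level)
{
    return level >= SupportLevel::Full;
}

// Device information is only valid at the highest level.
bool hasDeviceInfo(SupportLevel level)
{
    return level == SupportLevel::FullWithDevices;
}
```

In code that uses the real class, the same comparisons would be applied to the value returned by supportLevel() before touching the corresponding parts of machine().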
