Gromacs
2025-dev-20241002-88a4191
|
#include <gromacs/hardware/hardwaretopology.h>
Information about packages, cores, processing units, numa, caches.
This class is the main GROMACS interface to provide information about the hardware of the system we are running on. Internally, it uses either hwloc for full or almost-full information, or a fallback implementation that relies on Linux sysinfo.
You should always use this class to query the hardware layout in user code. No particular piece of information is guaranteed to be present, so you must check with the supportLevel() method before trying to access any of it.
Classes | |
struct | Cache |
Information about a single cache level.
struct | Core |
Information about a single core in a package.
struct | Device |
Information about a single PCI device.
struct | LogicalProcessor |
Information about package, core and processing unit for a logical processor.
struct | Machine |
Hardware topology information about the entire machine.
struct | Numa |
Information about all numa nodes.
struct | NumaNode |
Information about each numa node in system.
struct | Package |
Information about a single package in the system.
struct | ProcessingUnit |
Information about a single processing unit (in a core).
Public Types | |
enum | SupportLevel { SupportLevel::None, SupportLevel::LogicalProcessorCount, SupportLevel::Basic, SupportLevel::Full, SupportLevel::FullWithDevices } |
Amount of topology information present (incremental).
Public Member Functions | |
HardwareTopology (int logicalProcessorCount) | |
Creates mock topology with given number of logical cores.
HardwareTopology (const std::map< int, std::array< int, 3 >> &logicalProcessorIdMap, const std::string &filesystemRoot) | |
Creates mock topology based on APIC (or similar) CPU indices.
HardwareTopology (const std::string &filesystemRoot, const std::vector< int > &allowedProcessors) | |
Creates mock topology by parsing mock Linux sys/fs path.
SupportLevel | supportLevel () const |
Check what topology information is available and valid.
bool | isThisSystem () const |
Return true if we actually detected hardware.
const Machine & | machine () const |
Return the machine topology tree.
float | cpuLimit () const |
Practical max cpu load, as limited by the OS.
int | maxThreads () const |
Recommended max number of active threads.
Static Public Member Functions | |
static HardwareTopology | detect () |
Detects the hardware topology.
|
enum HardwareTopology::SupportLevel | strong
Amount of topology information present (incremental)
For the LogicalProcessorCount alternative, the value of maxThreads() will primarily reflect the cpuLimit() allowed on the machine, so that we do not overload it. If no such limit could be found, it will instead reflect only the CPUs on which we are allowed to run. Failing that too, it will correspond to the total number of logical CPUs.
Enumerator | |
---|---|
None | No hardware information whatsoever. Sorry. |
LogicalProcessorCount | Only total processor count. |
Basic | Package, core and processing unit for allowed processors. |
Full | Cache, memory and numa node info. |
FullWithDevices | Information about devices on the PCI bus. |
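Because the levels are incremental, a caller can guard each access with a simple comparison. The following is a minimal sketch rather than GROMACS code: the enum is a local stand-in that mirrors the enumerators listed above (it assumes the real header declares them in the same order), and `hasAtLeast` is a hypothetical helper.

```cpp
// Local stand-in mirroring the documented enumerators. This is an assumption
// for illustration; real code would use gmx's HardwareTopology::SupportLevel.
enum class SupportLevel
{
    None,
    LogicalProcessorCount,
    Basic,
    Full,
    FullWithDevices
};

// Hypothetical helper: true when the detected level provides at least the
// information of the required level. Relies on the levels being incremental,
// so a plain ordered comparison of the enumerators is sufficient.
bool hasAtLeast(SupportLevel detected, SupportLevel required)
{
    return detected >= required;
}
```

For example, `hasAtLeast(SupportLevel::Basic, SupportLevel::Full)` is false, so cache and numa information should not be read at the Basic level.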
|
HardwareTopology::HardwareTopology (int logicalProcessorCount) | explicit
Creates mock topology with given number of logical cores.
The support level will be None if the argument is 0 or smaller, otherwise LogicalProcessorCount.
Intended for testing of code that uses the hardware topology.
|
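HardwareTopology::HardwareTopology (const std::map< int, std::array< int, 3 >> &logicalProcessorIdMap, const std::string &filesystemRoot) | explicit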
Creates mock topology based on APIC (or similar) CPU indices.
This routine assembles a fake hardware topology based on a map whose keys are the logical processors we should be allowed to run on and whose values are the indices describing each of those processors, and a path to a (fake) filesystem optionally containing cgroup information to detect the recommended maximum load.
Intended for testing of code that uses the hardware topology.
logicalProcessorIdMap | Each key in this map is the (OS-provided) index of a logical processor on which we are allowed to run. The value is an array with three integers describing (1) the packageId in the machine, (2) the coreId in the package, and (3) the hardware thread or "processing unit" id in the core. Note that these indices do NOT have to be ranks, only unique. In fact, it is common e.g. on x86 that the low-level core indices are based on connectivity, so the core ids in a 12-core CPU might be 0-5 and 8-13. For x86 systems, you can extract an initializer list for this parameter by running the standalone version of our cpuinfo tool with the hidden/debug "-topology" option. |
filesystemRoot | Path to (fake) filesystem where we attempt to parse the allowed cpu load - all file paths mentioned should be relative to the provided root. We will first try to find if cgroups2 or cgroups1 is mounted by checking /etc/mtab. The cgroup of the process will be parsed from /proc/self/cgroup, and we also always add the blank top-level cgroup ("/") if the specific cgroup is not found in the next step. For this cgroup, we parse the mount path specified in /etc/mtab. For cgroups2, the cpu limit is specified in cpu.max, but note that this file can be absent if no limit is set. This file contains two numbers where the first is the quota of this process, and the second the period, both usually specified as microseconds. Note that the quota can be larger than the period if we allow loads above 1.0. For cgroups1, we first locate the cgroups subgroup, but then instead look in the subdirectory "cpu,cpuacct" and get the quota from cpu.cfs_quota_us and period from cpu.cfs_period_us. The tests directory in the hardware module contains a simple script that can capture these files from a Linux system. |
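The quota/period arithmetic described for `filesystemRoot` can be sketched as follows. This is a minimal illustration, not the GROMACS parser: `parseCpuMax` and `limitFromCfs` are hypothetical helpers, and the handling of the literal `max` (cgroups2) and a negative quota (cgroups1) as "no limit" follows the Linux kernel cgroup documentation rather than anything stated on this page.

```cpp
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: turn a quota and a period (both in microseconds) into
// a load limit. A non-positive quota or period is treated as "no limit".
std::optional<float> limitFromCfs(long long quota, long long period)
{
    if (quota <= 0 || period <= 0)
    {
        return std::nullopt;
    }
    return static_cast<float>(quota) / static_cast<float>(period);
}

// Hypothetical helper: parse the contents of a cgroups2 cpu.max file, which
// holds "<quota> <period>". The quota may be the literal "max" (no limit).
std::optional<float> parseCpuMax(const std::string& contents)
{
    std::istringstream stream(contents);
    std::string        quotaText;
    long long          period = 0;
    if (!(stream >> quotaText >> period) || quotaText == "max")
    {
        return std::nullopt;
    }
    try
    {
        std::size_t     parsed = 0;
        const long long quota  = std::stoll(quotaText, &parsed);
        if (parsed != quotaText.size())
        {
            return std::nullopt; // trailing garbage after the number
        }
        return limitFromCfs(quota, period);
    }
    catch (const std::exception&)
    {
        return std::nullopt; // quota was not a number
    }
}
```

With these helpers, the contents `200000 100000` give a limit of 2.0, matching the note above that the quota can exceed the period when loads above 1.0 are allowed.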
|
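HardwareTopology::HardwareTopology (const std::string &filesystemRoot, const std::vector< int > &allowedProcessors) | explicit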
Creates mock topology by parsing mock Linux sys/fs path.
Create mock hardware topology by attempting to parse processor information from mock Linux sys/fs path.
Intended for testing of code that uses the hardware topology.
filesystemRoot | Path to (fake) filesystem where we will first find all logical cpus from /sys/devices/system/cpu/possible, after which the topology indices for processor XX are read from the directory /sys/devices/system/cpu/cpuXX/topology . The package id is read from the file physical_package_id, the core id in the package from core_id, and then we assume the hardware thread/processing unit ids are assigned in the enumeration order of the logical processors. After this, we also look for cpu load limits specified with cgroups, as described in the other constructor above. The tests directory in the hardware module contains a simple script that can capture these files from a Linux system. |
allowedProcessors | Vector containing the logical (OS) processor indices that should be retained in the topology, mocking the logical processors that are enabled in our cpu mask. |
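As an illustration of the first parsing step, the sketch below expands the kernel's CPU-list text format (for example `0-3,8-11`), which is what `/sys/devices/system/cpu/possible` contains on Linux, and then keeps only the allowed processors. Both functions are hypothetical helpers written for this example; they are not the GROMACS implementation and do no validation beyond skipping empty tokens.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: expand a Linux CPU list such as "0-3,8-11" into the
// individual logical processor indices it describes.
std::vector<int> parseCpuList(const std::string& text)
{
    std::vector<int>   cpus;
    std::istringstream stream(text);
    std::string        token;
    while (std::getline(stream, token, ','))
    {
        if (token.find_first_of("0123456789") == std::string::npos)
        {
            continue; // empty token or bare newline
        }
        const std::size_t dash  = token.find('-');
        const int         first = std::stoi(token.substr(0, dash));
        const int         last =
                (dash == std::string::npos) ? first : std::stoi(token.substr(dash + 1));
        for (int cpu = first; cpu <= last; cpu++)
        {
            cpus.push_back(cpu);
        }
    }
    return cpus;
}

// Hypothetical helper: keep only the processors that also appear in the
// allowed list, mimicking the effect of a cpu mask.
std::vector<int> retainAllowed(const std::vector<int>& possible, const std::vector<int>& allowed)
{
    std::vector<int> kept;
    for (int cpu : possible)
    {
        if (std::find(allowed.begin(), allowed.end(), cpu) != allowed.end())
        {
            kept.push_back(cpu);
        }
    }
    return kept;
}
```

The per-processor ids would then be read from the `topology` directory of each retained `cpuXX` entry, as described for `filesystemRoot`.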
|
float HardwareTopology::cpuLimit () const | inline
Practical max cpu load, as limited by the OS.
In some cases, in particular when running in containers, the total number of logical processors on which we are allowed to execute can be quite large, while there is a relatively low limit on the amount of CPU time the process can consume. In this case it will be much better to limit ourselves based on the amount of CPU we can use, to improve scaling and avoid extra I/O.
You can always call this routine, but if sufficient support is not available, it may just return 0.0.
|
bool HardwareTopology::isThisSystem () const | inline
Return true if we actually detected hardware.
|
const Machine & HardwareTopology::machine () const | inline
Return the machine topology tree.
You can always call this routine, but be aware that some or all contents will not be valid unless supportLevel() returns a sufficient level.
While data that is not valid has been initialized to special values, you should not rely on those but query the supportLevel() method before accessing it.
|
int HardwareTopology::maxThreads () const | inline
Recommended max number of active threads.
This method provides a recommendation for how many active cpu-consuming threads to start while taking cpuLimit into account. It is not a hard limit on threads, and since it is based on limits on the cpu load there could be cases where it is wise to start additional threads e.g. for handling I/O.
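One way to picture such a recommendation is the fallback order given for the LogicalProcessorCount level: the cpu limit first, then the allowed processors, then all logical processors. The sketch below is an assumption-laden illustration, not the GROMACS implementation: `recommendThreads` is a hypothetical function, and rounding the limit up and capping it by the available processors are choices made only for this example. A limit of 0.0 is treated as unknown, in line with what cpuLimit() may return.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper, NOT the GROMACS algorithm. Arguments:
//   cpuLimit          - load limit, 0.0 when unknown
//   allowedProcessors - processors in our cpu mask, 0 when unknown
//   totalProcessors   - all logical processors in the machine
int recommendThreads(float cpuLimit, int allowedProcessors, int totalProcessors)
{
    // Prefer the allowed processors; fall back to the total count.
    const int available = (allowedProcessors > 0) ? allowedProcessors : totalProcessors;
    if (cpuLimit > 0.0F)
    {
        // Assumption: round the limit up, and use at least one thread.
        const int fromLimit = std::max(1, static_cast<int>(std::ceil(cpuLimit)));
        return (available > 0) ? std::min(fromLimit, available) : fromLimit;
    }
    return available;
}
```

Under these assumptions a container limited to a load of 2.0 on a 64-processor host would be advised to run 2 compute threads, which still leaves room for extra I/O threads since the value is not a hard limit.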
|
SupportLevel HardwareTopology::supportLevel () const | inline
Check what topology information is available and valid.
The amount of hardware topology information that can be detected depends on both the hardware and whether GROMACS was linked with the external hwloc library.