Gromacs  2024.4
Parallelizable Handling of Output Data (analysisdata)

Description

Provides functionality for handling and processing output data from analysis.

Overview

This module provides functionality to do common processing for tabular data in analysis tools. In addition to providing this common functionality, one major driver for this module is to make it simple to write analysis tools that process frames in parallel: the functionality in this module takes care of necessary synchronization and communication such that output from the frames is collected and output in the correct order. See Analysis output data handling for an overview of the high-level functionality and the terminology used.

This module consists of two main parts. The first is formed by the gmx::AbstractAnalysisData class and classes that derive from it: gmx::AnalysisData and gmx::AnalysisArrayData. These classes are used to process and store raw data as produced by the analysis tool. They also provide an interface to attach data modules that implement gmx::IAnalysisDataModule.

Modules that implement gmx::IAnalysisDataModule form the second part of the module, and they provide functionality to do processing on the data. These modules can also derive from gmx::AbstractAnalysisData, allowing other modules to be attached to them to form a processing chain that best suits the analysis tool. Typically, such a processing chain ends in a plotting module that writes the data into a file, but the final module can also provide direct access to the processed data, allowing the analysis tool to do custom postprocessing outside the module framework.
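The chaining idea described above can be illustrated with a small, self-contained sketch. It does not use the GROMACS classes: IToyModule, ToyData, ToySource, ToyFrameAverage, ToyCollector and runToyChain() are hypothetical names invented for this example, and the single frameReady() callback is a simplification of the real module interface. The sketch only shows how a module that is also a data source lets further modules be attached behind it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a data module: it is notified of every frame.
class IToyModule
{
public:
    virtual ~IToyModule() = default;
    virtual void frameReady(const std::vector<double>& values) = 0;
};

// Hypothetical stand-in for a data object: it forwards frames to attached modules.
class ToyData
{
public:
    void addModule(std::shared_ptr<IToyModule> module) { modules_.push_back(std::move(module)); }

protected:
    void notifyFrame(const std::vector<double>& values)
    {
        for (const auto& module : modules_)
        {
            module->frameReady(values);
        }
    }

private:
    std::vector<std::shared_ptr<IToyModule>> modules_;
};

// Raw data source filled by the analysis tool.
class ToySource : public ToyData
{
public:
    void addFrame(const std::vector<double>& values) { notifyFrame(values); }
};

// A module that is also a data source: it reduces each frame to its mean
// and passes the result on to whatever is attached to it.
class ToyFrameAverage : public IToyModule, public ToyData
{
public:
    void frameReady(const std::vector<double>& values) override
    {
        assert(!values.empty());
        double sum = 0.0;
        for (double v : values)
        {
            sum += v;
        }
        notifyFrame({ sum / static_cast<double>(values.size()) });
    }
};

// Final module in the chain: keeps the processed frames for direct access.
class ToyCollector : public IToyModule
{
public:
    void frameReady(const std::vector<double>& values) override { rows_.push_back(values); }
    const std::vector<std::vector<double>>& rows() const { return rows_; }

private:
    std::vector<std::vector<double>> rows_;
};

// Builds source -> frame average -> collector and feeds two frames through it.
std::vector<std::vector<double>> runToyChain()
{
    ToySource source;
    auto      average   = std::make_shared<ToyFrameAverage>();
    auto      collector = std::make_shared<ToyCollector>();
    source.addModule(average);
    average->addModule(collector);
    source.addFrame({ 1.0, 3.0 });
    source.addFrame({ 2.0, 4.0, 6.0 });
    return collector->rows();
}
```

In this sketch the collector plays the role of a final module that gives direct access to the processed data; a plotting module would sit in the same position and write each frame to a file instead.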

Using Data Objects and Modules

To use the functionality in this module, you typically declare one or more gmx::AnalysisData objects and set their properties. You then create some module objects and set their properties (see the list of classes that implement gmx::IAnalysisDataModule) and attach them to the data objects or to one another using gmx::AbstractAnalysisData::addModule(). Then you add the actual data values to the gmx::AnalysisData object, which automatically passes them on to the modules. After all data has been added, you may optionally access some results directly from the module objects or from the gmx::AnalysisData object itself. However, in many cases it is sufficient to add a plotting module to the processing chain at the start, which will then automatically write the results into a file.

For simple processing needs with a small amount of data, a gmx::AnalysisArrayData class is also provided, which keeps all the data in an in-memory array and allows you to manipulate the data as you wish before you pass the data to the attached modules.
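The "fill and edit first, pass on later" pattern can be sketched with a toy in-memory table. ToyArrayData, its value() and publish() methods, and toyArrayRowSums() are hypothetical names made up for this illustration; they mimic only the idea and are not the interface of gmx::AnalysisArrayData.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical in-memory table; it only mimics the idea of an array data object.
class ToyArrayData
{
public:
    using RowCallback = std::function<void(std::size_t row, const std::vector<double>& values)>;

    ToyArrayData(std::size_t rowCount, std::size_t columnCount) :
        values_(rowCount, std::vector<double>(columnCount, 0.0))
    {
    }

    // Random access for filling and editing values in any order.
    double& value(std::size_t row, std::size_t column)
    {
        assert(row < values_.size() && column < values_[row].size());
        return values_[row][column];
    }

    // Hands the finished rows to a consumer one by one, in row order.
    void publish(const RowCallback& consumer) const
    {
        for (std::size_t row = 0; row < values_.size(); ++row)
        {
            consumer(row, values_[row]);
        }
    }

private:
    std::vector<std::vector<double>> values_;
};

// Fills a 2x2 table out of order, edits one entry, then publishes the row sums.
std::vector<double> toyArrayRowSums()
{
    ToyArrayData data(2, 2);
    data.value(1, 1) = 4.0;
    data.value(0, 0) = 1.0;
    data.value(0, 1) = 2.0;
    data.value(1, 0) = 3.0;
    data.value(1, 1) *= 0.5; // edit in place before anything is passed on

    std::vector<double> sums;
    data.publish([&sums](std::size_t /*row*/, const std::vector<double>& values) {
        double sum = 0.0;
        for (double v : values)
        {
            sum += v;
        }
        sums.push_back(sum);
    });
    return sums;
}
```

Because nothing is passed on until publish() is called, the consumer only ever sees the final, edited values.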

Data Modules

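Modules that derive from gmx::IAnalysisDataModule can operate in two modes: serial mode, in which the frames are presented to the module in order of increasing frame index, and parallel mode, in which the frames are presented in the order in which they become available in the input data, which may not be sequential.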

The figure below shows the sequence of callbacks that the module receives. Arrows show a dependency between callbacks: the event at the start of the arrow always occurs before the event at the end. The events in the box are repeated for each frame. Dashed lines within this box show dependencies between these frames:

If the input data supports parallel mode, it calls parallelDataStarted(). If the module returns true from this method, then it will process the frames in the parallel mode. If the module returns false, it will get the frames in serial order. If the input data does not support parallel mode, it calls dataStarted(), and the module will always get the frames in order.
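The mode selection described above can be written down as a small decision function. This is a sketch only: IModeAwareModule, the two example modules, ToyMode and negotiateMode() are hypothetical names, and the callbacks are shown without arguments, whereas the corresponding methods in gmx::IAnalysisDataModule take parameters that are omitted here.

```cpp
#include <cassert>

// Hypothetical result of the negotiation between input data and module.
enum class ToyMode
{
    Serial,
    Parallel
};

// Hypothetical, argument-free version of the two start callbacks.
class IModeAwareModule
{
public:
    virtual ~IModeAwareModule() = default;
    // Called when the input data does not support parallel mode.
    virtual void dataStarted() = 0;
    // Called when the input data supports parallel mode; returning true opts in.
    virtual bool parallelDataStarted() = 0;
};

// A module that always wants its frames in serial order.
class ToySerialOnlyModule : public IModeAwareModule
{
public:
    void dataStarted() override {}
    bool parallelDataStarted() override { return false; }
};

// A module that can handle frames in any order.
class ToyParallelModule : public IModeAwareModule
{
public:
    void dataStarted() override {}
    bool parallelDataStarted() override { return true; }
};

// Mirrors the rule in the text: parallel mode is used only if the input data
// supports it and the module accepts it; every other case is serial.
ToyMode negotiateMode(bool inputSupportsParallel, IModeAwareModule& module)
{
    if (!inputSupportsParallel)
    {
        module.dataStarted();
        return ToyMode::Serial;
    }
    return module.parallelDataStarted() ? ToyMode::Parallel : ToyMode::Serial;
}
```

Note that a module which is able to run in parallel mode still receives its frames in order when the input data does not support parallel mode.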

The sequence of when the module methods are called with respect to when data is added to the data object depends on the type of the module and the type of the data. However, generally the modules do not need to know the details of how this happens, as long as they work with the above state diagram.

For parallel processing, the gmx::AnalysisData object itself only provides the infrastructure to support all of the above, including the reordering of the frames for serial processing. However, the caller is still responsible for the actual thread synchronization, and must call gmx::AnalysisData::finishFrameSerial() for each frame from a suitable context where the serial processing for that frame can be done. When using the data objects as part of the trajectory analysis framework (Framework for trajectory analysis) or energy analysis framework (Framework for energy analysis), these calls are handled by the framework.
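To make the idea of reordering frames for serial processing concrete, the following sketch buffers frames that finish out of order and releases them strictly by increasing index. FrameReorderBuffer and its finishFrame() method are hypothetical and are not taken from GROMACS; the sketch makes no claim about how gmx::AnalysisData or finishFrameSerial() work internally, and it is not thread-safe, so a caller would still have to provide the synchronization mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper that turns out-of-order frame completion into in-order release.
class FrameReorderBuffer
{
public:
    // Records that frame `index` has finished. Returns the frame indices that
    // can now be processed serially, in increasing order (possibly none).
    std::vector<std::size_t> finishFrame(std::size_t index)
    {
        // Each frame may finish only once and must not have been released yet.
        assert(index >= next_ && pending_.count(index) == 0);
        pending_.insert(index);

        std::vector<std::size_t> ready;
        // Release the contiguous run of finished frames starting at next_.
        while (!pending_.empty() && *pending_.begin() == next_)
        {
            ready.push_back(next_);
            pending_.erase(pending_.begin());
            ++next_;
        }
        return ready;
    }

private:
    std::set<std::size_t> pending_;  // finished frames still waiting for an earlier one
    std::size_t           next_ = 0; // index of the next frame to release
};
```

With this helper, frames 1 and 2 finishing first release nothing; once frame 0 finishes, all three become available in order.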

Writing New Data and Module Objects

New data modules can be implemented to perform custom operations that are not supported by the modules provided in this module. This is done by creating a new class that implements gmx::IAnalysisDataModule. If the new module computes values that can be used as input for other modules, the new class should also derive from gmx::AbstractAnalysisData, and preferably use gmx::AnalysisDataStorage internally to implement storage of values. See the documentation of the mentioned classes for more details on how to implement custom modules. When implementing a new module, it should be considered whether it can be of more general use, and if so, it should be added to this module.

It is also possible to implement new data source objects by deriving a class from gmx::AbstractAnalysisData. This should not normally be necessary, since this module provides general data source objects for most typical uses. If the classes in this module are not suitable for some specific use, it should be considered whether a new generic class could be added (or an existing one extended) instead of implementing a local custom solution.

Author
Teemu Murtola teemu.murtola@gmail.com

Classes

class  gmx::AnalysisDataModuleManager
 Encapsulates handling of data modules attached to AbstractAnalysisData.

class  gmx::AnalysisDataStorageFrame
 Allows assigning values for a data frame in AnalysisDataStorage.

class  gmx::AnalysisDataStorage
 Helper class that implements storage of data.

class  gmx::AnalysisDataParallelOptions
 Parallelization options for analysis data objects.

class  gmx::test::AnalysisDataTestInputPointSet
 Represents a single set of points in AnalysisDataTestInputFrame structure.

class  gmx::test::AnalysisDataTestInputFrame
 Represents a single frame in AnalysisDataTestInput structure.

class  gmx::test::AnalysisDataTestInput
 Represents static input data for AbstractAnalysisData tests.

class  gmx::test::AnalysisDataTestFixture
 Test fixture for AbstractAnalysisData testing.

class  gmx::AbstractAnalysisData
 Abstract base class for all objects that provide data.

class  gmx::AnalysisData
 Parallelizable data container for raw data.

class  gmx::AnalysisDataHandle
 Handle for inserting data into AnalysisData.

class  gmx::AbstractAnalysisArrayData
 Abstract base class for data objects that present in-memory data.

class  gmx::AnalysisArrayData
 Simple in-memory data array.

class  gmx::AnalysisDataValue
 Value type for representing a single value in analysis data objects.

class  gmx::AnalysisDataFrameHeader
 Value type for storing frame-level information for analysis data.

class  gmx::AnalysisDataPointSetRef
 Value type wrapper for non-mutable access to a set of data column values.

class  gmx::AnalysisDataFrameRef
 Value type wrapper for non-mutable access to a data frame.

class  gmx::IAnalysisDataModule
 Interface for a module that gets notified whenever data is added.

class  gmx::AnalysisDataModuleSerial
 Convenience base class for serial analysis data modules.

class  gmx::AnalysisDataModuleParallel
 Convenience base class for parallel analysis data modules.

class  gmx::AnalysisDataAverageModule
 Data module for independently averaging each column in input data.

class  gmx::AnalysisDataFrameAverageModule
 Data module for averaging of columns for each frame.

class  gmx::AnalysisDataDisplacementModule
 Data module for calculating displacements.

class  gmx::AnalysisHistogramSettingsInitializer
 Provides "named parameter" idiom for constructing histograms.

class  gmx::AnalysisHistogramSettings
 Contains parameters that specify histogram bin locations.

class  gmx::AbstractAverageHistogram
 Base class for representing histograms averaged over frames.

class  gmx::AnalysisDataSimpleHistogramModule
 Data module for per-frame histograms.

class  gmx::AnalysisDataWeightedHistogramModule
 Data module for per-frame weighted histograms.

class  gmx::AnalysisDataBinAverageModule
 Data module for bin averages.

class  gmx::AnalysisDataLifetimeModule
 Data module for computing lifetime histograms for columns in input data.

class  gmx::AnalysisDataPlotSettings
 Common settings for data plots.

class  gmx::AbstractPlotModule
 Abstract data module for writing data into a file.

class  gmx::AnalysisDataPlotModule
 Plotting module for straightforward plotting of data.

class  gmx::AnalysisDataVectorPlotModule
 Plotting module specifically for data consisting of vectors.
 

Directories

directory analysisdata
 Parallelizable Handling of Output Data (analysisdata)
 
directory tests
 Unit tests for Parallelizable Handling of Output Data (analysisdata).
 

Files

file  datamodulemanager.h
 Declares gmx::AnalysisDataModuleManager.
 
file  datastorage.h
 Declares gmx::AnalysisDataStorage.
 
file  paralleloptions.h
 Declares gmx::AnalysisDataParallelOptions.
 
file  datatest.h
 Helper classes for testing classes that derive from AbstractAnalysisData.
 
file  mock_datamodule.h
 Declares mock implementation of gmx::IAnalysisDataModule.
 
file  abstractdata.h
 Declares gmx::AbstractAnalysisData.
 
file  analysisdata.h
 Declares gmx::AnalysisData and gmx::AnalysisDataHandle.
 
file  arraydata.h
 Declares gmx::AbstractAnalysisArrayData and gmx::AnalysisArrayData.
 
file  dataframe.h
 Declares classes for accessing data frame information.
 
file  datamodule.h
 Declares gmx::IAnalysisDataModule and related convenience classes.
 
file  average.h
 Declares gmx::AnalysisDataAverageModule.
 
file  displacement.h
 Declares gmx::AnalysisDataDisplacementModule.
 
file  histogram.h
 Declares analysis data modules for calculating histograms.
 
file  lifetime.h
 Declares gmx::AnalysisDataLifetimeModule.
 
file  plot.h
 Declares gmx::AnalysisDataPlotModule for plotting data (into a file).