Gromacs  2026.0-dev-20250711-6857db4

#include <gromacs/analysisdata/analysisdata.h>


Description

Parallelizable data container for raw data.

This is the main class used to implement parallelizable data processing in analysis tools. It is used by first creating an object and setting its properties using setDataSetCount(), setColumnCount() and setMultipoint(), and attaching necessary modules using addModule() etc. Then one or more AnalysisDataHandle objects can be created using startData(). Each data handle can then be independently used to provide data frames (each frame must be provided by a single handle, but different frames can be freely mixed between the handles). The finishFrameSerial() method must be called in serial for each frame, after one of the handles has been used to provide the data for that frame. When all data has been provided, the handles are destroyed using finishData() (or AnalysisDataHandle::finishData()).

When used through the trajectory analysis framework, calls to startData(), finishFrameSerial(), and finishData() are handled by the framework.
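The call order described above can be illustrated with a small standalone model. This is only a sketch: ToyData and its integer handle ids are hypothetical stand-ins written for this example. They do not use the GROMACS headers and mimic only the ordering rules (configure first, then startData(), then frames from any handle, then finishData()).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a data container; this is NOT gmx::AnalysisData.
class ToyData
{
public:
    // Configuration is only accepted before the first handle exists.
    void setColumnCount(std::size_t columnCount)
    {
        if (started_)
        {
            throw std::logic_error("setColumnCount() called after startData()");
        }
        columnCount_ = columnCount;
    }

    // Creates a handle (here just an id) and freezes the configuration.
    int startData()
    {
        started_ = true;
        ++openHandles_;
        return nextHandleId_++;
    }

    // Any handle may provide any frame, but each frame is provided only once.
    void addFrame(int /*handle*/, std::size_t frameIndex, const std::vector<double>& values)
    {
        assert(values.size() == columnCount_);
        const bool inserted = frames_.emplace(frameIndex, values).second;
        if (!inserted)
        {
            throw std::logic_error("frame provided twice");
        }
    }

    // Releases a handle once it has no more data to add.
    void finishData(int /*handle*/)
    {
        assert(openHandles_ > 0);
        --openHandles_;
    }

    std::size_t frameCount() const { return frames_.size(); }
    bool allHandlesFinished() const { return started_ && openHandles_ == 0; }

private:
    std::size_t columnCount_  = 0;
    bool        started_      = false;
    int         openHandles_  = 0;
    int         nextHandleId_ = 0;
    std::map<std::size_t, std::vector<double>> frames_;
};
```

In this model, two handles can fill frames 1 and 0 in either order, while a call to setColumnCount() after startData() is rejected.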

Todo:
Parallel implementation is not complete.
Examples:
template.cpp.

Public Member Functions

 AnalysisData ()
 Creates an empty analysis data object.

void setDataSetCount (size_t dataSetCount)
 Sets the number of data sets.

void setColumnCount (size_t dataSet, size_t columnCount)
 Sets the number of columns in a data set.

void setMultipoint (bool bMultipoint)
 Sets whether the data contains multiple points per column per frame.

size_t frameCount () const override
 Returns the total number of frames in the data.

AnalysisDataHandle startData (const AnalysisDataParallelOptions &opt)
 Creates a handle for adding data.

void finishFrameSerial (size_t frameIndex)
 Performs in-order sequential processing for the next frame.

void finishData (AnalysisDataHandle handle)
 Destroys a handle after all data has been added.

bool isMultipoint () const
 Whether the data can have multiple points in the same column in the same frame.

size_t dataSetCount () const
 Returns the number of data sets in the data object.

size_t columnCount (size_t dataSet) const
 Returns the number of columns in a data set.

size_t columnCount () const
 Returns the number of columns in the data.

AnalysisDataFrameRef tryGetDataFrame (size_t index) const
 Access stored data.

AnalysisDataFrameRef getDataFrame (size_t index) const
 Access stored data.

bool requestStorage (int nframes)
 Request storage of frames.

void addModule (const AnalysisDataModulePointer &module)
 Adds a module to process the data.

void addColumnModule (size_t col, size_t span, const AnalysisDataModulePointer &module)
 Adds a module that processes only a subset of the columns.

void applyModule (IAnalysisDataModule *module)
 Applies a module to process data that is ready.

Constructor & Destructor Documentation

gmx::AnalysisData::AnalysisData ( )

Creates an empty analysis data object.

Exceptions
std::bad_alloc if out of memory.

Member Function Documentation

void gmx::AbstractAnalysisData::addColumnModule ( size_t  col,
size_t  span,
const AnalysisDataModulePointer &  module
)
inherited

Adds a module that processes only a subset of the columns.

Parameters
[in] col First column.
[in] span Number of columns.
module Module to add.

Throws in the same situations as addModule().

Currently, all data sets are filtered using the same column mask.

Todo:
This method doesn't currently work in all cases with multipoint data or with multiple data sets. In particular, if the added module requests storage and uses getDataFrame(), it will behave unpredictably (most likely failing an assertion).
Todo:
Generalize this method to multiple data sets (e.g., for adding modules that only process a single data set).
See Also
addModule()

void gmx::AbstractAnalysisData::addModule ( const AnalysisDataModulePointer &  module)
inherited

Adds a module to process the data.

Parameters
module Module to add.
Exceptions
std::bad_alloc if out of memory.
APIError if
  • module is not compatible with the data object
  • data has already been added to the data object and not all of it is available through getDataFrame().
unspecified Any exception thrown by module in its notification methods (if data has been added).

If data has already been added to the data object, the new module immediately processes all existing data. APIError is thrown if not all of the data is available through getDataFrame().

The caller can keep a copy of the module pointer if it requires later access to the module.

If the method throws, the state of the data object is not changed. The state of the data module is indeterminate.
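The replay behaviour described above can be sketched with a standalone model. ToyReplayData and its Module alias are hypothetical names invented for this illustration; std::logic_error stands in for APIError, and frames are reduced to a single double. The sketch does not use the GROMACS headers and makes no claim about how the library implements this internally.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical model of "replay existing data to a late module"; NOT GROMACS code.
class ToyReplayData
{
public:
    using Module = std::function<void(double)>;

    explicit ToyReplayData(bool storeAll) : storeAll_(storeAll) {}

    // Produces one single-value frame and notifies all attached modules.
    void addFrame(double value)
    {
        ++produced_;
        if (storeAll_)
        {
            stored_.push_back(value);
        }
        for (const Module& module : modules_)
        {
            module(value);
        }
    }

    // Attaches a module; frames produced earlier are replayed to it first.
    void addModule(Module module)
    {
        assert(module != nullptr);
        if (stored_.size() != produced_)
        {
            // Models the documented error case: some earlier data is no longer available.
            throw std::logic_error("earlier frames are not all stored");
        }
        for (double value : stored_)
        {
            module(value);
        }
        // The container is modified only after the replay has succeeded.
        modules_.push_back(std::move(module));
    }

    std::size_t moduleCount() const { return modules_.size(); }

private:
    bool                storeAll_;
    std::size_t         produced_ = 0;
    std::vector<double> stored_;
    std::vector<Module> modules_;
};
```

Registering the module only after the replay loop keeps the container unchanged if the check or the module itself throws, which matches the guarantee stated above for the data object.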

void gmx::AbstractAnalysisData::applyModule ( IAnalysisDataModule *  module)
inherited

Applies a module to process data that is ready.

Parameters
module Module to apply.
Exceptions
APIError in same situations as addModule().
unspecified Any exception thrown by module in its notification methods.

This function works as addModule(), except that it does not keep a reference to module within the data object after it returns. Also, it can only be called after the data is ready, and only if getDataFrame() gives access to all of the data. It is provided for additional flexibility in postprocessing in-memory data.

Todo:
Currently, this method may not work correctly if module requests storage (addModule() has the same problem if called after data is started).

size_t gmx::AbstractAnalysisData::columnCount ( size_t  dataSet) const
inherited

Returns the number of columns in a data set.

Parameters
[in] dataSet Zero-based index of the data set to query.
Returns
The number of columns in the data.

If the number of columns is not yet known, returns 0. The returned value does not change after modules have been notified of data start, but may change multiple times before that, depending on the actual data class.

Does not throw.

size_t gmx::AbstractAnalysisData::columnCount ( ) const
inherited

Returns the number of columns in the data.

Returns
The number of columns in the data.

This is a convenience method for data objects with a single data set. Can only be called if dataSetCount() == 1.

Does not throw.

See Also
columnCount(size_t)

size_t gmx::AbstractAnalysisData::dataSetCount ( ) const
inherited

Returns the number of data sets in the data object.

Returns
The number of data sets in the data.

If the number is not yet known, returns 0. The returned value does not change after modules have been notified of data start, but may change multiple times before that, depending on the actual data class.

Does not throw.

void gmx::AnalysisData::finishData ( AnalysisDataHandle  handle)

Destroys a handle after all data has been added.

Parameters
[in] handle Handle to destroy.
Exceptions
unspecified Any exception thrown by attached data modules in IAnalysisDataModule::dataFinished().

handle must have been obtained from startData() of this object. The order of the calls with respect to the corresponding startData() calls is not important.

The handle (and any copies) are invalid after the call.

void gmx::AnalysisData::finishFrameSerial ( size_t  frameIndex)

Performs in-order sequential processing for the next frame.

Parameters
[in] frameIndex Index of the frame that has been finished.
Exceptions
unspecified Any exception thrown by attached data modules in IAnalysisDataModule::frameFinishedSerial().

This method should be called sequentially for each frame, after data for that frame has been produced. It is not necessary to call this method if there is no parallelism, i.e., if only a single data handle is created and the parallelization options provided at that time do not indicate parallelism.
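When frames are produced out of order, the caller has to hold back the serial step until every earlier frame is done. One possible caller-side pattern is sketched below. InOrderFinisher is a hypothetical helper written for this example, not part of GROMACS; the callback passed to it stands in for a call such as finishFrameSerial(frameIndex).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <utility>

// Hypothetical helper: remembers which frames are complete and invokes a
// serial callback strictly in increasing frame order.
class InOrderFinisher
{
public:
    explicit InOrderFinisher(std::function<void(std::size_t)> finishSerial) :
        finishSerial_(std::move(finishSerial))
    {
    }

    // Call once when the data for frameIndex has been fully produced.
    void frameProduced(std::size_t frameIndex)
    {
        // A frame that was already finished serially must not be reported again.
        assert(frameIndex >= next_);
        ready_.insert(frameIndex);
        // Flush every frame that is now contiguous with the last finished one.
        while (!ready_.empty() && *ready_.begin() == next_)
        {
            ready_.erase(ready_.begin());
            finishSerial_(next_);
            ++next_;
        }
    }

private:
    std::function<void(std::size_t)> finishSerial_;
    std::set<std::size_t>            ready_;
    std::size_t                      next_ = 0;
};
```

If frames 2, 0 and 1 are reported in that order, the callback runs for frame 0 as soon as it arrives and for frames 1 and 2 once frame 1 arrives. The helper itself is not thread-safe; in a threaded setting frameProduced() would need external synchronization.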

size_t gmx::AnalysisData::frameCount ( ) const
override virtual

Returns the total number of frames in the data.

Returns
The total number of frames in the data.

This function returns the number of frames that the object has produced. If requestStorage() has been successfully called, tryGetDataFrame() or getDataFrame() can be used to access some or all of these frames.

Does not throw.

Implements gmx::AbstractAnalysisData.

AnalysisDataFrameRef gmx::AbstractAnalysisData::getDataFrame ( size_t  index) const
inherited

Access stored data.

Parameters
[in] index Zero-based frame index to access.
Returns
Frame reference to frame index.
Exceptions
APIError if the requested frame is not accessible.

If it is not certain that the data is available, use tryGetDataFrame() instead.

See Also
requestStorage()
tryGetDataFrame()

bool gmx::AbstractAnalysisData::isMultipoint ( ) const
inherited

Whether the data can have multiple points in the same column in the same frame.

Returns
true if multiple points in the same column are allowed within a single frame.

This kind of data can appear in many histogramming applications (e.g., RDFs), where each trajectory frame has several data points (possibly a different number for each frame). The current interface doesn't support storing such data, but this should rarely be necessary.

The returned value does not change after modules have been notified of data start.

Does not throw.

bool gmx::AbstractAnalysisData::requestStorage ( int  nframes)
inherited

Request storage of frames.

Parameters
[in] nframes Request storing at least nframes previous frames (-1 = request storing all). Must be >= -1.
Returns
true if the request could be satisfied.

If called multiple times, the largest request is honored.

Does not throw. Failure to honor the request is indicated through the return value.
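The rule that the largest request wins can be sketched as follows. StorageRequest is a hypothetical class written for this illustration, not GROMACS code. Treating -1 ("store all") as larger than any finite request is an assumption of the sketch, and the model does not cover the case where a request cannot be satisfied.

```cpp
#include <cassert>

// Hypothetical model of the request-merging rule; NOT GROMACS code.
class StorageRequest
{
public:
    // nframes: number of previous frames to keep, or -1 to keep all frames.
    void request(int nframes)
    {
        assert(nframes >= -1);
        if (storeAll_)
        {
            return; // Assumption: nothing can exceed "store all".
        }
        if (nframes == -1)
        {
            storeAll_ = true;
        }
        else if (nframes > previousFrames_)
        {
            previousFrames_ = nframes;
        }
    }

    bool storesAll() const { return storeAll_; }
    // Largest finite request seen so far; not meaningful once storesAll() is true.
    int previousFrames() const { return previousFrames_; }

private:
    bool storeAll_       = false;
    int  previousFrames_ = 0;
};
```

With this model, requests for 2, 5 and 3 frames leave 5 frames in effect, and a later request of -1 switches to storing everything.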

See Also
getDataFrame()
tryGetDataFrame()

void gmx::AnalysisData::setColumnCount ( size_t  dataSet,
size_t  columnCount 
)

Sets the number of columns in a data set.

Parameters
[in] dataSet Zero-based data set index.
[in] columnCount Number of columns in the data (must be > 0).
Exceptions
APIError if modules have been added that are not compatible with the new column count.

Must be called before startData() for each data set. Must not be called after startData() has been called. If called multiple times for a data set, the last call takes effect.

void gmx::AnalysisData::setDataSetCount ( size_t  dataSetCount)

Sets the number of data sets.

Parameters
[in] dataSetCount Number of data sets (must be > 0).
Exceptions
std::bad_alloc if out of memory.
APIError if modules have been added that are not compatible with the new data set count.

Must not be called after startData() has been called. If not called, a single data set is assumed. If called multiple times, the last call takes effect.

void gmx::AnalysisData::setMultipoint ( bool  bMultipoint)

Sets whether the data contains multiple points per column per frame.

Parameters
[in] bMultipoint Whether the data will allow multiple points per column within a single frame.
Exceptions
APIError if modules have been added that are not compatible with the new setting.

If this method is not called, the data is not multipoint.

Must not be called after startData() has been called.

See Also
isMultipoint()

AnalysisDataHandle gmx::AnalysisData::startData ( const AnalysisDataParallelOptions &  opt)

Creates a handle for adding data.

Parameters
[in] opt Options for setting how this handle will be used.
Returns
The created handle.
Exceptions
std::bad_alloc if out of memory.
APIError if any attached data module is not compatible.
unspecified Any exception thrown by attached data modules in IAnalysisDataModule::dataStarted().

The caller should retain the returned handle (or a copy of it), and pass it to finishData() after successfully adding all data. The caller should discard the returned handle if an error occurs; memory allocated for the handle will be freed when the AnalysisData object is destroyed.

The opt options should be the same for all calls to this method, and the number of calls should match the parallelization factor defined in opt.

AnalysisDataFrameRef gmx::AbstractAnalysisData::tryGetDataFrame ( size_t  index) const
inherited

Access stored data.

Parameters
[in] index Zero-based frame index to access.
Returns
Frame reference to frame index, or an invalid reference if no such frame is available.

Does not throw. Failure to access a frame with the given index is indicated through the return value. Negative index is allowed, and will always result in an invalid reference being returned.

See Also
requestStorage()
getDataFrame()
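The difference between the throwing and non-throwing accessors can be sketched with a standalone model. ToyFrameRef and ToyFrameStore are hypothetical types invented for this example. A null pointer stands in for an invalid frame reference and std::out_of_range stands in for APIError. Nothing here uses the GROMACS headers.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical frame reference; a null pointer stands in for "invalid".
class ToyFrameRef
{
public:
    ToyFrameRef() = default;
    explicit ToyFrameRef(const double* value) : value_(value) {}

    bool isValid() const { return value_ != nullptr; }

    // Only valid references may be dereferenced.
    double value() const
    {
        assert(isValid());
        return *value_;
    }

private:
    const double* value_ = nullptr;
};

// Hypothetical store of single-value frames; NOT GROMACS code.
class ToyFrameStore
{
public:
    // Note: adding a frame may invalidate references handed out earlier.
    void addFrame(double value) { frames_.push_back(value); }

    // Non-throwing lookup: failure is reported through the returned reference.
    ToyFrameRef tryGetFrame(std::size_t index) const
    {
        if (index < frames_.size())
        {
            return ToyFrameRef(&frames_[index]);
        }
        return ToyFrameRef();
    }

    // Throwing lookup built on top of the non-throwing one.
    ToyFrameRef getFrame(std::size_t index) const
    {
        const ToyFrameRef ref = tryGetFrame(index);
        if (!ref.isValid())
        {
            throw std::out_of_range("requested frame is not accessible");
        }
        return ref;
    }

private:
    std::vector<double> frames_;
};
```

Code that cannot be sure a frame is stored would use the non-throwing lookup and check isValid() on the result. Code that treats a missing frame as a programming error can use the throwing one.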
