Gromacs  2022.2

#include <gromacs/analysisdata/analysisdata.h>


Description

Parallelizable data container for raw data.

This is the main class used to implement parallelizable data processing in analysis tools. It is used by first creating an object and setting its properties using setDataSetCount(), setColumnCount() and setMultipoint(), and attaching necessary modules using addModule() etc. Then one or more AnalysisDataHandle objects can be created using startData(). Each data handle can then be independently used to provide data frames (each frame must be provided by a single handle, but different frames can be freely mixed between the handles). The finishFrameSerial() method must be called in serial for each frame, after one of the handles has been used to provide the data for that frame. When all data has been provided, the handles are destroyed using finishData() (or AnalysisDataHandle::finishData()).

When used through the trajectory analysis framework, calls to startData(), finishFrameSerial(), and finishData() are handled by the framework.
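
The call order described above can be illustrated with a small self-contained sketch. It is a toy model written for illustration, not GROMACS code: ToyAnalysisData, provideFrame(), processed(), and the integer handle ids are hypothetical stand-ins, and only the ordering rules (properties before startData(), frames provided in any order, finishFrameSerial() strictly in order) are modelled.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <utility>
#include <vector>

// Toy model of the call order only. This is NOT gmx::AnalysisData;
// all names below are hypothetical stand-ins.
class ToyAnalysisData
{
public:
    // Properties must be fixed before the first handle is created.
    void setColumnCount(int columnCount)
    {
        if (started_)
        {
            throw std::logic_error("setColumnCount() called after startData()");
        }
        columnCount_ = columnCount;
    }

    // Returns an integer id that plays the role of a data handle.
    int startData()
    {
        started_ = true;
        ++openHandles_;
        return nextHandle_++;
    }

    // One handle provides one whole frame; frames may arrive in any order.
    void provideFrame(int handle, int frameIndex, std::vector<double> values)
    {
        if (handle < 0 || handle >= nextHandle_)
        {
            throw std::logic_error("unknown handle");
        }
        if (static_cast<int>(values.size()) != columnCount_)
        {
            throw std::invalid_argument("wrong number of columns");
        }
        if (frameIndex < nextSerialFrame_ || pending_.count(frameIndex) > 0)
        {
            throw std::logic_error("frame was already provided");
        }
        pending_.emplace(frameIndex, std::move(values));
    }

    // Serial step: must be called for frames 0, 1, 2, ... in that order.
    void finishFrameSerial(int frameIndex)
    {
        auto it = pending_.find(frameIndex);
        if (frameIndex != nextSerialFrame_ || it == pending_.end())
        {
            throw std::logic_error("frame finished out of order or without data");
        }
        processed_.push_back(std::move(it->second));
        pending_.erase(it);
        ++nextSerialFrame_;
    }

    // Releases a handle; the order relative to startData() does not matter.
    void finishData(int /*handle*/)
    {
        assert(openHandles_ > 0);
        --openHandles_;
    }

    const std::vector<std::vector<double>>& processed() const { return processed_; }

private:
    bool started_         = false;
    int  columnCount_     = 0;
    int  nextHandle_      = 0;
    int  openHandles_     = 0;
    int  nextSerialFrame_ = 0;
    std::map<int, std::vector<double>> pending_;
    std::vector<std::vector<double>>   processed_;
};

// Two handles provide frames out of order; serial processing stays in order.
std::vector<std::vector<double>> runToyExample()
{
    ToyAnalysisData data;
    data.setColumnCount(2);
    const int h0 = data.startData();
    const int h1 = data.startData();
    data.provideFrame(h1, 1, {1.0, 1.5}); // frame 1 arrives first
    data.provideFrame(h0, 0, {0.0, 0.5});
    data.finishFrameSerial(0);
    data.finishFrameSerial(1);
    data.finishData(h0);
    data.finishData(h1);
    return data.processed();
}
```

In the sketch, frame 1 is provided before frame 0, yet the serial step still sees the frames in index order, which is the property the description above relies on.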

Todo:
Parallel implementation is not complete.
Examples:
template.cpp.

Public Member Functions

 AnalysisData ()
 Creates an empty analysis data object.

void setDataSetCount (int dataSetCount)
 Sets the number of data sets.

void setColumnCount (int dataSet, int columnCount)
 Sets the number of columns in a data set.

void setMultipoint (bool bMultipoint)
 Sets whether the data contains multiple points per column per frame.

int frameCount () const override
 Returns the total number of frames in the data.

AnalysisDataHandle startData (const AnalysisDataParallelOptions &opt)
 Creates a handle for adding data.

void finishFrameSerial (int frameIndex)
 Performs in-order sequential processing for the next frame.

void finishData (AnalysisDataHandle handle)
 Destroys a handle after all data has been added.

bool isMultipoint () const
 Whether the data can have multiple points in the same column in the same frame.

int dataSetCount () const
 Returns the number of data sets in the data object.

int columnCount (int dataSet) const
 Returns the number of columns in a data set.

int columnCount () const
 Returns the number of columns in the data.

AnalysisDataFrameRef tryGetDataFrame (int index) const
 Access stored data.

AnalysisDataFrameRef getDataFrame (int index) const
 Access stored data.

bool requestStorage (int nframes)
 Request storage of frames.

void addModule (const AnalysisDataModulePointer &module)
 Adds a module to process the data.

void addColumnModule (int col, int span, const AnalysisDataModulePointer &module)
 Adds a module that processes only a subset of the columns.

void applyModule (IAnalysisDataModule *module)
 Applies a module to process data that is ready.
 

Protected Member Functions

AnalysisDataModuleManager & moduleManager ()
 Returns the module manager to use for calling notification methods.
 
const AnalysisDataModuleManager & moduleManager () const
 Returns the module manager to use for calling notification methods.
 

Friends

class AnalysisDataHandle
 

Constructor & Destructor Documentation

gmx::AnalysisData::AnalysisData ( )

Creates an empty analysis data object.

Exceptions
std::bad_alloc  if out of memory.

Member Function Documentation

void gmx::AbstractAnalysisData::addColumnModule ( int col, int span, const AnalysisDataModulePointer &module )
inherited

Adds a module that processes only a subset of the columns.

Parameters
[in] col     First column.
[in] span    Number of columns.
     module  Module to add.

Throws in the same situations as addModule().

Currently, all data sets are filtered using the same column mask.

Todo:
This method doesn't currently work in all cases with multipoint data or with multiple data sets. In particular, if the added module requests storage and uses getDataFrame(), it will behave unpredictably (most likely asserts).
Todo:
Generalize this method to multiple data sets (e.g., for adding modules that only process a single data set).
See Also
addModule()

void gmx::AbstractAnalysisData::addModule ( const AnalysisDataModulePointer &module )
inherited

Adds a module to process the data.

Parameters
module  Module to add.
Exceptions
std::bad_alloc  if out of memory.
APIError  if
  • module is not compatible with the data object
  • data has already been added to the data object and not all of it is available through getDataFrame().
unspecified  Any exception thrown by module in its notification methods (if data has been added).

If data has already been added to the data object, the new module immediately processes all existing data. APIError is thrown if not all of the data is available through getDataFrame().

The caller can keep a copy of the module pointer if it requires later access to the module.

If the method throws, the state of the data object is not changed. The state of the data module is indeterminate.
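
The behavior of a module that is added after data already exists can be sketched with a toy model. This is an illustrative stand-in, not the GROMACS implementation: ToyModuleData, its storeAll flag, and the callback-based Module type are hypothetical, and std::logic_error takes the place of APIError. The sketch assumes that "available through getDataFrame()" means that all earlier frames are still stored.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Toy model only; NOT gmx::AbstractAnalysisData. All names are hypothetical.
class ToyModuleData
{
public:
    // A "module" here is just a callback that receives (frame index, value).
    using Module = std::function<void(int, double)>;

    explicit ToyModuleData(bool storeAll) : storeAll_(storeAll) {}

    // Adds one single-value frame and notifies every attached module.
    void addFrame(double value)
    {
        const int index = frameCount_++;
        if (storeAll_)
        {
            stored_.push_back(value);
        }
        for (auto& module : modules_)
        {
            module(index, value);
        }
    }

    // A late module first replays all existing frames. If frames exist but
    // are not stored, the call fails (std::logic_error stands in for APIError).
    void addModule(Module module)
    {
        if (frameCount_ > 0 && !storeAll_)
        {
            throw std::logic_error("existing data is not available for replay");
        }
        for (int i = 0; i < frameCount_; ++i)
        {
            module(i, stored_[static_cast<std::size_t>(i)]);
        }
        // Registered only after a successful replay, so a throwing module
        // leaves the data object unchanged.
        modules_.push_back(std::move(module));
    }

private:
    bool                storeAll_;
    int                 frameCount_ = 0;
    std::vector<double> stored_;
    std::vector<Module> modules_;
};

// A summing module added after two frames still sees all three values.
double lateModuleSum()
{
    ToyModuleData data(/*storeAll=*/true);
    data.addFrame(1.0);
    data.addFrame(2.0);
    double sum = 0.0;
    data.addModule([&sum](int, double value) { sum += value; });
    data.addFrame(3.0);
    return sum;
}
```

Registering the module only after the replay succeeds mirrors the guarantee stated above that the data object is unchanged if the method throws.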

void gmx::AbstractAnalysisData::applyModule ( IAnalysisDataModule *module )
inherited

Applies a module to process data that is ready.

Parameters
module  Module to apply.
Exceptions
APIError  in same situations as addModule().
unspecified  Any exception thrown by module in its notification methods.

This function works like addModule(), except that it does not keep a reference to module within the data object after it returns. Also, it can only be called after the data is ready, and only if getDataFrame() gives access to all of the data. It is provided for additional flexibility in postprocessing in-memory data.

Todo:
Currently, this method may not work correctly if module requests storage (addModule() has the same problem if called after data is started).

int gmx::AbstractAnalysisData::columnCount ( int dataSet ) const
inherited

Returns the number of columns in a data set.

Parameters
[in] dataSet  Zero-based index of the data set to query.
Returns
The number of columns in the data set.

If the number of columns is not yet known, returns 0. The returned value does not change after modules have been notified of data start, but may change multiple times before that, depending on the actual data class. Derived classes should set the number of columns with setColumnCount(), within the above limitations.

Does not throw.

int gmx::AbstractAnalysisData::columnCount ( ) const
inherited

Returns the number of columns in the data.

Returns
The number of columns in the data.

This is a convenience method for data objects with a single data set. Can only be called if dataSetCount() == 1.

Does not throw.

See Also
columnCount(int)

int gmx::AbstractAnalysisData::dataSetCount ( ) const
inherited

Returns the number of data sets in the data object.

Returns
The number of data sets in the data.

If the number is not yet known, returns 0. The returned value does not change after modules have been notified of data start, but may change multiple times before that, depending on the actual data class. Derived classes should set the number of data sets with setDataSetCount(), within the above limitations.

Does not throw.

void gmx::AnalysisData::finishData ( AnalysisDataHandle  handle)

Destroys a handle after all data has been added.

Parameters
[in] handle  Handle to destroy.
Exceptions
unspecified  Any exception thrown by attached data modules in IAnalysisDataModule::dataFinished().

handle must have been obtained from startData() of this object. The order of the calls with respect to the corresponding startData() calls is not important.

The handle (and any copies) are invalid after the call.

void gmx::AnalysisData::finishFrameSerial ( int  frameIndex)

Performs in-order sequential processing for the next frame.

Parameters
[in] frameIndex  Index of the frame that has been finished.
Exceptions
unspecified  Any exception thrown by attached data modules in IAnalysisDataModule::frameFinishedSerial().

This method should be called sequentially for each frame, after data for that frame has been produced. It is not necessary to call this method if there is no parallelism, i.e., if only a single data handle is created and the parallelization options provided at that time do not indicate parallelism.

int gmx::AnalysisData::frameCount ( ) const
override virtual

Returns the total number of frames in the data.

Returns
The total number of frames in the data.

This function returns the number of frames that the object has produced. If requestStorage() has been successfully called, tryGetDataFrame() or getDataFrame() can be used to access some or all of these frames.

Does not throw.

Derived classes should implement this to return the number of frames. The frame count should not be incremented before tryGetDataFrameInternal() can return the new frame. The frame count must be incremented before AnalysisDataModuleManager::notifyFrameFinish() is called.

Implements gmx::AbstractAnalysisData.

AnalysisDataFrameRef gmx::AbstractAnalysisData::getDataFrame ( int  index) const
inherited

Access stored data.

Parameters
[in] index  Zero-based frame index to access.
Returns
Frame reference to frame index.
Exceptions
APIError  if the requested frame is not accessible.

If it is not certain that the frame is available, use tryGetDataFrame() instead.

See Also
requestStorage()
tryGetDataFrame()

bool gmx::AbstractAnalysisData::isMultipoint ( ) const
inherited

Whether the data can have multiple points in the same column in the same frame.

Returns
true if multiple points in the same column are allowed within a single frame.

This kind of data can appear in many histogramming applications (e.g., RDFs), where each trajectory frame has several data points (possibly a different number for each frame). The current interface doesn't support storing such data, but this should rarely be necessary.

The returned value does not change after modules have been notified of data start. Derived classes can change the type by calling setMultipoint() subject to the above restriction. If this is not done, the function always returns false.

Does not throw.

bool gmx::AbstractAnalysisData::requestStorage ( int  nframes)
inherited

Request storage of frames.

Parameters
[in] nframes  Request storing at least nframes previous frames (-1 = request storing all). Must be >= -1.
Returns
true if the request could be satisfied.

If called multiple times, the largest request is honored.

Does not throw. Failure to honor the request is indicated through the return value.

See Also
getDataFrame()
tryGetDataFrame()
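
The rule that the largest request is honored can be sketched as a small merge function. This is an illustration only, not the GROMACS implementation: mergeStorageRequest() is a hypothetical helper, and treating -1 ("store all") as larger than any finite request is an assumption that the text above does not state explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>

// Hypothetical helper, not part of GROMACS: combines an existing storage
// request with a new one so that the larger request wins.
// Assumption: -1 ("store all frames") dominates every finite request.
int mergeStorageRequest(int current, int requested)
{
    if (current < -1 || requested < -1)
    {
        throw std::invalid_argument("storage request must be >= -1");
    }
    if (current == -1 || requested == -1)
    {
        return -1; // store everything
    }
    return std::max(current, requested);
}
```

Under these assumptions, a module asking for 2 previous frames after another asked for 5 leaves the effective request at 5, and any request for -1 switches to storing all frames.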

void gmx::AnalysisData::setColumnCount ( int dataSet, int columnCount )

Sets the number of columns in a data set.

Parameters
[in] dataSet      Zero-based data set index.
[in] columnCount  Number of columns in the data (must be > 0).
Exceptions
APIError  if modules have been added that are not compatible with the new column count.

Must be called before startData() for each data set. Must not be called after startData() has been called. If called multiple times for a data set, the last call takes effect.

void gmx::AnalysisData::setDataSetCount ( int  dataSetCount)

Sets the number of data sets.

Parameters
[in] dataSetCount  Number of data sets (must be > 0).
Exceptions
std::bad_alloc  if out of memory.
APIError  if modules have been added that are not compatible with the new data set count.

Must not be called after startData() has been called. If not called, a single data set is assumed. If called multiple times, the last call takes effect.

void gmx::AnalysisData::setMultipoint ( bool  bMultipoint)

Sets whether the data contains multiple points per column per frame.

Parameters
[in] bMultipoint  Whether the data will allow multiple points per column within a single frame.
Exceptions
APIError  if modules have been added that are not compatible with the new setting.

If this method is not called, the data is not multipoint.

Must not be called after startData() has been called.

See Also
isMultipoint()

AnalysisDataHandle gmx::AnalysisData::startData ( const AnalysisDataParallelOptions &opt )

Creates a handle for adding data.

Parameters
[in] opt  Options for setting how this handle will be used.
Returns
The created handle.
Exceptions
std::bad_alloc  if out of memory.
APIError  if any attached data module is not compatible.
unspecified  Any exception thrown by attached data modules in IAnalysisDataModule::dataStarted().

The caller should retain the returned handle (or a copy of it), and pass it to finishData() after successfully adding all data. The caller should discard the returned handle if an error occurs; memory allocated for the handle will be freed when the AnalysisData object is destroyed.

The opt options should be the same for all calls to this method, and the number of calls should match the parallelization factor defined in opt.

AnalysisDataFrameRef gmx::AbstractAnalysisData::tryGetDataFrame ( int  index) const
inherited

Access stored data.

Parameters
[in] index  Zero-based frame index to access.
Returns
Frame reference to frame index, or an invalid reference if no such frame is available.

Does not throw. Failure to access a frame with the given index is indicated through the return value. A negative index is allowed and always results in an invalid reference being returned.

See Also
requestStorage()
getDataFrame()
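
The difference between the throwing and non-throwing accessors can be sketched with a toy model. It is not GROMACS code: ToyFrameRef and ToyStorage are hypothetical stand-ins, std::out_of_range takes the place of APIError, and only the "invalid reference versus exception" contrast is modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a frame reference; only validity is modelled.
struct ToyFrameRef
{
    const std::vector<double>* values = nullptr;
    bool isValid() const { return values != nullptr; }
};

// Toy frame store; NOT gmx::AbstractAnalysisData.
class ToyStorage
{
public:
    void addFrame(std::vector<double> values) { frames_.push_back(std::move(values)); }

    // Never throws: a negative or out-of-range index yields an invalid reference.
    ToyFrameRef tryGetDataFrame(int index) const
    {
        ToyFrameRef ref;
        if (index >= 0 && index < static_cast<int>(frames_.size()))
        {
            ref.values = &frames_[static_cast<std::size_t>(index)];
        }
        return ref;
    }

    // Throws if the frame is not accessible (std::out_of_range stands in
    // for APIError in this sketch).
    ToyFrameRef getDataFrame(int index) const
    {
        const ToyFrameRef ref = tryGetDataFrame(index);
        if (!ref.isValid())
        {
            throw std::out_of_range("requested frame is not accessible");
        }
        return ref;
    }

private:
    std::vector<std::vector<double>> frames_;
};
```

A caller that is unsure whether a frame is stored would check isValid() on the result of tryGetDataFrame(), while a caller that has arranged storage in advance can use getDataFrame() and treat a failure as a programming error.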
