Gromacs
2018.2
|
GROMACS installation includes a template for writing trajectory analysis tools using Framework for trajectory analysis. It can be found from share/gromacs/template/
under the installation directory, and from share/template/
in the source distribution.
The full source code for the file is also included in this documentation: template.cpp The rest of this page walks through the code to explain the different parts.
We start by including some generic C++ headers:
We then define a class that implements our analysis tool:
For the template, we do not need any custom frame-local data. If you think you need some for more complex analysis needs, see documentation of gmx::TrajectoryAnalysisModuleData for more details. If you do not care about parallelization, you do not need to consider this part. You can simply declare all variables in the module class itself, initialize them in gmx::TrajectoryAnalysisModule::initAnalysis(), and do any postprocessing in gmx::TrajectoryAnalysisModule::finishAnalysis()).
The constructor (and possible destructor) of the analysis module should be simple: the constructor should just initialize default values, and the destructor should free any memory managed by the module. For the template, we have no attributes in our class that need to be explicitly freed, so we declare only a constructor:
Initialization of the module is split into a few methods, two of which are used in the template. gmx::TrajectoryAnalysisModule::initOptions() is used to set up options understood by the module, as well as for setting up different options through gmx::TrajectoryAnalysisSettings (see the documentation of that class for more details):
For additional documentation on how to define different kinds of options, see gmx::IOptionsContainer, basicoptions.h, and gmx::SelectionOption. You only need to define options that are specific to the analysis; common options, e.g., for specifying input topology and trajectories are added by the framework.
To adjust settings or selection options (e.g., the number of accepted selections) based on option values, you need to override gmx::TrajectoryAnalysisModule::optionsFinished(). For simplicity, this is not done in the template.
The actual analysis is initialized in gmx::TrajectoryAnalysisModule::initAnalysis():
One of the main tasks of this method is to set up appropriate gmx::AnalysisData objects and modules for them (see gmx::TrajectoryAnalysisModule for the general approach). These objects will be used to process output from the tool. Their main purpose is to support parallelization, but even if you don't care about parallelism, they still provide convenient building blocks, e.g., for histogramming and file output.
For the template, we first set the cutoff for the neighborhood search.
Then, we create and register one gmx::AnalysisData object that will contain, for each frame, one column for each input selection. This will contain the main output from the tool: minimum distance between the reference selection and that particular selection. We then create and setup a module that will compute the average distance for each selection (see writeOutput() for how it is used). Finally, if an output file has been provided, we create and setup a module that will plot the per-frame distances to a file.
If the analysis module needs some temporary storage during processing of a frame (i.e., it uses a custom class derived from gmx::TrajectoryAnalysisModuleData), this should be allocated in gmx::TrajectoryAnalysisModule::startFrames() (see below) if parallelization is to be supported.
If you need to do initialization based on data from the first frame (most commonly, based on the box size), you need to override gmx::TrajectoryAnalysisModule::initAfterFirstFrame(), but this is not used in the template.
There is one more initialization method that needs to be overridden to support automatic parallelization: gmx::TrajectoryAnalysisModule::startFrames(). If you do not need custom frame-local data (or parallelization at all), you can skip this method and ignore the last parameter to gmx::TrajectoryAnalysisModule::analyzeFrame() to make things simpler. In the template, this method is not necessary.
The main part of the analysis is (in most analysis codes) done in the gmx::TrajectoryAnalysisModule::analyzeFrame() method, which is called once for each frame:
frnr
parameter gives a zero-based index of the current frame (mostly for use with gmx::AnalysisData), pbc
contains the PBC information for the current frame for distance calculations with, e.g., pbc_dx(), and pdata
points to a data structure created in startFrames(). Although usually not necessary (except for the time field), raw frame data can be accessed through fr
. In most cases, the analysis should be written such that it gets all position data through selections, and does not assume a constant size for them. This is all that is required to support the full flexibility of the selection engine.
For the template, we first get data from our custom data structure for shorthand access (if you use a custom data object, you need a static_cast
here):
We then do a simple calculation and use the AnalysisDataHandle class to set the per-frame output for the tool:
After all the frames have been processed, gmx::TrajectoryAnalysisModule::finishAnalysis() is called once. This is the place to do any custom postprocessing of the data. For the template, we do nothing, because all necessary processing is done in the data modules:
If the data structure created in gmx::TrajectoryAnalysisModule::startFrames() is used to aggregate data across frames, you need to override gmx::TrajectoryAnalysisModule::finishFrames() to combine the data from the data structures (see documentation of the method for details). This is not necessary for the template, because the ModuleData structure only contains data used during the analysis of a single frame.
Finally, most programs need to write out some values after the analysis is complete. In some cases, this can be achieved with proper chaining of data modules, but often it is necessary to do some custom processing. All such activities should be done in gmx::TrajectoryAnalysisModule::writeOutput(). This makes it easier to reuse analysis modules in, e.g., scripting languages, where output into files might not be desired. The template simply prints out the average distances for each analysis group:
avem_
module, which we initialized in initAnalysis() to aggregate the averages of the computed distances.
Now, the only thing remaining is to define the main() function. To implement a command-line tool, it should create a module and run it using gmx::TrajectoryAnalysisCommandLineRunner using the boilerplate code below: