Gromacs  2022.2
Selection parsing

The selection parser is implemented in the following files:

The basic control flow in the parser is as follows. When a parser function in SelectionCollection is called, it performs some initialization and then calls the Bison-generated _gmx_sel_yyparse() function. That function repeatedly calls _gmx_sel_yylex() to read tokens from the input (more complex tasks related to token recognition and bookkeeping are handled by functions in scanner_internal.cpp) and uses the grammar rules to decide what to do with them. Whenever a grammar rule matches, a corresponding function in parsetree.cpp is called to construct either a temporary representation of the object or a gmx::SelectionTreeElement object (some simple rules are handled directly in parser.y). Once a complete selection has been parsed, the functions in parsetree.cpp also update the gmx_ana_selcollection_t structure appropriately.

The rest of this page describes the resulting gmx::SelectionTreeElement object tree. Before the selections can be evaluated, this tree must be passed to the selection compiler, which is described on a separate page: Selection compilation.

Element tree constructed by the parser

The parser initializes the following fields in all selection elements: gmx::SelectionTreeElement::name, gmx::SelectionTreeElement::type, gmx::SelectionTreeElement::v.type, gmx::SelectionTreeElement::flags, gmx::SelectionTreeElement::child, and gmx::SelectionTreeElement::next. Some other fields are also initialized for particular element types, as discussed below. Fields that are not initialized are set to zero, NULL, or another similar value.

Root elements

The parser creates a SEL_ROOT selection element for each variable assignment and each selection. However, there are two exceptions that do not result in a SEL_ROOT element (in these cases, only the symbol table is modified):

The SEL_ROOT elements are linked together in a chain in the same order as in the input.
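
The common fields and the root chain can be pictured with a much-simplified model. The sketch below is not the real gmx::SelectionTreeElement class: ToyElement, ToyType, ToyValue, appendRoot() and rootNames() are hypothetical names introduced only for illustration, and the use of std::shared_ptr for the child and next links is an assumption.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the element and value types; not the GROMACS enums.
enum class ToyType { Root, Const, Expression, Modifier, SubExpr, SubExprRef, Boolean, Arithmetic };
enum class ToyValue { None, Group };

struct ToyElement;
using ToyElementPtr = std::shared_ptr<ToyElement>;

// Simplified element: fields the parser does not set keep zero/null defaults.
struct ToyElement
{
    std::string   name;
    ToyType       type      = ToyType::Root;
    ToyValue      valueType = ToyValue::None;
    unsigned      flags     = 0;
    ToyElementPtr child; // first child, or null
    ToyElementPtr next;  // next sibling (for roots: next root in the chain), or null
};

// Appends a new root at the end of the chain so that input order is preserved.
// Returns the (possibly new) head of the chain.
ToyElementPtr appendRoot(ToyElementPtr head, const std::string& name)
{
    auto root  = std::make_shared<ToyElement>();
    root->name = name;
    if (!head)
    {
        return root;
    }
    ToyElement* last = head.get();
    while (last->next)
    {
        last = last->next.get();
    }
    last->next = root;
    return head;
}

// Walks the chain through the next links and collects the root names in order.
std::vector<std::string> rootNames(const ToyElementPtr& head)
{
    std::vector<std::string> names;
    for (const ToyElement* e = head.get(); e != nullptr; e = e->next.get())
    {
        names.push_back(e->name);
    }
    return names;
}
```

Calling appendRoot() once per parsed statement and then rootNames() on the head yields the names in the order the statements were given.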

The children of the SEL_ROOT elements can be used to distinguish the two types of root elements from each other:

The name of the selection/variable is stored in gmx::SelectionTreeElement::cgrp.name. It is set to either the name provided by the user or the selection string for selections not explicitly named by the user. SEL_ROOT or SEL_SUBEXPR elements do not appear anywhere else.

Constant elements

SEL_CONST elements are created for every constant that is required for later evaluation. Currently, SEL_CONST elements can be present for

For group-valued elements, the value is stored in gmx::SelectionTreeElement::cgrp; other types of values are stored in gmx::SelectionTreeElement::v. Constants that appear as parameters for selection methods are not present in the selection tree unless their value type is GROUP_VALUE. SEL_CONST elements have no children.

Method evaluation elements

SEL_EXPRESSION and SEL_MODIFIER elements are treated very similarly. The gmx_ana_selmethod_t structure corresponding to the evaluation method is in gmx::SelectionTreeElement::method, and the method data in gmx::SelectionTreeElement::mdata has been allocated using sel_datafunc(). If a non-standard reference position type was set, gmx::SelectionTreeElement::pc has also been created, but only the type has been set. All children of these elements are of the type SEL_SUBEXPRREF, and each describes a selection that needs to be evaluated to obtain a value for one parameter of the method. No children are present for parameters that were given a constant non-GROUP_VALUE value. The children are sorted in the order in which the parameters appear in the gmx_ana_selmethod_t structure.
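
The ordering rule for the children can be sketched as follows. This is an illustration only, not code from parsetree.cpp: ToyArgument and childParameterOrder() are hypothetical, and the parameter names in the usage note are made up. The sketch assumes each argument is tagged with whether it needs a child element, that is, whether it is anything other than a constant non-GROUP_VALUE value.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one argument as typed by the user.
struct ToyArgument
{
    std::string param;           // name of the method parameter it binds to
    bool        needsEvaluation; // false for constant non-group values
};

// Returns the parameter names that get a child element, in the order in which
// the parameters are declared by the method, not in the order they were typed.
std::vector<std::string> childParameterOrder(const std::vector<std::string>& declared,
                                             const std::vector<ToyArgument>& args)
{
    std::vector<std::string> children;
    for (const std::string& name : declared)
    {
        for (const ToyArgument& arg : args)
        {
            if (arg.param == name && arg.needsEvaluation)
            {
                children.push_back(name);
            }
        }
    }
    return children;
}
```

For a method that declares the parameters radius, mask, reference (made-up names) and input that supplies reference, a constant radius, and mask in that order, the function returns mask followed by reference; the constant radius produces no child.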

In addition to actual selection keywords, SEL_EXPRESSION elements are used internally to implement numerical comparisons (e.g., "x < 5") and keyword matching (e.g., "resnr 1 to 3" or "name CA").

Subexpression elements

SEL_SUBEXPR elements only appear for variables, as described above. gmx::SelectionTreeElement::name points to the name of the variable (from the SEL_ROOT element). The element always has exactly one child, which represents the value of the variable.

SEL_SUBEXPRREF elements are used for two purposes:

Boolean elements

One SEL_BOOLEAN element is created for each boolean keyword in the input, and the tree structure represents the evaluation order. The gmx::SelectionTreeElement::boolt field gives the type of the operation. Each element has exactly two children (one for BOOL_NOT elements), which are in the order given in the input. The children always have a GROUP_VALUE value type, but they can be of different element types.
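
To illustrate how such a tree encodes evaluation order, here is a minimal sketch that evaluates a toy boolean tree over sets of atom indices. It is not the GROMACS evaluation code: ToyNode, ToyBool, makeLeaf(), makeBoolean() and evaluate() are hypothetical, children are held in a std::vector rather than through child/next links, and negation is assumed to be taken relative to an explicit set of all atoms.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <memory>
#include <set>
#include <utility>
#include <vector>

using AtomSet = std::set<int>;

enum class ToyBool { Not, And, Or }; // hypothetical operation tags

struct ToyNode;
using NodePtr = std::shared_ptr<ToyNode>;

struct ToyNode
{
    bool                 isLeaf = false;
    AtomSet              group;             // value of a leaf (group-valued) node
    ToyBool              op = ToyBool::And; // used by non-leaf nodes only
    std::vector<NodePtr> children;          // one child for Not, two otherwise
};

NodePtr makeLeaf(AtomSet group)
{
    auto node    = std::make_shared<ToyNode>();
    node->isLeaf = true;
    node->group  = std::move(group);
    return node;
}

NodePtr makeBoolean(ToyBool op, std::vector<NodePtr> children)
{
    auto node      = std::make_shared<ToyNode>();
    node->op       = op;
    node->children = std::move(children);
    return node;
}

// Evaluates the tree bottom-up; 'all' is the universe used for negation.
AtomSet evaluate(const ToyNode& node, const AtomSet& all)
{
    if (node.isLeaf)
    {
        return node.group;
    }
    AtomSet       result;
    const AtomSet lhs = evaluate(*node.children.at(0), all);
    if (node.op == ToyBool::Not)
    {
        std::set_difference(all.begin(), all.end(), lhs.begin(), lhs.end(),
                            std::inserter(result, result.end()));
        return result;
    }
    const AtomSet rhs = evaluate(*node.children.at(1), all);
    if (node.op == ToyBool::And)
    {
        std::set_intersection(lhs.begin(), lhs.end(), rhs.begin(), rhs.end(),
                              std::inserter(result, result.end()));
    }
    else
    {
        std::set_union(lhs.begin(), lhs.end(), rhs.begin(), rhs.end(),
                       std::inserter(result, result.end()));
    }
    return result;
}
```

An input of the form "A and not B" would correspond to an And node whose first child is the leaf for A and whose second child is a Not node with the leaf for B as its only child.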

Arithmetic elements

One SEL_ARITHMETIC element is created for each arithmetic operation in the input, and the tree structure represents the evaluation order. The gmx::SelectionTreeElement::optype field gives the name of the operation. Each element has exactly two children (one for unary negation elements), which are in the order given in the input.
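
The way the tree shape fixes evaluation order can be seen in a small sketch. As with the other examples, this is a toy model under stated assumptions, not the GROMACS implementation: ToyExpr, ToyArith, constant(), operation() and evaluate() are hypothetical names, and values are plain doubles rather than per-atom arrays.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

enum class ToyArith { Plus, Minus, Negate, Multiply, Divide }; // hypothetical tags

struct ToyExpr;
using ExprPtr = std::shared_ptr<ToyExpr>;

struct ToyExpr
{
    bool                 isConstant = false;
    double               value      = 0.0;            // used by constants only
    ToyArith             op         = ToyArith::Plus; // used by operations only
    std::vector<ExprPtr> children;                    // one child for Negate, two otherwise
};

ExprPtr constant(double value)
{
    auto e        = std::make_shared<ToyExpr>();
    e->isConstant = true;
    e->value      = value;
    return e;
}

ExprPtr operation(ToyArith op, std::vector<ExprPtr> children)
{
    auto e      = std::make_shared<ToyExpr>();
    e->op       = op;
    e->children = std::move(children);
    return e;
}

// Evaluates children first, in input order, then applies the operation.
double evaluate(const ToyExpr& e)
{
    if (e.isConstant)
    {
        return e.value;
    }
    const double lhs = evaluate(*e.children.at(0));
    if (e.op == ToyArith::Negate)
    {
        return -lhs;
    }
    const double rhs = evaluate(*e.children.at(1));
    switch (e.op)
    {
        case ToyArith::Plus: return lhs + rhs;
        case ToyArith::Minus: return lhs - rhs;
        case ToyArith::Multiply: return lhs * rhs;
        case ToyArith::Divide: return lhs / rhs;
        default: return lhs; // Negate was handled above
    }
}
```

In this model, "1 + 2 * 3" is a Plus node whose second child is a Multiply node, whereas "(1 + 2) * 3" is a Multiply node whose first child is a Plus node; the operator precedence and parentheses of the input are reflected entirely in the nesting.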