Description

Provides an architecture-independent way of doing SIMD coding.

An overview of the SIMD implementation is provided in Single-instruction Multiple-data (SIMD) coding. The details are documented in gromacs/simd/simd.h and in the reference implementation, impl_reference.h.

Author: Erik Lindahl erik.lindahl@scilifelab.se

Namespaces
	gmx
	Generic GROMACS namespace.

	gmx::test::anonymous_namespace{simd_math.cpp}

Constant width-4 double precision SIMD types and instructions
static Simd4Double gmx_simdcall	gmx::load4 (const double *m)
	Load 4 double values from aligned memory into SIMD4 variable. More...

static void gmx_simdcall	gmx::store4 (double *m, Simd4Double a)
	Store the contents of SIMD4 double to aligned memory m. More...

static Simd4Double gmx_simdcall	gmx::load4U (const double *m)
	Load SIMD4 double from unaligned memory. More...

static void gmx_simdcall	gmx::store4U (double *m, Simd4Double a)
	Store SIMD4 double to unaligned memory. More...

static Simd4Double gmx_simdcall	gmx::simd4SetZeroD ()
	Set all SIMD4 double elements to 0. More...

static Simd4Double gmx_simdcall	gmx::operator& (Simd4Double a, Simd4Double b)
	Bitwise and for two SIMD4 double variables. More...

static Simd4Double gmx_simdcall	gmx::andNot (Simd4Double a, Simd4Double b)
	Bitwise andnot for two SIMD4 double variables. c=(~a) & b. More...

static Simd4Double gmx_simdcall	gmx::operator| (Simd4Double a, Simd4Double b)
	Bitwise or for two SIMD4 doubles. More...

static Simd4Double gmx_simdcall	gmx::operator^ (Simd4Double a, Simd4Double b)
	Bitwise xor for two SIMD4 double variables. More...

static Simd4Double gmx_simdcall	gmx::operator+ (Simd4Double a, Simd4Double b)
	Add two double SIMD4 variables. More...

static Simd4Double gmx_simdcall	gmx::operator- (Simd4Double a, Simd4Double b)
	Subtract two SIMD4 variables. More...

static Simd4Double gmx_simdcall	gmx::operator- (Simd4Double a)
	SIMD4 floating-point negate. More...

static Simd4Double gmx_simdcall	gmx::operator* (Simd4Double a, Simd4Double b)
	Multiply two SIMD4 variables. More...

static Simd4Double gmx_simdcall	gmx::fma (Simd4Double a, Simd4Double b, Simd4Double c)
	SIMD4 Fused-multiply-add. Result is a*b+c. More...

static Simd4Double gmx_simdcall	gmx::fms (Simd4Double a, Simd4Double b, Simd4Double c)
	SIMD4 Fused-multiply-subtract. Result is a*b-c. More...

static Simd4Double gmx_simdcall	gmx::fnma (Simd4Double a, Simd4Double b, Simd4Double c)
	SIMD4 Fused-negated-multiply-add. Result is -a*b+c. More...

static Simd4Double gmx_simdcall	gmx::fnms (Simd4Double a, Simd4Double b, Simd4Double c)
	SIMD4 Fused-negated-multiply-subtract. Result is -a*b-c. More...

static Simd4Double gmx_simdcall	gmx::rsqrt (Simd4Double x)
	SIMD4 1.0/sqrt(x) lookup. More...

static Simd4Double gmx_simdcall	gmx::abs (Simd4Double a)
	SIMD4 Floating-point abs(). More...

static Simd4Double gmx_simdcall	gmx::max (Simd4Double a, Simd4Double b)
	Set each SIMD4 element to the largest from two variables. More...

static Simd4Double gmx_simdcall	gmx::min (Simd4Double a, Simd4Double b)
	Set each SIMD4 element to the smallest from two variables. More...

static Simd4Double gmx_simdcall	gmx::round (Simd4Double a)
	SIMD4 Round to nearest integer value (in floating-point format). More...

static Simd4Double gmx_simdcall	gmx::trunc (Simd4Double a)
	Truncate SIMD4, i.e. round towards zero - common hardware instruction. More...

static double gmx_simdcall	gmx::dotProduct (Simd4Double a, Simd4Double b)
	Return dot product of two double precision SIMD4 variables. More...

static void gmx_simdcall	gmx::transpose (Simd4Double *v0, Simd4Double *v1, Simd4Double *v2, Simd4Double *v3)
	SIMD4 double transpose. More...

static Simd4DBool gmx_simdcall	gmx::operator== (Simd4Double a, Simd4Double b)
	a==b for SIMD4 double. More...

static Simd4DBool gmx_simdcall	gmx::operator!= (Simd4Double a, Simd4Double b)
	a!=b for SIMD4 double. More...

static Simd4DBool gmx_simdcall	gmx::operator< (Simd4Double a, Simd4Double b)
	a<b for SIMD4 double. More...

static Simd4DBool gmx_simdcall	gmx::operator<= (Simd4Double a, Simd4Double b)
	a<=b for SIMD4 double. More...

static Simd4DBool gmx_simdcall	gmx::operator&& (Simd4DBool a, Simd4DBool b)
	Logical and on double precision SIMD4 booleans. More...

static Simd4DBool gmx_simdcall	gmx::operator|| (Simd4DBool a, Simd4DBool b)
	Logical or on double precision SIMD4 booleans. More...

static bool gmx_simdcall	gmx::anyTrue (Simd4DBool a)
	Returns true if any of the booleans in SIMD4 a is true, otherwise false. More...

static Simd4Double gmx_simdcall	gmx::selectByMask (Simd4Double a, Simd4DBool mask)
	Select from double precision SIMD4 variable where boolean is true. More...

static Simd4Double gmx_simdcall	gmx::selectByNotMask (Simd4Double a, Simd4DBool mask)
	Select from double precision SIMD4 variable where boolean is false. More...

static Simd4Double gmx_simdcall	gmx::blend (Simd4Double a, Simd4Double b, Simd4DBool sel)
	Vector-blend SIMD4 selection. More...

static double gmx_simdcall	gmx::reduce (Simd4Double a)
	Return sum of all elements in SIMD4 double variable. More...
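
The sign conventions of the four fused operations listed above are easy to mix up. The following is a minimal scalar sketch of the lane-wise results, assuming a hypothetical Vec4 type (a plain std::array) as a stand-in for gmx::Simd4Double. The names Vec4, lanewise and the ...Ref functions are illustrative only and are not part of GROMACS; std::fma is used just to mirror the single-rounding idea.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for a width-4 SIMD variable; this is NOT gmx::Simd4Double.
using Vec4 = std::array<double, 4>;

// Illustrative helper: apply a scalar three-operand function to every lane.
template<typename Op>
Vec4 lanewise(const Vec4& a, const Vec4& b, const Vec4& c, Op op)
{
    Vec4 r{};
    for (std::size_t i = 0; i < r.size(); ++i)
    {
        r[i] = op(a[i], b[i], c[i]);
    }
    return r;
}

// Sign conventions as given in the listing: a*b+c, a*b-c, -a*b+c, -a*b-c.
Vec4 fmaRef(const Vec4& a, const Vec4& b, const Vec4& c)
{
    return lanewise(a, b, c, [](double x, double y, double z) { return std::fma(x, y, z); });
}
Vec4 fmsRef(const Vec4& a, const Vec4& b, const Vec4& c)
{
    return lanewise(a, b, c, [](double x, double y, double z) { return std::fma(x, y, -z); });
}
Vec4 fnmaRef(const Vec4& a, const Vec4& b, const Vec4& c)
{
    return lanewise(a, b, c, [](double x, double y, double z) { return std::fma(-x, y, z); });
}
Vec4 fnmsRef(const Vec4& a, const Vec4& b, const Vec4& c)
{
    return lanewise(a, b, c, [](double x, double y, double z) { return std::fma(-x, y, -z); });
}

// Horizontal sum of the four lanes, as described for reduce().
double reduceRef(const Vec4& a)
{
    return a[0] + a[1] + a[2] + a[3];
}

// Usage example with exactly representable values.
inline void fusedSignsExample()
{
    const Vec4 a{ 1.0, 2.0, 3.0, 4.0 };
    const Vec4 b{ 2.0, 2.0, 2.0, 2.0 };
    const Vec4 c{ 1.0, 1.0, 1.0, 1.0 };
    assert((fmaRef(a, b, c) == Vec4{ 3.0, 5.0, 7.0, 9.0 }));
    assert((fnmsRef(a, b, c) == Vec4{ -3.0, -5.0, -7.0, -9.0 }));
    assert(reduceRef(a) == 10.0);
}
```

In real code the gmx:: overloads would be called directly on Simd4Double values; the sketch only pins down which operand is negated in each variant.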

Constant width-4 single precision SIMD types and instructions
static Simd4Float gmx_simdcall	gmx::load4 (const float *m)
	Load 4 float values from aligned memory into SIMD4 variable. More...

static void gmx_simdcall	gmx::store4 (float *m, Simd4Float a)
	Store the contents of SIMD4 float to aligned memory m. More...

static Simd4Float gmx_simdcall	gmx::load4U (const float *m)
	Load SIMD4 float from unaligned memory. More...

static void gmx_simdcall	gmx::store4U (float *m, Simd4Float a)
	Store SIMD4 float to unaligned memory. More...

static Simd4Float gmx_simdcall	gmx::simd4SetZeroF ()
	Set all SIMD4 float elements to 0. More...

static Simd4Float gmx_simdcall	gmx::operator& (Simd4Float a, Simd4Float b)
	Bitwise and for two SIMD4 float variables. More...

static Simd4Float gmx_simdcall	gmx::andNot (Simd4Float a, Simd4Float b)
	Bitwise andnot for two SIMD4 float variables. c=(~a) & b. More...

static Simd4Float gmx_simdcall	gmx::operator| (Simd4Float a, Simd4Float b)
	Bitwise or for two SIMD4 floats. More...

static Simd4Float gmx_simdcall	gmx::operator^ (Simd4Float a, Simd4Float b)
	Bitwise xor for two SIMD4 float variables. More...

static Simd4Float gmx_simdcall	gmx::operator+ (Simd4Float a, Simd4Float b)
	Add two float SIMD4 variables. More...

static Simd4Float gmx_simdcall	gmx::operator- (Simd4Float a, Simd4Float b)
	Subtract two SIMD4 variables. More...

static Simd4Float gmx_simdcall	gmx::operator- (Simd4Float a)
	SIMD4 floating-point negate. More...

static Simd4Float gmx_simdcall	gmx::operator* (Simd4Float a, Simd4Float b)
	Multiply two SIMD4 variables. More...

static Simd4Float gmx_simdcall	gmx::fma (Simd4Float a, Simd4Float b, Simd4Float c)
	SIMD4 Fused-multiply-add. Result is a*b+c. More...

static Simd4Float gmx_simdcall	gmx::fms (Simd4Float a, Simd4Float b, Simd4Float c)
	SIMD4 Fused-multiply-subtract. Result is a*b-c. More...

static Simd4Float gmx_simdcall	gmx::fnma (Simd4Float a, Simd4Float b, Simd4Float c)
	SIMD4 Fused-negated-multiply-add. Result is -a*b+c. More...

static Simd4Float gmx_simdcall	gmx::fnms (Simd4Float a, Simd4Float b, Simd4Float c)
	SIMD4 Fused-negated-multiply-subtract. Result is -a*b-c. More...

static Simd4Float gmx_simdcall	gmx::rsqrt (Simd4Float x)
	SIMD4 1.0/sqrt(x) lookup. More...

static Simd4Float gmx_simdcall	gmx::abs (Simd4Float a)
	SIMD4 Floating-point fabs(). More...

static Simd4Float gmx_simdcall	gmx::max (Simd4Float a, Simd4Float b)
	Set each SIMD4 element to the largest from two variables. More...

static Simd4Float gmx_simdcall	gmx::min (Simd4Float a, Simd4Float b)
	Set each SIMD4 element to the smallest from two variables. More...

static Simd4Float gmx_simdcall	gmx::round (Simd4Float a)
	SIMD4 Round to nearest integer value (in floating-point format). More...

static Simd4Float gmx_simdcall	gmx::trunc (Simd4Float a)
	Truncate SIMD4, i.e. round towards zero - common hardware instruction. More...

static float gmx_simdcall	gmx::dotProduct (Simd4Float a, Simd4Float b)
	Return dot product of two single precision SIMD4 variables. More...

static void gmx_simdcall	gmx::transpose (Simd4Float *v0, Simd4Float *v1, Simd4Float *v2, Simd4Float *v3)
	SIMD4 float transpose. More...

static Simd4FBool gmx_simdcall	gmx::operator== (Simd4Float a, Simd4Float b)
	a==b for SIMD4 float. More...

static Simd4FBool gmx_simdcall	gmx::operator!= (Simd4Float a, Simd4Float b)
	a!=b for SIMD4 float. More...

static Simd4FBool gmx_simdcall	gmx::operator< (Simd4Float a, Simd4Float b)
	a<b for SIMD4 float. More...

static Simd4FBool gmx_simdcall	gmx::operator<= (Simd4Float a, Simd4Float b)
	a<=b for SIMD4 float. More...

static Simd4FBool gmx_simdcall	gmx::operator&& (Simd4FBool a, Simd4FBool b)
	Logical and on single precision SIMD4 booleans. More...

static Simd4FBool gmx_simdcall	gmx::operator|| (Simd4FBool a, Simd4FBool b)
	Logical or on single precision SIMD4 booleans. More...

static bool gmx_simdcall	gmx::anyTrue (Simd4FBool a)
	Returns true if any of the booleans in SIMD4 a is true, otherwise false. More...

static Simd4Float gmx_simdcall	gmx::selectByMask (Simd4Float a, Simd4FBool mask)
	Select from single precision SIMD4 variable where boolean is true. More...

static Simd4Float gmx_simdcall	gmx::selectByNotMask (Simd4Float a, Simd4FBool mask)
	Select from single precision SIMD4 variable where boolean is false. More...

static Simd4Float gmx_simdcall	gmx::blend (Simd4Float a, Simd4Float b, Simd4FBool sel)
	Vector-blend SIMD4 selection. More...

static float gmx_simdcall	gmx::reduce (Simd4Float a)
	Return sum of all elements in SIMD4 float variable. More...
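
The comparison and selection functions above are typically combined: a comparison produces a per-lane boolean, which then drives a select or a blend. The sketch below models this with plain arrays. Vec4f and Mask4 are hypothetical stand-ins for Simd4Float and Simd4FBool. It is an assumption of this sketch (not stated in the brief descriptions above) that unselected lanes become zero and that blend picks b where sel is true and a otherwise.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; these are NOT gmx::Simd4Float / gmx::Simd4FBool.
using Vec4f = std::array<float, 4>;
using Mask4 = std::array<bool, 4>;

// Lane-wise a < b, producing one boolean per lane.
Mask4 lessThanRef(const Vec4f& a, const Vec4f& b)
{
    Mask4 m{};
    for (std::size_t i = 0; i < m.size(); ++i)
    {
        m[i] = a[i] < b[i];
    }
    return m;
}

// Keep a where the mask is true; other lanes become 0 (assumed behavior).
Vec4f selectByMaskRef(const Vec4f& a, const Mask4& mask)
{
    Vec4f r{};
    for (std::size_t i = 0; i < r.size(); ++i)
    {
        r[i] = mask[i] ? a[i] : 0.0F;
    }
    return r;
}

// Keep a where the mask is false; other lanes become 0 (assumed behavior).
Vec4f selectByNotMaskRef(const Vec4f& a, const Mask4& mask)
{
    Vec4f r{};
    for (std::size_t i = 0; i < r.size(); ++i)
    {
        r[i] = mask[i] ? 0.0F : a[i];
    }
    return r;
}

// Pick b where sel is true, a otherwise (assumed operand order).
Vec4f blendRef(const Vec4f& a, const Vec4f& b, const Mask4& sel)
{
    Vec4f r{};
    for (std::size_t i = 0; i < r.size(); ++i)
    {
        r[i] = sel[i] ? b[i] : a[i];
    }
    return r;
}

// True if at least one lane of the mask is set.
bool anyTrueRef(const Mask4& m)
{
    return m[0] || m[1] || m[2] || m[3];
}

// Usage example: clamp negative lanes to zero.
inline void selectionExample()
{
    const Vec4f a{ 1.0F, -2.0F, 3.0F, -4.0F };
    const Vec4f zero{};
    const Mask4 negative = lessThanRef(a, zero);
    assert(anyTrueRef(negative));
    assert((blendRef(a, zero, negative) == Vec4f{ 1.0F, 0.0F, 3.0F, 0.0F }));
}
```

Under these assumptions, selectByNotMask(a, mask) gives the same result as blending a with zero, which is why the select forms are convenient for masking out excluded lanes.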

SIMD implementation load/store operations for double precision floating point
static SimdDouble gmx_simdcall	gmx::simdLoad (const double *m, SimdDoubleTag={})
	Load GMX_SIMD_DOUBLE_WIDTH numbers from aligned memory. More...

static void gmx_simdcall	gmx::store (double *m, SimdDouble a)
	Store the contents of SIMD double variable to aligned memory m. More...

static SimdDouble gmx_simdcall	gmx::simdLoadU (const double *m, SimdDoubleTag={})
	Load SIMD double from unaligned memory. More...

static void gmx_simdcall	gmx::storeU (double *m, SimdDouble a)
	Store SIMD double to unaligned memory. More...

static SimdDouble gmx_simdcall	gmx::setZeroD ()
	Set all SIMD double variable elements to 0.0. More...
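
The difference between the aligned and unaligned variants above is a precondition on the pointer. The sketch below illustrates that precondition with a hypothetical width of 4 doubles and plain arrays. kWidth, VecD and the ...Ref functions are made up for illustration, and the assumed requirement (alignment to width times element size) should be checked against the full documentation of each function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical width; in GROMACS this role is played by GMX_SIMD_DOUBLE_WIDTH.
constexpr std::size_t kWidth = 4;
using VecD = std::array<double, kWidth>; // stand-in, NOT gmx::SimdDouble

// True if m satisfies the assumed alignment requirement of an aligned load.
bool isSimdAligned(const double* m)
{
    return reinterpret_cast<std::uintptr_t>(m) % (kWidth * sizeof(double)) == 0;
}

// Aligned load: the caller must guarantee the alignment precondition.
VecD loadAlignedRef(const double* m)
{
    assert(isSimdAligned(m));
    VecD r{};
    std::memcpy(r.data(), m, kWidth * sizeof(double));
    return r;
}

// Unaligned load: any valid pointer to kWidth readable doubles is accepted.
VecD loadUnalignedRef(const double* m)
{
    VecD r{};
    std::memcpy(r.data(), m, kWidth * sizeof(double));
    return r;
}
```

A caller would typically declare buffers with alignas (or use an aligned allocator) for the aligned path, and fall back to the unaligned variant when reading from an arbitrary offset into an array.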

SIMD implementation load/store operations for integers (corresponding to double)
static SimdDInt32 gmx_simdcall	gmx::simdLoad (const std::int32_t *m, SimdDInt32Tag)
	Load aligned SIMD integer data, width corresponds to gmx::SimdDouble. More...

static void gmx_simdcall	gmx::store (std::int32_t *m, SimdDInt32 a)
	Store aligned SIMD integer data, width corresponds to gmx::SimdDouble. More...

static SimdDInt32 gmx_simdcall	gmx::simdLoadU (const std::int32_t *m, SimdDInt32Tag)
	Load unaligned integer SIMD data, width corresponds to gmx::SimdDouble. More...

static void gmx_simdcall	gmx::storeU (std::int32_t *m, SimdDInt32 a)
	Store unaligned SIMD integer data, width corresponds to gmx::SimdDouble. More...

static SimdDInt32 gmx_simdcall	gmx::setZeroDI ()
	Set all SIMD (double) integer variable elements to 0. More...

template<int index>
static std::int32_t gmx_simdcall	gmx::extract (SimdDInt32 a)
	Extract the element at compile-time position index from gmx::SimdDInt32. More...

SIMD implementation double precision floating-point bitwise logical operations
static SimdDouble gmx_simdcall	gmx::operator& (SimdDouble a, SimdDouble b)
	Bitwise and for two SIMD double variables. More...

static SimdDouble gmx_simdcall	gmx::andNot (SimdDouble a, SimdDouble b)
	Bitwise andnot for SIMD double. More...

static SimdDouble gmx_simdcall	gmx::operator| (SimdDouble a, SimdDouble b)
	Bitwise or for SIMD double. More...

static SimdDouble gmx_simdcall	gmx::operator^ (SimdDouble a, SimdDouble b)
	Bitwise xor for SIMD double. More...
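
Bitwise operations on floating-point values act on the raw bit pattern of each element. A common use is sign manipulation. The scalar sketch below shows one lane of andNot, using the (~a) & b definition given for the SIMD4 variant, and one lane of xor. It assumes IEEE-754 binary64 doubles, and toBits, fromBits, andNotRef and xorRef are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t), "sketch assumes a 64-bit double");

// Reinterpret a double as its raw 64-bit pattern.
std::uint64_t toBits(double x)
{
    std::uint64_t u = 0;
    std::memcpy(&u, &x, sizeof(u));
    return u;
}

// Reinterpret a raw 64-bit pattern as a double.
double fromBits(std::uint64_t u)
{
    double x = 0.0;
    std::memcpy(&x, &u, sizeof(x));
    return x;
}

// One lane of andNot: (~a) & b on the bit patterns.
double andNotRef(double a, double b)
{
    return fromBits(~toBits(a) & toBits(b));
}

// One lane of xor on the bit patterns.
double xorRef(double a, double b)
{
    return fromBits(toBits(a) ^ toBits(b));
}

// Usage example: -0.0 has only the sign bit set (IEEE-754 assumption).
inline void signBitExample()
{
    const double signMask = -0.0;
    assert(andNotRef(signMask, -3.5) == 3.5); // clearing the sign bit gives |x|
    assert(xorRef(signMask, 3.5) == -3.5);    // toggling the sign bit negates x
}
```

With a sign-bit mask as the first operand, andNot therefore behaves like abs() and xor like negation, one lane at a time.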

SIMD implementation double precision floating-point arithmetics
static SimdDouble gmx_simdcall	gmx::operator+ (SimdDouble a, SimdDouble b)
	Add two double SIMD variables. More...

static SimdDouble gmx_simdcall	gmx::operator- (SimdDouble a, SimdDouble b)
	Subtract two double SIMD variables. More...

static SimdDouble gmx_simdcall	gmx::operator- (SimdDouble a)
	SIMD double precision negate. More...

static SimdDouble gmx_simdcall	gmx::operator* (SimdDouble a, SimdDouble b)
	Multiply two double SIMD variables. More...

static SimdDouble gmx_simdcall	gmx::fma (SimdDouble a, SimdDouble b, SimdDouble c)
	SIMD double Fused-multiply-add. Result is a*b+c. More...

static SimdDouble gmx_simdcall	gmx::fms (SimdDouble a, SimdDouble b, SimdDouble c)
	SIMD double Fused-multiply-subtract. Result is a*b-c. More...

static SimdDouble gmx_simdcall	gmx::fnma (SimdDouble a, SimdDouble b, SimdDouble c)
	SIMD double Fused-negated-multiply-add. Result is -a*b+c. More...

static SimdDouble gmx_simdcall	gmx::fnms (SimdDouble a, SimdDouble b, SimdDouble c)
	SIMD double Fused-negated-multiply-subtract. Result is -a*b-c. More...

static SimdDouble gmx_simdcall	gmx::rsqrt (SimdDouble x)
	double SIMD 1.0/sqrt(x) lookup. More...

static SimdDouble gmx_simdcall	gmx::rcp (SimdDouble x)
	SIMD double 1.0/x lookup. More...

static SimdDouble gmx_simdcall	gmx::maskAdd (SimdDouble a, SimdDouble b, SimdDBool m)
	Add two double SIMD variables, masked version. More...

static SimdDouble gmx_simdcall	gmx::maskzMul (SimdDouble a, SimdDouble b, SimdDBool m)
	Multiply two double SIMD variables, masked version. More...

static SimdDouble gmx_simdcall	gmx::maskzFma (SimdDouble a, SimdDouble b, SimdDouble c, SimdDBool m)
	SIMD double fused multiply-add, masked version. More...

static SimdDouble gmx_simdcall	gmx::maskzRsqrt (SimdDouble x, SimdDBool m)
	SIMD double 1.0/sqrt(x) lookup, masked version. More...

static SimdDouble gmx_simdcall	gmx::maskzRcp (SimdDouble x, SimdDBool m)
	SIMD double 1.0/x lookup, masked version. More...

static SimdDouble gmx_simdcall	gmx::abs (SimdDouble a)
	SIMD double floating-point fabs(). More...

static SimdDouble gmx_simdcall	gmx::max (SimdDouble a, SimdDouble b)
	Set each SIMD double element to the largest from two variables. More...

static SimdDouble gmx_simdcall	gmx::min (SimdDouble a, SimdDouble b)
	Set each SIMD double element to the smallest from two variables. More...

static SimdDouble gmx_simdcall	gmx::round (SimdDouble a)
	SIMD double round to nearest integer value (in floating-point format). More...

static SimdDouble gmx_simdcall	gmx::trunc (SimdDouble a)
	Truncate SIMD double, i.e. round towards zero - common hardware instruction. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdDouble gmx_simdcall	gmx::frexp (SimdDouble value, SimdDInt32 *exponent)
	Extract (integer) exponent and fraction from double precision SIMD. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdDouble gmx_simdcall	gmx::ldexp (SimdDouble value, SimdDInt32 exponent)
	Multiply a SIMD double value by the number 2 raised to an exp power. More...

static double gmx_simdcall	gmx::reduce (SimdDouble a)
	Return sum of all elements in SIMD double variable. More...
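
The masked variants above differ in what the unselected lanes hold. The sketch below assumes, based on the naming (maskAdd versus the maskz prefix), that maskAdd passes a through unchanged where the mask is false, while the maskz functions return zero there. VecD and MaskD are hypothetical plain-array stand-ins for gmx::SimdDouble and gmx::SimdDBool.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using VecD  = std::array<double, 4>; // hypothetical stand-in, NOT gmx::SimdDouble
using MaskD = std::array<bool, 4>;   // hypothetical stand-in, NOT gmx::SimdDBool

// a+b where the mask is true, a unchanged otherwise (assumed semantics).
VecD maskAddRef(const VecD& a, const VecD& b, const MaskD& m)
{
    VecD r{};
    for (std::size_t i = 0; i < r.size(); ++i)
    {
        r[i] = m[i] ? a[i] + b[i] : a[i];
    }
    return r;
}

// a*b where the mask is true, 0 otherwise (assumed semantics).
VecD maskzMulRef(const VecD& a, const VecD& b, const MaskD& m)
{
    VecD r{};
    for (std::size_t i = 0; i < r.size(); ++i)
    {
        r[i] = m[i] ? a[i] * b[i] : 0.0;
    }
    return r;
}

// a*b+c where the mask is true, 0 otherwise (assumed semantics).
VecD maskzFmaRef(const VecD& a, const VecD& b, const VecD& c, const MaskD& m)
{
    VecD r{};
    for (std::size_t i = 0; i < r.size(); ++i)
    {
        r[i] = m[i] ? a[i] * b[i] + c[i] : 0.0;
    }
    return r;
}

// Usage example with exactly representable values.
inline void maskedExample()
{
    const VecD  a{ 1.0, 2.0, 3.0, 4.0 };
    const VecD  b{ 10.0, 10.0, 10.0, 10.0 };
    const MaskD m{ true, false, true, false };
    assert((maskAddRef(a, b, m) == VecD{ 11.0, 2.0, 13.0, 4.0 }));
    assert((maskzMulRef(a, b, m) == VecD{ 10.0, 0.0, 30.0, 0.0 }));
}
```

The zeroing forms are useful when the masked-out lanes may contain values that would otherwise produce infinities or NaNs, for example a masked reciprocal square root of zero.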

SIMD implementation double precision floating-point comparison, boolean, selection.
static SimdDBool gmx_simdcall	gmx::operator== (SimdDouble a, SimdDouble b)
	SIMD a==b for double SIMD. More...

static SimdDBool gmx_simdcall	gmx::operator!= (SimdDouble a, SimdDouble b)
	SIMD a!=b for double SIMD. More...

static SimdDBool gmx_simdcall	gmx::operator< (SimdDouble a, SimdDouble b)
	SIMD a<b for double SIMD. More...

static SimdDBool gmx_simdcall	gmx::operator<= (SimdDouble a, SimdDouble b)
	SIMD a<=b for double SIMD. More...

static SimdDBool gmx_simdcall	gmx::testBits (SimdDouble a)
	Return true if any bits are set in the double precision SIMD. More...

static SimdDBool gmx_simdcall	gmx::operator&& (SimdDBool a, SimdDBool b)
	Logical and on double precision SIMD booleans. More...

static SimdDBool gmx_simdcall	gmx::operator|| (SimdDBool a, SimdDBool b)
	Logical or on double precision SIMD booleans. More...

static bool gmx_simdcall	gmx::anyTrue (SimdDBool a)
	Returns true if any of the booleans in SIMD a is true, otherwise false. More...

static SimdDouble gmx_simdcall	gmx::selectByMask (SimdDouble a, SimdDBool mask)
	Select from double precision SIMD variable where boolean is true. More...

static SimdDouble gmx_simdcall	gmx::selectByNotMask (SimdDouble a, SimdDBool mask)
	Select from double precision SIMD variable where boolean is false. More...

static SimdDouble gmx_simdcall	gmx::blend (SimdDouble a, SimdDouble b, SimdDBool sel)
	Vector-blend SIMD double selection. More...

SIMD implementation integer (corresponding to double) bitwise logical operations
static SimdDInt32 gmx_simdcall	gmx::operator& (SimdDInt32 a, SimdDInt32 b)
	Integer SIMD bitwise and. More...

static SimdDInt32 gmx_simdcall	gmx::andNot (SimdDInt32 a, SimdDInt32 b)
	Integer SIMD bitwise andnot, (~a) & b. More...

static SimdDInt32 gmx_simdcall	gmx::operator| (SimdDInt32 a, SimdDInt32 b)
	Integer SIMD bitwise or. More...

static SimdDInt32 gmx_simdcall	gmx::operator^ (SimdDInt32 a, SimdDInt32 b)
	Integer SIMD bitwise xor. More...

SIMD implementation integer (corresponding to double) arithmetics
static SimdDInt32 gmx_simdcall	gmx::operator+ (SimdDInt32 a, SimdDInt32 b)
	Add SIMD integers. More...

static SimdDInt32 gmx_simdcall	gmx::operator- (SimdDInt32 a, SimdDInt32 b)
	Subtract SIMD integers. More...

static SimdDInt32 gmx_simdcall	gmx::operator* (SimdDInt32 a, SimdDInt32 b)
	Multiply SIMD integers. More...

SIMD implementation integer (corresponding to double) comparisons, boolean, selection
static SimdDIBool gmx_simdcall	gmx::operator== (SimdDInt32 a, SimdDInt32 b)
	Equality comparison of two integers corresponding to double values. More...

static SimdDIBool gmx_simdcall	gmx::operator< (SimdDInt32 a, SimdDInt32 b)
	Less-than comparison of two SIMD integers corresponding to double values. More...

static SimdDIBool gmx_simdcall	gmx::testBits (SimdDInt32 a)
	Check if any bit is set in each element. More...

static SimdDIBool gmx_simdcall	gmx::operator&& (SimdDIBool a, SimdDIBool b)
	Logical AND on SimdDIBool. More...

static SimdDIBool gmx_simdcall	gmx::operator|| (SimdDIBool a, SimdDIBool b)
	Logical OR on SimdDIBool. More...

static bool gmx_simdcall	gmx::anyTrue (SimdDIBool a)
	Returns true if any of the booleans in a is true, otherwise false. More...

static SimdDInt32 gmx_simdcall	gmx::selectByMask (SimdDInt32 a, SimdDIBool mask)
	Select from gmx::SimdDInt32 variable where boolean is true. More...

static SimdDInt32 gmx_simdcall	gmx::selectByNotMask (SimdDInt32 a, SimdDIBool mask)
	Select from gmx::SimdDInt32 variable where boolean is false. More...

static SimdDInt32 gmx_simdcall	gmx::blend (SimdDInt32 a, SimdDInt32 b, SimdDIBool sel)
	Vector-blend SIMD integer selection. More...

SIMD implementation conversion operations
static SimdDInt32 gmx_simdcall	gmx::cvtR2I (SimdDouble a)
	Round double precision floating point to integer. More...

static SimdDInt32 gmx_simdcall	gmx::cvttR2I (SimdDouble a)
	Truncate double precision floating point to integer. More...

static SimdDouble gmx_simdcall	gmx::cvtI2R (SimdDInt32 a)
	Convert integer to double precision floating point. More...

static SimdDIBool gmx_simdcall	gmx::cvtB2IB (SimdDBool a)
	Convert from double precision boolean to corresponding integer boolean. More...

static SimdDBool gmx_simdcall	gmx::cvtIB2B (SimdDIBool a)
	Convert from integer boolean to corresponding double precision boolean. More...

static SimdDouble gmx_simdcall	gmx::cvtF2D (SimdFloat gmx_unused f)
	Convert SIMD float to double. More...

static SimdFloat gmx_simdcall	gmx::cvtD2F (SimdDouble gmx_unused d)
	Convert SIMD double to float. More...

static void gmx_simdcall	gmx::cvtF2DD (SimdFloat gmx_unused f, SimdDouble gmx_unused *d0, SimdDouble gmx_unused *d1)
	Convert one SIMD float to two SIMD doubles. More...

static SimdFloat gmx_simdcall	gmx::cvtDD2F (SimdDouble gmx_unused d0, SimdDouble gmx_unused d1)
	Convert two SIMD doubles to one SIMD float. More...

static SimdFInt32 gmx_simdcall	gmx::cvtR2I (SimdFloat a)
	Round single precision floating point to integer. More...

static SimdFInt32 gmx_simdcall	gmx::cvttR2I (SimdFloat a)
	Truncate single precision floating point to integer. More...

static SimdFloat gmx_simdcall	gmx::cvtI2R (SimdFInt32 a)
	Convert integer to single precision floating point. More...

static SimdFIBool gmx_simdcall	gmx::cvtB2IB (SimdFBool a)
	Convert from single precision boolean to corresponding integer boolean. More...

static SimdFBool gmx_simdcall	gmx::cvtIB2B (SimdFIBool a)
	Convert from integer boolean to corresponding single precision boolean. More...
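
The two real-to-integer conversions above differ only in rounding direction. The scalar sketch below shows one lane of each, with hypothetical ...Ref names. How exact .5 ties are resolved is an assumption here: the sketch uses std::lround (ties away from zero), and a hardware implementation may round ties differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// One lane of cvtR2I: round to the nearest integer (tie handling is an assumption).
std::int32_t cvtR2IRef(double x)
{
    return static_cast<std::int32_t>(std::lround(x));
}

// One lane of cvttR2I: truncate, i.e. round towards zero.
std::int32_t cvttR2IRef(double x)
{
    return static_cast<std::int32_t>(std::trunc(x));
}

// One lane of cvtI2R: exact conversion of a 32-bit integer to double.
double cvtI2RRef(std::int32_t i)
{
    return static_cast<double>(i);
}

// Usage example: the two conversions disagree for non-integer inputs.
inline void conversionExample()
{
    assert(cvtR2IRef(2.7) == 3);
    assert(cvttR2IRef(2.7) == 2);
    assert(cvtR2IRef(-2.7) == -3);
    assert(cvttR2IRef(-2.7) == -2);
}
```

Truncation is the cheaper choice when the input is known to be non-negative and a floor is wanted; rounding is the usual choice for table lookups centered on grid points.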

SIMD implementation load/store operations for single precision floating point
static SimdFloat gmx_simdcall	gmx::simdLoad (const float *m, SimdFloatTag={})
	Load GMX_SIMD_FLOAT_WIDTH float numbers from aligned memory. More...

static void gmx_simdcall	gmx::store (float *m, SimdFloat a)
	Store the contents of SIMD float variable to aligned memory m. More...

static SimdFloat gmx_simdcall	gmx::simdLoadU (const float *m, SimdFloatTag={})
	Load SIMD float from unaligned memory. More...

static void gmx_simdcall	gmx::storeU (float *m, SimdFloat a)
	Store SIMD float to unaligned memory. More...

static SimdFloat gmx_simdcall	gmx::setZeroF ()
	Set all SIMD float variable elements to 0.0. More...

SIMD implementation load/store operations for integers (corresponding to float)
static SimdFInt32 gmx_simdcall	gmx::simdLoad (const std::int32_t *m, SimdFInt32Tag)
	Load aligned SIMD integer data, width corresponds to gmx::SimdFloat. More...

static void gmx_simdcall	gmx::store (std::int32_t *m, SimdFInt32 a)
	Store aligned SIMD integer data, width corresponds to gmx::SimdFloat. More...

static SimdFInt32 gmx_simdcall	gmx::simdLoadU (const std::int32_t *m, SimdFInt32Tag)
	Load unaligned integer SIMD data, width corresponds to gmx::SimdFloat. More...

static void gmx_simdcall	gmx::storeU (std::int32_t *m, SimdFInt32 a)
	Store unaligned SIMD integer data, width corresponds to gmx::SimdFloat. More...

static SimdFInt32 gmx_simdcall	gmx::setZeroFI ()
	Set all SIMD (float) integer variable elements to 0. More...

template<int index>
static std::int32_t gmx_simdcall	gmx::extract (SimdFInt32 a)
	Extract the element at compile-time position index from gmx::SimdFInt32. More...

SIMD implementation single precision floating-point bitwise logical operations
static SimdFloat gmx_simdcall	gmx::operator& (SimdFloat a, SimdFloat b)
	Bitwise and for two SIMD float variables. More...

static SimdFloat gmx_simdcall	gmx::andNot (SimdFloat a, SimdFloat b)
	Bitwise andnot for SIMD float. More...

static SimdFloat gmx_simdcall	gmx::operator| (SimdFloat a, SimdFloat b)
	Bitwise or for SIMD float. More...

static SimdFloat gmx_simdcall	gmx::operator^ (SimdFloat a, SimdFloat b)
	Bitwise xor for SIMD float. More...

SIMD implementation single precision floating-point arithmetics
static SimdFloat gmx_simdcall	gmx::operator+ (SimdFloat a, SimdFloat b)
	Add two float SIMD variables. More...

static SimdFloat gmx_simdcall	gmx::operator- (SimdFloat a, SimdFloat b)
	Subtract two float SIMD variables. More...

static SimdFloat gmx_simdcall	gmx::operator- (SimdFloat a)
	SIMD single precision negate. More...

static SimdFloat gmx_simdcall	gmx::operator* (SimdFloat a, SimdFloat b)
	Multiply two float SIMD variables. More...

static SimdFloat gmx_simdcall	gmx::fma (SimdFloat a, SimdFloat b, SimdFloat c)
	SIMD float Fused-multiply-add. Result is a*b+c. More...

static SimdFloat gmx_simdcall	gmx::fms (SimdFloat a, SimdFloat b, SimdFloat c)
	SIMD float Fused-multiply-subtract. Result is a*b-c. More...

static SimdFloat gmx_simdcall	gmx::fnma (SimdFloat a, SimdFloat b, SimdFloat c)
	SIMD float Fused-negated-multiply-add. Result is -a*b+c. More...

static SimdFloat gmx_simdcall	gmx::fnms (SimdFloat a, SimdFloat b, SimdFloat c)
	SIMD float Fused-negated-multiply-subtract. Result is -a*b-c. More...

static SimdFloat gmx_simdcall	gmx::rsqrt (SimdFloat x)
	SIMD float 1.0/sqrt(x) lookup. More...

static SimdFloat gmx_simdcall	gmx::rcp (SimdFloat x)
	SIMD float 1.0/x lookup. More...

static SimdFloat gmx_simdcall	gmx::maskAdd (SimdFloat a, SimdFloat b, SimdFBool m)
	Add two float SIMD variables, masked version. More...

static SimdFloat gmx_simdcall	gmx::maskzMul (SimdFloat a, SimdFloat b, SimdFBool m)
	Multiply two float SIMD variables, masked version. More...

static SimdFloat gmx_simdcall	gmx::maskzFma (SimdFloat a, SimdFloat b, SimdFloat c, SimdFBool m)
	SIMD float fused multiply-add, masked version. More...

static SimdFloat gmx_simdcall	gmx::maskzRsqrt (SimdFloat x, SimdFBool m)
	SIMD float 1.0/sqrt(x) lookup, masked version. More...

static SimdFloat gmx_simdcall	gmx::maskzRcp (SimdFloat x, SimdFBool m)
	SIMD float 1.0/x lookup, masked version. More...

static SimdFloat gmx_simdcall	gmx::abs (SimdFloat a)
	SIMD float Floating-point abs(). More...

static SimdFloat gmx_simdcall	gmx::max (SimdFloat a, SimdFloat b)
	Set each SIMD float element to the largest from two variables. More...

static SimdFloat gmx_simdcall	gmx::min (SimdFloat a, SimdFloat b)
	Set each SIMD float element to the smallest from two variables. More...

static SimdFloat gmx_simdcall	gmx::round (SimdFloat a)
	SIMD float round to nearest integer value (in floating-point format). More...

static SimdFloat gmx_simdcall	gmx::trunc (SimdFloat a)
	Truncate SIMD float, i.e. round towards zero - common hardware instruction. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdFloat gmx_simdcall	gmx::frexp (SimdFloat value, SimdFInt32 *exponent)
	Extract (integer) exponent and fraction from single precision SIMD. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdFloat gmx_simdcall	gmx::ldexp (SimdFloat value, SimdFInt32 exponent)
	Multiply a SIMD float value by the number 2 raised to an exp power. More...

static float gmx_simdcall	gmx::reduce (SimdFloat a)
	Return sum of all elements in SIMD float variable. More...

SIMD implementation single precision floating-point comparisons, boolean, selection.
static SimdFBool gmx_simdcall	gmx::operator== (SimdFloat a, SimdFloat b)
	SIMD a==b for single SIMD. More...

static SimdFBool gmx_simdcall	gmx::operator!= (SimdFloat a, SimdFloat b)
	SIMD a!=b for single SIMD. More...

static SimdFBool gmx_simdcall	gmx::operator< (SimdFloat a, SimdFloat b)
	SIMD a<b for single SIMD. More...

static SimdFBool gmx_simdcall	gmx::operator<= (SimdFloat a, SimdFloat b)
	SIMD a<=b for single SIMD. More...

static SimdFBool gmx_simdcall	gmx::testBits (SimdFloat a)
	Return true if any bits are set in the single precision SIMD. More...

static SimdFBool gmx_simdcall	gmx::operator&& (SimdFBool a, SimdFBool b)
	Logical and on single precision SIMD booleans. More...

static SimdFBool gmx_simdcall	gmx::operator|| (SimdFBool a, SimdFBool b)
	Logical or on single precision SIMD booleans. More...

static bool gmx_simdcall	gmx::anyTrue (SimdFBool a)
	Returns true if any of the booleans in SIMD a is true, otherwise false. More...

static SimdFloat gmx_simdcall	gmx::selectByMask (SimdFloat a, SimdFBool mask)
	Select from single precision SIMD variable where boolean is true. More...

static SimdFloat gmx_simdcall	gmx::selectByNotMask (SimdFloat a, SimdFBool mask)
	Select from single precision SIMD variable where boolean is false. More...

static SimdFloat gmx_simdcall	gmx::blend (SimdFloat a, SimdFloat b, SimdFBool sel)
	Vector-blend SIMD float selection. More...

SIMD implementation integer (corresponding to float) bitwise logical operations
static SimdFInt32 gmx_simdcall	gmx::operator& (SimdFInt32 a, SimdFInt32 b)
	Integer SIMD bitwise and. More...

static SimdFInt32 gmx_simdcall	gmx::andNot (SimdFInt32 a, SimdFInt32 b)
	Integer SIMD bitwise andnot, (~a) & b. More...

static SimdFInt32 gmx_simdcall	gmx::operator| (SimdFInt32 a, SimdFInt32 b)
	Integer SIMD bitwise or. More...

static SimdFInt32 gmx_simdcall	gmx::operator^ (SimdFInt32 a, SimdFInt32 b)
	Integer SIMD bitwise xor. More...

SIMD implementation integer (corresponding to float) arithmetics
static SimdFInt32 gmx_simdcall	gmx::operator+ (SimdFInt32 a, SimdFInt32 b)
	Add SIMD integers. More...

static SimdFInt32 gmx_simdcall	gmx::operator- (SimdFInt32 a, SimdFInt32 b)
	Subtract SIMD integers. More...

static SimdFInt32 gmx_simdcall	gmx::operator* (SimdFInt32 a, SimdFInt32 b)
	Multiply SIMD integers. More...

SIMD implementation integer (corresponding to float) comparisons, boolean, selection
static SimdFIBool gmx_simdcall	gmx::operator== (SimdFInt32 a, SimdFInt32 b)
	Equality comparison of two integers corresponding to float values. More...

static SimdFIBool gmx_simdcall	gmx::operator< (SimdFInt32 a, SimdFInt32 b)
	Less-than comparison of two SIMD integers corresponding to float values. More...

static SimdFIBool gmx_simdcall	gmx::testBits (SimdFInt32 a)
	Check if any bit is set in each element. More...

static SimdFIBool gmx_simdcall	gmx::operator&& (SimdFIBool a, SimdFIBool b)
	Logical AND on SimdFIBool. More...

static SimdFIBool gmx_simdcall	gmx::operator|| (SimdFIBool a, SimdFIBool b)
	Logical OR on SimdFIBool. More...

static bool gmx_simdcall	gmx::anyTrue (SimdFIBool a)
	Returns true if any of the booleans in a is true, otherwise false. More...

static SimdFInt32 gmx_simdcall	gmx::selectByMask (SimdFInt32 a, SimdFIBool mask)
	Select from gmx::SimdFInt32 variable where boolean is true. More...

static SimdFInt32 gmx_simdcall	gmx::selectByNotMask (SimdFInt32 a, SimdFIBool mask)
	Select from gmx::SimdFInt32 variable where boolean is false. More...

static SimdFInt32 gmx_simdcall	gmx::blend (SimdFInt32 a, SimdFInt32 b, SimdFIBool sel)
	Vector-blend SIMD integer selection. More...

Higher-level SIMD utility functions, double precision.
These include generic functions to work with triplets of data, typically coordinates, and a few utility functions to load and update data in the nonbonded kernels. These functions should be available on all implementations.
static const int	gmx::c_simdBestPairAlignmentDouble = 2
	Best alignment to use for aligned pairs of double data. More...

template<int align>
static void gmx_simdcall	gmx::gatherLoadTranspose (const double *base, const std::int32_t offset[], SimdDouble *v0, SimdDouble *v1, SimdDouble *v2, SimdDouble *v3)
	Load 4 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH offsets, and transpose into 4 SIMD double variables. More...

template<int align>
static void gmx_simdcall	gmx::gatherLoadTranspose (const double *base, const std::int32_t offset[], SimdDouble *v0, SimdDouble *v1)
	Load 2 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH offsets, and transpose into 2 SIMD double variables. More...

template<int align>
static void gmx_simdcall	gmx::gatherLoadUTranspose (const double *base, const std::int32_t offset[], SimdDouble *v0, SimdDouble *v1, SimdDouble *v2)
	Load 3 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH offsets, and transpose into 3 SIMD double variables. More...

template<int align>
static void gmx_simdcall	gmx::transposeScatterStoreU (double *base, const std::int32_t offset[], SimdDouble v0, SimdDouble v1, SimdDouble v2)
	Transpose and store 3 SIMD doubles to 3 consecutive addresses at GMX_SIMD_DOUBLE_WIDTH offsets. More...

template<int align>
static void gmx_simdcall	gmx::transposeScatterIncrU (double *base, const std::int32_t offset[], SimdDouble v0, SimdDouble v1, SimdDouble v2)
	Transpose and add 3 SIMD doubles to 3 consecutive addresses at GMX_SIMD_DOUBLE_WIDTH offsets. More...

template<int align>
static void gmx_simdcall	gmx::transposeScatterDecrU (double *base, const std::int32_t offset[], SimdDouble v0, SimdDouble v1, SimdDouble v2)
	Transpose and subtract 3 SIMD doubles from 3 consecutive addresses at GMX_SIMD_DOUBLE_WIDTH offsets. More...

static void gmx_simdcall	gmx::expandScalarsToTriplets (SimdDouble scalar, SimdDouble *triplets0, SimdDouble *triplets1, SimdDouble *triplets2)
	Expand each element of double SIMD variable into three identical consecutive elements in three SIMD outputs. More...
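To make the triplet expansion concrete, the scalar sketch below treats the three outputs as one flat sequence of 3*width elements in which element j receives input element j/3. `FakeSimd`, `kWidth` and `expandScalarsToTripletsRef` are hypothetical names; the real routine operates on SimdDouble registers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for this sketch: a 4-wide "register" as a plain array.
constexpr std::size_t kWidth = 4;
using FakeSimd               = std::array<double, kWidth>;

// Scalar model: every input element is repeated three times in a row,
// and the resulting 3*kWidth values are split over three outputs.
void expandScalarsToTripletsRef(const FakeSimd& scalar, FakeSimd* t0, FakeSimd* t1, FakeSimd* t2)
{
    assert(t0 != nullptr && t1 != nullptr && t2 != nullptr);
    FakeSimd* out[3] = { t0, t1, t2 };
    for (std::size_t j = 0; j < 3 * kWidth; j++)
    {
        (*out[j / kWidth])[j % kWidth] = scalar[j / 3];
    }
}
```

For an input of {s0, s1, s2, s3} this yields {s0, s0, s0, s1}, {s1, s1, s2, s2} and {s2, s3, s3, s3}, which lines up one scalar per x/y/z component of interleaved coordinate data.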

template<int align>
static void gmx_simdcall	gmx::gatherLoadBySimdIntTranspose (const double *base, SimdDInt32 offset, SimdDouble *v0, SimdDouble *v1, SimdDouble *v2, SimdDouble *v3)
	Load 4 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH offsets specified by a SIMD integer, transpose into 4 SIMD double variables. More...

template<int align>
static void gmx_simdcall	gmx::gatherLoadUBySimdIntTranspose (const double *base, SimdDInt32 offset, SimdDouble *v0, SimdDouble *v1)
	Load 2 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH offsets (unaligned) specified by SIMD integer, transpose into 2 SIMD doubles. More...

template<int align>
static void gmx_simdcall	gmx::gatherLoadBySimdIntTranspose (const double *base, SimdDInt32 offset, SimdDouble *v0, SimdDouble *v1)
	Load 2 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH offsets specified by a SIMD integer, transpose into 2 SIMD double variables. More...

static double gmx_simdcall	gmx::reduceIncr4ReturnSum (double *m, SimdDouble v0, SimdDouble v1, SimdDouble v2, SimdDouble v3)
	Reduce each of four SIMD doubles, add those values to four consecutive doubles in memory, return sum. More...
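The reduction described above can be modelled in scalar code as four horizontal sums that are added to memory and also returned as a total. This is a sketch under the assumption that the k-th register updates `m[k]`; `FakeSimd`, `kWidth` and `reduceIncr4ReturnSumRef` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>

// Hypothetical stand-ins for this sketch: a 4-wide "register" as a plain array.
constexpr std::size_t kWidth = 4;
using FakeSimd               = std::array<double, kWidth>;

// Scalar model: sum the lanes of each register, add sum k to m[k],
// and return the sum of all four horizontal sums.
double reduceIncr4ReturnSumRef(double*         m,
                               const FakeSimd& v0,
                               const FakeSimd& v1,
                               const FakeSimd& v2,
                               const FakeSimd& v3)
{
    assert(m != nullptr);
    const FakeSimd* v[4]  = { &v0, &v1, &v2, &v3 };
    double          total = 0.0;
    for (std::size_t k = 0; k < 4; k++)
    {
        const double sum = std::accumulate(v[k]->begin(), v[k]->end(), 0.0);
        m[k] += sum;
        total += sum;
    }
    return total;
}
```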

Higher-level SIMD utilities accessing partial (half-width) SIMD doubles.
See the single-precision versions for documentation. Since double precision is typically half the width of single, this double version is likely only useful with 512-bit and larger implementations.
static SimdDouble gmx_simdcall	gmx::loadDualHsimd (const double *m0, const double *m1)
	Load low & high parts of SIMD double from different locations. More...

static SimdDouble gmx_simdcall	gmx::loadDuplicateHsimd (const double *m)
	Load half-SIMD-width double data, spread to both halves. More...

static SimdDouble gmx_simdcall	gmx::loadU1DualHsimd (const double *m)
	Load two doubles, spread 1st in low half, 2nd in high half. More...

static void gmx_simdcall	gmx::storeDualHsimd (double *m0, double *m1, SimdDouble a)
	Store low & high parts of SIMD double to different locations. More...

static void gmx_simdcall	gmx::incrDualHsimd (double *m0, double *m1, SimdDouble a)
	Add each half of SIMD variable to separate memory addresses. More...

static void gmx_simdcall	gmx::decr3Hsimd (double *m, SimdDouble a0, SimdDouble a1, SimdDouble a2)
	Add the two halves of three SIMD doubles, subtract the sum from three half-SIMD-width consecutive doubles in memory. More...

template<int align>
static void gmx_simdcall	gmx::gatherLoadTransposeHsimd (const double *base0, const double *base1, std::int32_t offset[], SimdDouble *v0, SimdDouble *v1)
	Load 2 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH/2 offsets, transpose into SIMD double (low half from base0, high from base1). More...

static double gmx_simdcall	gmx::reduceIncr4ReturnSumHsimd (double *m, SimdDouble v0, SimdDouble v1)
	Reduce the 4 half-SIMD-width doubles in 2 SIMD variables (sum halves), increment four consecutive doubles in memory, return sum. More...

static SimdDouble gmx_simdcall	gmx::loadUNDuplicate4 (const double *m)
	Load N doubles and duplicate them 4 times each. More...

static SimdDouble gmx_simdcall	gmx::load4DuplicateN (const double *m)
	Load 4 doubles and duplicate them N times each. More...

static SimdDouble gmx_simdcall	gmx::loadU4NOffset (const double *m, int offset)
	Load doubles in blocks of 4 at fixed offsets. More...
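The two duplicate-load routines, loadUNDuplicate4 and load4DuplicateN, differ only in how the loaded values are laid out across the register. The scalar sketch below shows one plausible reading for a width of 8 (so N = 2). The lane ordering is an assumption, not something this page states, and `FakeSimd`, `kWidth`, `loadUNDuplicate4Ref` and `load4DuplicateNRef` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for this sketch: an 8-wide "register", so N = 2.
constexpr std::size_t kWidth = 8;
static_assert(kWidth % 4 == 0, "width must be a multiple of 4");
using FakeSimd = std::array<double, kWidth>;

// Load N = kWidth/4 values; each one fills 4 adjacent lanes (assumed layout).
FakeSimd loadUNDuplicate4Ref(const double* m)
{
    assert(m != nullptr);
    FakeSimd r{};
    for (std::size_t i = 0; i < kWidth; i++)
    {
        r[i] = m[i / 4];
    }
    return r;
}

// Load 4 values; the block of 4 is repeated N times (assumed layout).
FakeSimd load4DuplicateNRef(const double* m)
{
    assert(m != nullptr);
    FakeSimd r{};
    for (std::size_t i = 0; i < kWidth; i++)
    {
        r[i] = m[i % 4];
    }
    return r;
}
```

Under these assumptions, the input {1, 2, 3, 4} gives {1, 1, 1, 1, 2, 2, 2, 2} for the first routine and {1, 2, 3, 4, 1, 2, 3, 4} for the second.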

Higher-level SIMD utility functions, single precision.
These include generic functions to work with triplets of data, typically coordinates, and a few utility functions to load and update data in the nonbonded kernels. These functions should be available on all implementations, although some wide SIMD implementations (width>=8) also provide special optional versions to work with half or quarter registers to improve the performance in the nonbonded kernels.
static const int	gmx::c_simdBestPairAlignmentFloat = 2
	Best alignment to use for aligned pairs of float data. More...

template<int align>
static void gmx_simdcall	gmx::gatherLoadTranspose (const float *base, const std::int32_t offset[], SimdFloat *v0, SimdFloat *v1, SimdFloat *v2, SimdFloat *v3)
	Load 4 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH offsets, and transpose into 4 SIMD float variables. More...

template<int align>
static void gmx_simdcall	gmx::gatherLoadTranspose (const float *base, const std::int32_t offset[], SimdFloat *v0, SimdFloat *v1)
	Load 2 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH offsets, and transpose into 2 SIMD float variables. More...

template<int align>
static void gmx_simdcall	gmx::gatherLoadUTranspose (const float *base, const std::int32_t offset[], SimdFloat *v0, SimdFloat *v1, SimdFloat *v2)
	Load 3 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH offsets, and transpose into 3 SIMD float variables. More...

template<int align>
static void gmx_simdcall	gmx::transposeScatterStoreU (float *base, const std::int32_t offset[], SimdFloat v0, SimdFloat v1, SimdFloat v2)
	Transpose and store 3 SIMD floats to 3 consecutive addresses at GMX_SIMD_FLOAT_WIDTH offsets. More...

template<int align>
static void gmx_simdcall	gmx::transposeScatterIncrU (float *base, const std::int32_t offset[], SimdFloat v0, SimdFloat v1, SimdFloat v2)
	Transpose and add 3 SIMD floats to 3 consecutive addresses at GMX_SIMD_FLOAT_WIDTH offsets. More...

template<int align>
static void gmx_simdcall	gmx::transposeScatterDecrU (float *base, const std::int32_t offset[], SimdFloat v0, SimdFloat v1, SimdFloat v2)
	Transpose and subtract 3 SIMD floats from 3 consecutive addresses at GMX_SIMD_FLOAT_WIDTH offsets. More...

static void gmx_simdcall	gmx::expandScalarsToTriplets (SimdFloat scalar, SimdFloat *triplets0, SimdFloat *triplets1, SimdFloat *triplets2)
	Expand each element of float SIMD variable into three identical consecutive elements in three SIMD outputs. More...

template<int align>
static void gmx_simdcall	gmx::gatherLoadBySimdIntTranspose (const float *base, SimdFInt32 offset, SimdFloat *v0, SimdFloat *v1, SimdFloat *v2, SimdFloat *v3)
	Load 4 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH offsets specified by a SIMD integer, transpose into 4 SIMD float variables. More...

template<int align>
static void gmx_simdcall	gmx::gatherLoadUBySimdIntTranspose (const float *base, SimdFInt32 offset, SimdFloat *v0, SimdFloat *v1)
	Load 2 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH offsets (unaligned) specified by SIMD integer, transpose into 2 SIMD floats. More...

template<int align>
static void gmx_simdcall	gmx::gatherLoadBySimdIntTranspose (const float *base, SimdFInt32 offset, SimdFloat *v0, SimdFloat *v1)
	Load 2 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH offsets specified by a SIMD integer, transpose into 2 SIMD float variables. More...

static float gmx_simdcall	gmx::reduceIncr4ReturnSum (float *m, SimdFloat v0, SimdFloat v1, SimdFloat v2, SimdFloat v3)
	Reduce each of four SIMD floats, add those values to four consecutive floats in memory, return sum. More...

Higher-level SIMD utilities accessing partial (half-width) SIMD floats.
These functions are optional. They are only useful for SIMD implementations where the width is 8 or larger, and where it would be inefficient to process 4*8, 8*8, or more, interactions in parallel. Currently, only Intel provides very wide SIMD implementations, but these also come with excellent support for loading, storing, accessing and shuffling parts of the register in so-called 'lanes' of 4 bytes each. We can use this to load separate parts into the low/high halves of the register in the inner loop of the nonbonded kernel, which e.g. makes it possible to process 4*4 nonbonded interactions as a pattern of 2*8. We can also use implementations with width 16 or greater.

To make this more generic, when GMX_SIMD_HAVE_HSIMD_UTIL_REAL is 1, the SIMD implementation provides seven special routines that:
- Load the low/high parts of a SIMD variable from different pointers.
- Load half the SIMD width from one pointer, and duplicate in low/high parts.
- Load two reals, put the 1st one in all low elements, and the 2nd in all high ones.
- Store the low/high parts of a SIMD variable to different pointers.
- Subtract both SIMD halves from a single half-SIMD-width memory location.
- Load aligned pairs (LJ parameters) from two base pointers, with a common offset list, and put these in the low/high SIMD halves.
- Reduce each half of two SIMD registers (i.e., 4 parts in total), increment four adjacent memory positions, and return the total sum.

Remember: this is ONLY used when the native SIMD width is large. You will just waste time if you implement it for normal 16-byte SIMD architectures.

This is part of the new C++ SIMD interface, so these functions are only available when using C++. Since some Gromacs code relying on the SIMD module is still C (not C++), we have kept the C-style naming for now - this will change once we are entirely C++.
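The low/high split can be pictured with a scalar model of three of the routines listed above. The sketch assumes a width of 8 floats and assumes that the four partial sums are ordered as low and high half of the first register, then low and high half of the second. `FakeSimd`, `kWidth`, `kHalf` and the `*Ref` functions are hypothetical names, not GROMACS symbols.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for this sketch: an 8-wide "register" with two halves.
constexpr std::size_t kWidth = 8;
constexpr std::size_t kHalf  = kWidth / 2;
using FakeSimd               = std::array<float, kWidth>;

// Low half comes from m0, high half from m1.
FakeSimd loadDualHsimdRef(const float* m0, const float* m1)
{
    assert(m0 != nullptr && m1 != nullptr);
    FakeSimd r{};
    for (std::size_t i = 0; i < kHalf; i++)
    {
        r[i]         = m0[i];
        r[kHalf + i] = m1[i];
    }
    return r;
}

// m[0] fills every low lane, m[1] fills every high lane.
FakeSimd loadU1DualHsimdRef(const float* m)
{
    assert(m != nullptr);
    FakeSimd r{};
    for (std::size_t i = 0; i < kWidth; i++)
    {
        r[i] = m[i / kHalf];
    }
    return r;
}

// Sum each half of v0 and v1 (4 parts, assumed order), add part k to m[k],
// and return the total of all parts.
float reduceIncr4ReturnSumHsimdRef(float* m, const FakeSimd& v0, const FakeSimd& v1)
{
    assert(m != nullptr);
    const FakeSimd* v[2]  = { &v0, &v1 };
    float           total = 0.0F;
    for (std::size_t part = 0; part < 4; part++)
    {
        const FakeSimd&   src   = *v[part / 2];
        const std::size_t begin = (part % 2) * kHalf;
        float             sum   = 0.0F;
        for (std::size_t i = begin; i < begin + kHalf; i++)
        {
            sum += src[i];
        }
        m[part] += sum;
        total += sum;
    }
    return total;
}
```

In this model one 8-wide register carries two independent groups of 4 lanes, which is how a 4*4 interaction block can be processed as 2*8.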
static SimdFloat gmx_simdcall	gmx::loadDualHsimd (const float *m0, const float *m1)
	Load low & high parts of SIMD float from different locations. More...

static SimdFloat gmx_simdcall	gmx::loadDuplicateHsimd (const float *m)
	Load half-SIMD-width float data, spread to both halves. More...

static SimdFloat gmx_simdcall	gmx::loadU1DualHsimd (const float *m)
	Load two floats, spread 1st in low half, 2nd in high half. More...

static void gmx_simdcall	gmx::storeDualHsimd (float *m0, float *m1, SimdFloat a)
	Store low & high parts of SIMD float to different locations. More...

static void gmx_simdcall	gmx::incrDualHsimd (float *m0, float *m1, SimdFloat a)
	Add each half of SIMD variable to separate memory addresses. More...

static void gmx_simdcall	gmx::decr3Hsimd (float *m, SimdFloat a0, SimdFloat a1, SimdFloat a2)
	Add the two halves of three SIMD floats, subtract the sum from three half-SIMD-width consecutive floats in memory. More...

template<int align>
static void gmx_simdcall	gmx::gatherLoadTransposeHsimd (const float *base0, const float *base1, const std::int32_t offset[], SimdFloat *v0, SimdFloat *v1)
	Load 2 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH/2 offsets, transpose into SIMD float (low half from base0, high from base1). More...

static float gmx_simdcall	gmx::reduceIncr4ReturnSumHsimd (float *m, SimdFloat v0, SimdFloat v1)
	Reduce the 4 half-SIMD-width floats in 2 SIMD variables (sum halves), increment four consecutive floats in memory, return sum. More...

static SimdFloat gmx_simdcall	gmx::loadUNDuplicate4 (const float *m)
	Load N floats and duplicate them 4 times each. More...

static SimdFloat gmx_simdcall	gmx::load4DuplicateN (const float *m)
	Load 4 floats and duplicate them N times each. More...

static SimdFloat gmx_simdcall	gmx::loadU4NOffset (const float *m, int offset)
	Load floats in blocks of 4 at fixed offsets. More...

SIMD predefined macros to describe high-level capabilities
These macros are used to describe the features available in default Gromacs real precision. They are set from the lower-level implementation files that have macros describing single and double precision individually, as well as the implementation details.
#define	GMX_SIMD_HAVE_REAL GMX_SIMD_HAVE_FLOAT
	1 if SimdReal is available, otherwise 0. More...

#define	GMX_SIMD_REAL_WIDTH GMX_SIMD_FLOAT_WIDTH
	Width of SimdReal. More...

#define	GMX_SIMD_HAVE_INT32_EXTRACT GMX_SIMD_HAVE_FINT32_EXTRACT
	1 if support is available for extracting elements from SimdInt32, otherwise 0 More...

#define	GMX_SIMD_HAVE_INT32_LOGICAL GMX_SIMD_HAVE_FINT32_LOGICAL
	1 if logical ops are supported on SimdInt32, otherwise 0. More...

#define	GMX_SIMD_HAVE_INT32_ARITHMETICS GMX_SIMD_HAVE_FINT32_ARITHMETICS
	1 if arithmetic ops are supported on SimdInt32, otherwise 0. More...

#define	GMX_SIMD_HAVE_GATHER_LOADU_BYSIMDINT_TRANSPOSE_REAL GMX_SIMD_HAVE_GATHER_LOADU_BYSIMDINT_TRANSPOSE_FLOAT
	1 if gmx::gatherLoadUBySimdIntTranspose is present, otherwise 0. More...

#define	GMX_SIMD_HAVE_HSIMD_UTIL_REAL GMX_SIMD_HAVE_HSIMD_UTIL_FLOAT
	1 if real half-register load/store/reduce utils present, otherwise 0 More...

#define	GMX_SIMD4_HAVE_REAL GMX_SIMD4_HAVE_FLOAT
	1 if Simd4Real is available, otherwise 0. More...
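Capability macros of this kind are typically consumed in preprocessor conditionals so that a scalar fallback is compiled when no SIMD support exists. The sketch below is illustrative only: it defines placeholder values for GMX_SIMD_HAVE_REAL and GMX_SIMD_REAL_WIDTH when they are not already set (in GROMACS they come from the SIMD headers), and `chunkWidth` and `paddedSize` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>

// Placeholder values for this sketch only; real builds get these from the SIMD headers.
#ifndef GMX_SIMD_HAVE_REAL
#    define GMX_SIMD_HAVE_REAL 1
#endif
#ifndef GMX_SIMD_REAL_WIDTH
#    define GMX_SIMD_REAL_WIDTH 8
#endif

// Number of reals processed per step: the SIMD width when available, else 1.
constexpr std::size_t chunkWidth()
{
#if GMX_SIMD_HAVE_REAL
    return GMX_SIMD_REAL_WIDTH;
#else
    return 1;
#endif
}

// Round an element count up to a whole number of chunks, e.g. to size a padded buffer.
std::size_t paddedSize(std::size_t n)
{
    constexpr std::size_t w = chunkWidth();
    static_assert(w > 0, "chunk width must be positive");
    const std::size_t padded = (n + w - 1) / w * w;
    assert(padded >= n); // guards against wrap-around for huge n
    return padded;
}
```

On this page the real-precision macros alias the float ones, which is what a single-precision build would be expected to show.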

Single precision SIMD math functions
Note In most cases you should use the real-precision functions instead.
static SimdFloat gmx_simdcall	gmx::copysign (SimdFloat x, SimdFloat y)
	Composes floating point value with the magnitude of x and the sign of y. More...

static SimdFloat gmx_simdcall	gmx::rsqrtIter (SimdFloat lu, SimdFloat x)
	Perform one Newton-Raphson iteration to improve 1/sqrt(x) for SIMD float. More...

static SimdFloat gmx_simdcall	gmx::invsqrt (SimdFloat x)
	Calculate 1/sqrt(x) for SIMD float. More...

static void gmx_simdcall	gmx::invsqrtPair (SimdFloat x0, SimdFloat x1, SimdFloat *out0, SimdFloat *out1)
	Calculate 1/sqrt(x) for two SIMD floats. More...

static SimdFloat gmx_simdcall	gmx::rcpIter (SimdFloat lu, SimdFloat x)
	Perform one Newton-Raphson iteration to improve 1/x for SIMD float. More...
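For reference, the standard Newton-Raphson updates for 1/sqrt(x) and 1/x are y' = y*(1.5 - 0.5*x*y*y) and y' = y*(2 - x*y), where y is the current estimate (the `lu` argument). The scalar sketch below applies those textbook formulas; it is not taken from the GROMACS sources, and `rsqrtIterRef`, `rcpIterRef` and `refineRsqrt` are hypothetical names.

```cpp
#include <cassert>

// One Newton-Raphson step for 1/sqrt(x), starting from the estimate lu.
inline float rsqrtIterRef(float lu, float x)
{
    return lu * (1.5F - 0.5F * x * lu * lu);
}

// One Newton-Raphson step for 1/x, starting from the estimate lu.
inline float rcpIterRef(float lu, float x)
{
    return lu * (2.0F - x * lu);
}

// Apply several 1/sqrt(x) steps in a row to a rough starting estimate.
inline float refineRsqrt(float lu, float x, int iterations)
{
    assert(iterations >= 0);
    for (int i = 0; i < iterations; i++)
    {
        lu = rsqrtIterRef(lu, x);
    }
    return lu;
}
```

Each step roughly doubles the number of correct bits, so a coarse hardware estimate needs only a few iterations to reach full precision.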

static SimdFloat gmx_simdcall	gmx::inv (SimdFloat x)
	Calculate 1/x for SIMD float. More...

static SimdFloat gmx_simdcall	gmx::operator/ (SimdFloat nom, SimdFloat denom)
	Division for SIMD floats. More...

static SimdFloat	gmx::maskzInvsqrt (SimdFloat x, SimdFBool m)
	Calculate 1/sqrt(x) for masked entries of SIMD float. More...

static SimdFloat gmx_simdcall	gmx::maskzInv (SimdFloat x, SimdFBool m)
	Calculate 1/x for SIMD float, masked version. More...
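The masked variants are useful when some lanes hold values, such as zero, for which the operation must not be evaluated. A scalar sketch of the idea follows, assuming that masked-out lanes return 0 (the usual meaning of a zeroing mask); `FakeSimd`, `FakeBool`, `kWidth` and `maskzInvRef` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for this sketch: 4-wide value and mask "registers".
constexpr std::size_t kWidth = 4;
using FakeSimd               = std::array<float, kWidth>;
using FakeBool               = std::array<bool, kWidth>;

// Scalar model: compute 1/x where the mask is true, return 0 elsewhere.
// The division is never performed for masked-out lanes.
FakeSimd maskzInvRef(const FakeSimd& x, const FakeBool& m)
{
    FakeSimd r{};
    for (std::size_t i = 0; i < kWidth; i++)
    {
        assert(!m[i] || x[i] != 0.0F); // selected lanes must be invertible
        r[i] = m[i] ? 1.0F / x[i] : 0.0F;
    }
    return r;
}
```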

template<MathOptimization opt = MathOptimization::Safe>
static SimdFloat gmx_simdcall	gmx::sqrt (SimdFloat x)
	Calculate sqrt(x) for SIMD floats. More...

static SimdFloat gmx_simdcall	gmx::cbrt (SimdFloat x)
	Cube root for SIMD floats. More...

static SimdFloat gmx_simdcall	gmx::invcbrt (SimdFloat x)
	Inverse cube root for SIMD floats. More...

static SimdFloat gmx_simdcall	gmx::log2 (SimdFloat x)
	SIMD float log2(x). This is the base-2 logarithm. More...

static SimdFloat gmx_simdcall	gmx::log (SimdFloat x)
	SIMD float log(x). This is the natural logarithm. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdFloat gmx_simdcall	gmx::exp2 (SimdFloat x)
	SIMD float 2^x. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdFloat gmx_simdcall	gmx::exp (SimdFloat x)
	SIMD float exp(x). More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdFloat gmx_simdcall	gmx::pow (SimdFloat x, SimdFloat y)
	SIMD float pow(x,y) More...

static SimdFloat gmx_simdcall	gmx::erf (SimdFloat x)
	SIMD float erf(x). More...

static SimdFloat gmx_simdcall	gmx::erfc (SimdFloat x)
	SIMD float erfc(x). More...

static void gmx_simdcall	gmx::sincos (SimdFloat x, SimdFloat *sinval, SimdFloat *cosval)
	SIMD float sin & cos. More...

static SimdFloat gmx_simdcall	gmx::sin (SimdFloat x)
	SIMD float sin(x). More...

static SimdFloat gmx_simdcall	gmx::cos (SimdFloat x)
	SIMD float cos(x). More...

static SimdFloat gmx_simdcall	gmx::tan (SimdFloat x)
	SIMD float tan(x). More...

static SimdFloat gmx_simdcall	gmx::asin (SimdFloat x)
	SIMD float asin(x). More...

static SimdFloat gmx_simdcall	gmx::acos (SimdFloat x)
	SIMD float acos(x). More...

static SimdFloat gmx_simdcall	gmx::atan (SimdFloat x)
	SIMD float atan(x). More...

static SimdFloat gmx_simdcall	gmx::atan2 (SimdFloat y, SimdFloat x)
	SIMD float atan2(y,x). More...

static SimdFloat gmx_simdcall	gmx::pmeForceCorrection (SimdFloat z2)
	Calculate the force correction due to PME analytically in SIMD float. More...

static SimdFloat gmx_simdcall	gmx::pmePotentialCorrection (SimdFloat z2)
	Calculate the potential correction due to PME analytically in SIMD float. More...

Double precision SIMD math functions
Note In most cases you should use the real-precision functions instead.
static SimdDouble gmx_simdcall	gmx::copysign (SimdDouble x, SimdDouble y)
	Composes floating point value with the magnitude of x and the sign of y. More...

static SimdDouble gmx_simdcall	gmx::rsqrtIter (SimdDouble lu, SimdDouble x)
	Perform one Newton-Raphson iteration to improve 1/sqrt(x) for SIMD double. More...

static SimdDouble gmx_simdcall	gmx::invsqrt (SimdDouble x)
	Calculate 1/sqrt(x) for SIMD double. More...

static void gmx_simdcall	gmx::invsqrtPair (SimdDouble x0, SimdDouble x1, SimdDouble *out0, SimdDouble *out1)
	Calculate 1/sqrt(x) for two SIMD doubles. More...

static SimdDouble gmx_simdcall	gmx::rcpIter (SimdDouble lu, SimdDouble x)
	Perform one Newton-Raphson iteration to improve 1/x for SIMD double. More...

static SimdDouble gmx_simdcall	gmx::inv (SimdDouble x)
	Calculate 1/x for SIMD double. More...

static SimdDouble gmx_simdcall	gmx::operator/ (SimdDouble nom, SimdDouble denom)
	Division for SIMD doubles. More...

static SimdDouble	gmx::maskzInvsqrt (SimdDouble x, SimdDBool m)
	Calculate 1/sqrt(x) for masked entries of SIMD double. More...

static SimdDouble gmx_simdcall	gmx::maskzInv (SimdDouble x, SimdDBool m)
	Calculate 1/x for SIMD double, masked version. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdDouble gmx_simdcall	gmx::sqrt (SimdDouble x)
	Calculate sqrt(x) for SIMD doubles. More...

static SimdDouble gmx_simdcall	gmx::cbrt (SimdDouble x)
	Cube root for SIMD doubles. More...

static SimdDouble gmx_simdcall	gmx::invcbrt (SimdDouble x)
	Inverse cube root for SIMD doubles. More...

static SimdDouble gmx_simdcall	gmx::log2 (SimdDouble x)
	SIMD double log2(x). This is the base-2 logarithm. More...

static SimdDouble gmx_simdcall	gmx::log (SimdDouble x)
	SIMD double log(x). This is the natural logarithm. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdDouble gmx_simdcall	gmx::exp2 (SimdDouble x)
	SIMD double 2^x. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdDouble gmx_simdcall	gmx::exp (SimdDouble x)
	SIMD double exp(x). More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdDouble gmx_simdcall	gmx::pow (SimdDouble x, SimdDouble y)
	SIMD double pow(x,y) More...

static SimdDouble gmx_simdcall	gmx::erf (SimdDouble x)
	SIMD double erf(x). More...

static SimdDouble gmx_simdcall	gmx::erfc (SimdDouble x)
	SIMD double erfc(x). More...

static void gmx_simdcall	gmx::sincos (SimdDouble x, SimdDouble *sinval, SimdDouble *cosval)
	SIMD double sin & cos. More...

static SimdDouble gmx_simdcall	gmx::sin (SimdDouble x)
	SIMD double sin(x). More...

static SimdDouble gmx_simdcall	gmx::cos (SimdDouble x)
	SIMD double cos(x). More...

static SimdDouble gmx_simdcall	gmx::tan (SimdDouble x)
	SIMD double tan(x). More...

static SimdDouble gmx_simdcall	gmx::asin (SimdDouble x)
	SIMD double asin(x). More...

static SimdDouble gmx_simdcall	gmx::acos (SimdDouble x)
	SIMD double acos(x). More...

static SimdDouble gmx_simdcall	gmx::atan (SimdDouble x)
	SIMD double atan(x). More...

static SimdDouble gmx_simdcall	gmx::atan2 (SimdDouble y, SimdDouble x)
	SIMD double atan2(y,x). More...

static SimdDouble gmx_simdcall	gmx::pmeForceCorrection (SimdDouble z2)
	Calculate the force correction due to PME analytically in SIMD double. More...

static SimdDouble gmx_simdcall	gmx::pmePotentialCorrection (SimdDouble z2)
	Calculate the potential correction due to PME analytically in SIMD double. More...

SIMD math functions for double prec. data, single prec. accuracy
Note In some cases we do not need full double accuracy of individual SIMD math functions, although the data is stored in double precision SIMD registers. This might be the case for special algorithms, or if the architecture does not support single precision. Since the full double precision evaluation of math functions typically requires much more expensive polynomial approximations, these functions implement the algorithms used in the single precision SIMD math functions, but they operate on double precision SIMD variables.
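For the iteratively refined functions (reciprocal and inverse square root), the saving can be estimated from the fact that each Newton-Raphson step roughly doubles the number of correct bits. The sketch below counts the steps needed to reach a target accuracy; the starting accuracy of 11 bits used in the usage note is a made-up example, and `newtonIterationsNeeded` is a hypothetical helper.

```cpp
#include <cassert>

// Count Newton-Raphson steps needed to go from initialBits to at least
// targetBits of accuracy, assuming every step doubles the correct bits.
int newtonIterationsNeeded(int initialBits, int targetBits)
{
    assert(initialBits > 0);
    int iterations = 0;
    for (int bits = initialBits; bits < targetBits; bits *= 2)
    {
        iterations++;
    }
    return iterations;
}
```

Starting from a hypothetical 11-bit estimate, reaching a 24-bit single-precision mantissa takes 2 steps while a 53-bit double-precision mantissa takes 3. The polynomial-based functions save work in a different way, by using shorter polynomials.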
static SimdDouble gmx_simdcall	gmx::invsqrtSingleAccuracy (SimdDouble x)
	Calculate 1/sqrt(x) for SIMD double, but in single accuracy. More...

static SimdDouble	gmx::maskzInvsqrtSingleAccuracy (SimdDouble x, SimdDBool m)
	1/sqrt(x) for masked-in entries of SIMD double, but in single accuracy. More...

static void gmx_simdcall	gmx::invsqrtPairSingleAccuracy (SimdDouble x0, SimdDouble x1, SimdDouble *out0, SimdDouble *out1)
	Calculate 1/sqrt(x) for two SIMD doubles, but single accuracy. More...

static SimdDouble gmx_simdcall	gmx::invSingleAccuracy (SimdDouble x)
	Calculate 1/x for SIMD double, but in single accuracy. More...

static SimdDouble gmx_simdcall	gmx::maskzInvSingleAccuracy (SimdDouble x, SimdDBool m)
	1/x for masked entries of SIMD double, single accuracy. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdDouble gmx_simdcall	gmx::sqrtSingleAccuracy (SimdDouble x)
	Calculate sqrt(x) (correct for 0.0) for SIMD double, with single accuracy. More...

static SimdDouble gmx_simdcall	gmx::cbrtSingleAccuracy (SimdDouble x)
	Cube root for SIMD doubles, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::invcbrtSingleAccuracy (SimdDouble x)
	Inverse cube root for SIMD doubles, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::log2SingleAccuracy (SimdDouble x)
	SIMD log2(x). Double precision SIMD data, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::logSingleAccuracy (SimdDouble x)
	SIMD log(x). Double precision SIMD data, single accuracy. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdDouble gmx_simdcall	gmx::exp2SingleAccuracy (SimdDouble x)
	SIMD 2^x. Double precision SIMD, single accuracy. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdDouble gmx_simdcall	gmx::expSingleAccuracy (SimdDouble x)
	SIMD exp(x). Double precision SIMD, single accuracy. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdDouble gmx_simdcall	gmx::powSingleAccuracy (SimdDouble x, SimdDouble y)
	SIMD pow(x,y). Double precision SIMD data, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::erfSingleAccuracy (SimdDouble x)
	SIMD erf(x). Double precision SIMD data, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::erfcSingleAccuracy (SimdDouble x)
	SIMD erfc(x). Double precision SIMD data, single accuracy. More...

static void gmx_simdcall	gmx::sinCosSingleAccuracy (SimdDouble x, SimdDouble *sinval, SimdDouble *cosval)
	SIMD sin & cos. Double precision SIMD data, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::sinSingleAccuracy (SimdDouble x)
	SIMD sin(x). Double precision SIMD data, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::cosSingleAccuracy (SimdDouble x)
	SIMD cos(x). Double precision SIMD data, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::tanSingleAccuracy (SimdDouble x)
	SIMD tan(x). Double precision SIMD data, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::asinSingleAccuracy (SimdDouble x)
	SIMD asin(x). Double precision SIMD data, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::acosSingleAccuracy (SimdDouble x)
	SIMD acos(x). Double precision SIMD data, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::atanSingleAccuracy (SimdDouble x)
	SIMD atan(x). Double precision SIMD data, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::atan2SingleAccuracy (SimdDouble y, SimdDouble x)
	SIMD atan2(y,x). Double precision SIMD data, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::pmeForceCorrectionSingleAccuracy (SimdDouble z2)
	Analytical PME force correction, double SIMD data, single accuracy. More...

static SimdDouble gmx_simdcall	gmx::pmePotentialCorrectionSingleAccuracy (SimdDouble z2)
	Analytical PME potential correction, double SIMD data, single accuracy. More...

SIMD4 math functions
Note Only a subset of the math functions are implemented for SIMD4.
static Simd4Float gmx_simdcall	gmx::rsqrtIter (Simd4Float lu, Simd4Float x)
	Perform one Newton-Raphson iteration to improve 1/sqrt(x) for SIMD4 float. More...

static Simd4Float gmx_simdcall	gmx::invsqrt (Simd4Float x)
	Calculate 1/sqrt(x) for SIMD4 float. More...

static Simd4Double gmx_simdcall	gmx::rsqrtIter (Simd4Double lu, Simd4Double x)
	Perform one Newton-Raphson iteration to improve 1/sqrt(x) for SIMD4 double. More...

static Simd4Double gmx_simdcall	gmx::invsqrt (Simd4Double x)
	Calculate 1/sqrt(x) for SIMD4 double. More...

static Simd4Double gmx_simdcall	gmx::invsqrtSingleAccuracy (Simd4Double x)
	Calculate 1/sqrt(x) for SIMD4 double, but in single accuracy. More...

Classes
class	gmx::Simd4Double
	SIMD4 double type. More...

class	gmx::Simd4DBool
	SIMD4 variable type to use for logical comparisons on doubles. More...

class	gmx::Simd4Float
	SIMD4 float type. More...

class	gmx::Simd4FBool
	SIMD4 variable type to use for logical comparisons on floats. More...

class	gmx::SimdDouble
	Double SIMD variable. Available if GMX_SIMD_HAVE_DOUBLE is 1. More...

class	gmx::SimdDInt32
	Integer SIMD variable type to use for conversions to/from double. More...

class	gmx::SimdDBool
	Boolean type for double SIMD data. More...

class	gmx::SimdDIBool
	Boolean type for integer datatypes corresponding to double SIMD. More...

class	gmx::SimdFloat
	Float SIMD variable. Available if GMX_SIMD_HAVE_FLOAT is 1. More...

class	gmx::SimdFInt32
	Integer SIMD variable type to use for conversions to/from float. More...

class	gmx::SimdFBool
	Boolean type for float SIMD data. More...

class	gmx::SimdFIBool
	Boolean type for integer datatypes corresponding to float SIMD. More...

class	gmx::test::SimdBaseTest
	Base class for SIMD test fixtures. More...

class	gmx::test::SimdTest
	Test fixture for SIMD tests. More...

class	gmx::test::Simd4Test
	Test fixture for SIMD4 tests - contains test settings. More...

class	gmx::test::anonymous_namespace{simd_floatingpoint_util.cpp}::SimdFloatingpointUtilTest
	Test fixture for higher-level floating-point utility functions. More...

Macros
#define	GMX_EXPECT_SIMD_REAL_EQ(ref, tst) EXPECT_PRED_FORMAT2(compareSimdEq, ref, tst)
	Test if a SIMD real is bitwise identical to reference SIMD value.

#define	GMX_EXPECT_SIMD_EQ(ref, tst) EXPECT_PRED_FORMAT2(compareSimdEq, ref, tst)
	Test if a SIMD is bitwise identical to reference SIMD value.

#define	GMX_EXPECT_SIMD_REAL_NEAR(ref, tst) EXPECT_PRED_FORMAT2(compareSimdRealUlp, ref, tst)
	Test if a SIMD real is within tolerance of reference SIMD value.

#define	GMX_EXPECT_SIMD_INT_EQ(ref, tst) EXPECT_PRED_FORMAT2(compareSimdEq, ref, tst)
	Macro that checks SIMD integer expression against SIMD or reference int. More...

#define	GMX_EXPECT_SIMD4_REAL_EQ(ref, tst) EXPECT_PRED_FORMAT2(compareSimd4RealEq, ref, tst)
	Test if a SIMD4 real is bitwise identical to reference SIMD4 value.

#define	GMX_EXPECT_SIMD4_REAL_NEAR(ref, tst) EXPECT_PRED_FORMAT2(compareSimd4RealUlp, ref, tst)
	Test if a SIMD4 real is within tolerance of reference SIMD4 value.

#define	GMX_EXPECT_SIMD_FUNC_NEAR(refFunc, tstFunc, compareSettings) EXPECT_PRED_FORMAT3(compareSimdMathFunction, refFunc, tstFunc, compareSettings)
	Test approximate equality of SIMD vs reference version of a function. More...
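The predicate names used by the NEAR macros (compareSimdRealUlp, compareSimd4RealUlp) suggest that tolerance is measured in units in the last place (ULP). As a rough illustration of that idea only, the sketch below computes the ULP distance between two floats by mapping their bit patterns onto a monotonic integer line. `orderedBits`, `ulpDistance` and `nearInUlps` are hypothetical helpers, not the GROMACS test code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Map a float's sign-magnitude bit pattern to an integer that increases
// monotonically with the float value (+0.0 and -0.0 both map to 0).
std::int64_t orderedBits(float f)
{
    static_assert(sizeof(float) == sizeof(std::int32_t), "float must be 32 bits");
    std::int32_t i;
    std::memcpy(&i, &f, sizeof f);
    return i < 0 ? static_cast<std::int64_t>(std::numeric_limits<std::int32_t>::min()) - i : i;
}

// Number of representable floats between a and b (0 if they are equal).
std::int64_t ulpDistance(float a, float b)
{
    assert(!std::isnan(a) && !std::isnan(b)); // NaN has no meaningful distance
    const std::int64_t d = orderedBits(a) - orderedBits(b);
    return d < 0 ? -d : d;
}

// True if tst is within maxUlps representable values of ref.
bool nearInUlps(float ref, float tst, std::int64_t maxUlps)
{
    return ulpDistance(ref, tst) <= maxUlps;
}
```

A ULP-based tolerance scales with the magnitude of the reference value, which makes it a natural fit for comparing math functions over a wide input range.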

Typedefs
typedef Simd4Test	gmx::test::anonymous_namespace{simd4_floatingpoint.cpp}::Simd4FloatingpointTest
	Test fixture for SIMD4 floating-point operations (identical to the SIMD4 Simd4Test)

typedef Simd4Test	gmx::test::anonymous_namespace{simd4_vector_operations.cpp}::Simd4VectorOperationsTest
	Test fixture for SIMD4 vector operations (identical to the SIMD4 Simd4Test)

typedef SimdTest	gmx::test::anonymous_namespace{simd_floatingpoint.cpp}::SimdFloatingpointTest
	Test fixture for floating-point tests (identical to the generic SimdTest)

typedef SimdTest	gmx::test::anonymous_namespace{simd_integer.cpp}::SimdIntegerTest
	Test fixture for integer tests (identical to the generic SimdTest)

typedef SimdTest	gmx::test::anonymous_namespace{simd_vector_operations.cpp}::SimdVectorOperationsTest
	Test fixture for vector operations tests (identical to the generic SimdTest) More...

Functions
static SimdFloat gmx_simdcall	gmx::invsqrtSingleAccuracy (SimdFloat x)
	Calculate 1/sqrt(x) for SIMD float, only targeting single accuracy. More...

static SimdFloat	gmx::maskzInvsqrtSingleAccuracy (SimdFloat x, SimdFBool m)
	Calculate 1/sqrt(x) for masked SIMD floats, only targeting single accuracy. More...

static void gmx_simdcall	gmx::invsqrtPairSingleAccuracy (SimdFloat x0, SimdFloat x1, SimdFloat *out0, SimdFloat *out1)
	Calculate 1/sqrt(x) for two SIMD floats, only targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::invSingleAccuracy (SimdFloat x)
	Calculate 1/x for SIMD float, only targeting single accuracy. More...

static SimdFloat	gmx::maskzInvSingleAccuracy (SimdFloat x, SimdFBool m)
	Calculate 1/x for masked SIMD floats, only targeting single accuracy. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdFloat gmx_simdcall	gmx::sqrtSingleAccuracy (SimdFloat x)
	Calculate sqrt(x) for SIMD float, always targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::cbrtSingleAccuracy (SimdFloat x)
	Calculate cbrt(x) for SIMD float, always targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::invcbrtSingleAccuracy (SimdFloat x)
	Calculate 1/cbrt(x) for SIMD float, always targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::log2SingleAccuracy (SimdFloat x)
	SIMD float log2(x), only targeting single accuracy. This is the base-2 logarithm. More...

static SimdFloat gmx_simdcall	gmx::logSingleAccuracy (SimdFloat x)
	SIMD float log(x), only targeting single accuracy. This is the natural logarithm. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdFloat gmx_simdcall	gmx::exp2SingleAccuracy (SimdFloat x)
	SIMD float 2^x, only targeting single accuracy. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdFloat gmx_simdcall	gmx::expSingleAccuracy (SimdFloat x)
	SIMD float e^x, only targeting single accuracy. More...

template<MathOptimization opt = MathOptimization::Safe>
static SimdFloat gmx_simdcall	gmx::powSingleAccuracy (SimdFloat x, SimdFloat y)
	SIMD pow(x,y), only targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::erfSingleAccuracy (SimdFloat x)
	SIMD float erf(x), only targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::erfcSingleAccuracy (SimdFloat x)
	SIMD float erfc(x), only targeting single accuracy. More...

static void gmx_simdcall	gmx::sinCosSingleAccuracy (SimdFloat x, SimdFloat *sinval, SimdFloat *cosval)
	SIMD float sin & cos, only targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::sinSingleAccuracy (SimdFloat x)
	SIMD float sin(x), only targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::cosSingleAccuracy (SimdFloat x)
	SIMD float cos(x), only targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::tanSingleAccuracy (SimdFloat x)
	SIMD float tan(x), only targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::asinSingleAccuracy (SimdFloat x)
	SIMD float asin(x), only targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::acosSingleAccuracy (SimdFloat x)
	SIMD float acos(x), only targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::atanSingleAccuracy (SimdFloat x)
	SIMD float atan(x), only targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::atan2SingleAccuracy (SimdFloat y, SimdFloat x)
	SIMD float atan2(y,x), only targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::pmeForceCorrectionSingleAccuracy (SimdFloat z2)
	SIMD Analytic PME force correction, only targeting single accuracy. More...

static SimdFloat gmx_simdcall	gmx::pmePotentialCorrectionSingleAccuracy (SimdFloat z2)
	SIMD Analytic PME potential correction, only targeting single accuracy. More...

static Simd4Float gmx_simdcall	gmx::invsqrtSingleAccuracy (Simd4Float x)
	Calculate 1/sqrt(x) for SIMD4 float, only targeting single accuracy. More...

template<typename T , typename TSimd , int simdWidth>
void	gmx::test::anonymous_namespace{bootstrap_loadstore.cpp}::loadStoreTester (TSimd gmx_simdcall loadFn(const T *mem), void gmx_simdcall storeFn(T *mem, TSimd), const int loadOffset, const int storeOffset)
	Generic routine to test load & store of SIMD, and check for side effects. More...

template<typename T , typename TSimd >
TSimd gmx_simdcall	gmx::test::anonymous_namespace{bootstrap_loadstore.cpp}::loadWrapper (const T *m)
	Wrapper to handle proxy objects returned by some load functions. More...

template<typename T , typename TSimd >
TSimd gmx_simdcall	gmx::test::anonymous_namespace{bootstrap_loadstore.cpp}::loadUWrapper (const T *m)
	Wrapper to handle proxy objects returned by some loadU functions. More...

::std::vector< real >	gmx::test::simdReal2Vector (SimdReal simd)
	Convert SIMD real to std::vector<real>. More...

SimdReal	gmx::test::vector2SimdReal (const std::vector< real > &v)
	Return floating-point SIMD value from std::vector<real>. More...

SimdReal	gmx::test::setSimdRealFrom3R (real r0, real r1, real r2)
	Set SIMD register contents from three real values. More...

SimdReal	gmx::test::setSimdRealFrom1R (real value)
	Set SIMD register contents from single real value. More...

std::vector< std::int32_t >	gmx::test::simdInt2Vector (SimdInt32 simd)
	Convert SIMD integer to std::vector<int>. More...

SimdInt32	gmx::test::vector2SimdInt (const std::vector< std::int32_t > &v)
	Return 32-bit integer SIMD value from std::vector<int>. More...

SimdInt32	gmx::test::setSimdIntFrom3I (int i0, int i1, int i2)
	Set SIMD register contents from three int values. More...

SimdInt32	gmx::test::setSimdIntFrom1I (int value)
	Set SIMD register contents from single integer value. More...

::std::vector< real >	gmx::test::simd4Real2Vector (Simd4Real simd4)
	Convert SIMD4 real to std::vector<real>. More...

Simd4Real	gmx::test::vector2Simd4Real (const std::vector< real > &v)
	Return floating-point SIMD4 value from std::vector<real>. More...

Simd4Real	gmx::test::setSimd4RealFrom3R (real r0, real r1, real r2)
	Set SIMD4 register contents from three real values. More...

Simd4Real	gmx::test::setSimd4RealFrom1R (real value)
	Set SIMD4 register contents from single real value. More...

static SimdFloat gmx_simdcall	gmx::iprod (SimdFloat ax, SimdFloat ay, SimdFloat az, SimdFloat bx, SimdFloat by, SimdFloat bz)
	SIMD float inner product of multiple float vectors. More...

static SimdFloat gmx_simdcall	gmx::norm2 (SimdFloat ax, SimdFloat ay, SimdFloat az)
	SIMD float norm squared of multiple vectors. More...

static void gmx_simdcall	gmx::cprod (SimdFloat ax, SimdFloat ay, SimdFloat az, SimdFloat bx, SimdFloat by, SimdFloat bz, SimdFloat *cx, SimdFloat *cy, SimdFloat *cz)
	SIMD float cross-product of multiple vectors. More...

static SimdDouble gmx_simdcall	gmx::iprod (SimdDouble ax, SimdDouble ay, SimdDouble az, SimdDouble bx, SimdDouble by, SimdDouble bz)
	SIMD double inner product of multiple double vectors. More...

static SimdDouble gmx_simdcall	gmx::norm2 (SimdDouble ax, SimdDouble ay, SimdDouble az)
	SIMD double norm squared of multiple vectors. More...

static void gmx_simdcall	gmx::cprod (SimdDouble ax, SimdDouble ay, SimdDouble az, SimdDouble bx, SimdDouble by, SimdDouble bz, SimdDouble *cx, SimdDouble *cy, SimdDouble *cz)
	SIMD double cross-product of multiple vectors. More...

static Simd4Float gmx_simdcall	gmx::norm2 (Simd4Float ax, Simd4Float ay, Simd4Float az)
	SIMD4 float norm squared of multiple vectors. More...

static Simd4Double gmx_simdcall	gmx::norm2 (Simd4Double ax, Simd4Double ay, Simd4Double az)
	SIMD4 double norm squared of multiple vectors. More...

::testing::AssertionResult	gmx::test::SimdBaseTest::compareVectorRealUlp (const char *refExpr, const char *tstExpr, const std::vector< real > &ref, const std::vector< real > &tst) const
	Compare two std::vector<real> for approximate equality. More...

::testing::AssertionResult	gmx::test::SimdTest::compareSimdRealUlp (const char *refExpr, const char *tstExpr, SimdReal ref, SimdReal tst)
	Compare two real SIMD variables for approximate equality. More...

::testing::AssertionResult	gmx::test::SimdTest::compareSimdEq (const char *refExpr, const char *tstExpr, SimdReal ref, SimdReal tst)
	Compare two real SIMD variables for exact equality. More...

::testing::AssertionResult	gmx::test::SimdTest::compareSimdEq (const char *refExpr, const char *tstExpr, SimdInt32 ref, SimdInt32 tst)
	Compare two 32-bit integer SIMD variables. More...

::testing::AssertionResult	gmx::test::Simd4Test::compareSimd4RealUlp (const char *refExpr, const char *tstExpr, Simd4Real ref, Simd4Real tst)
	Compare two real SIMD4 variables for approximate equality. More...

::testing::AssertionResult	gmx::test::Simd4Test::compareSimd4RealEq (const char *refExpr, const char *tstExpr, Simd4Real ref, Simd4Real tst)
	Compare two real SIMD4 variables for exact equality. More...

static std::vector< real >	gmx::test::SimdMathTest::generateTestPoints (Range range, std::size_t points)
	Generate test point vector. More...

::testing::AssertionResult	gmx::test::SimdMathTest::compareSimdMathFunction (const char *refFuncExpr, const char *simdFuncExpr, const char *compareSettingsExpr, real refFunc(real x), SimdReal gmx_simdcall simdFunc(SimdReal x), const CompareSettings &compareSettings)
	Implementation routine to compare SIMD vs reference functions. More...

static void	gmx::test::SimdMathTest::generateTestPointsTest ()
	Test routine for the test point vector generation.

Variables
constexpr real	gmx::test::czero = 0.0
	zero

constexpr real	gmx::test::c0 = 0.3333333333333333
	test constant 0.0 + 1.0/3.0

constexpr real	gmx::test::c1 = 1.7142857142857144
	test constant 1.0 + 5.0/7.0

constexpr real	gmx::test::c2 = 2.6923076923076925
	test constant 2.0 + 9.0/13.0

constexpr real	gmx::test::c3 = 3.8947368421052633
	test constant 3.0 + 17.0/19.0

constexpr real	gmx::test::c4 = 4.793103448275862
	test constant 4.0 + 23.0/29.0

constexpr real	gmx::test::c5 = 5.837837837837838
	test constant 5.0 + 31.0/37.0

constexpr real	gmx::test::c6 = 6.953488372093023
	test constant 6.0 + 41.0/43.0

constexpr real	gmx::test::c7 = 7.886792452830189
	test constant 7.0 + 47.0/53.0

constexpr real	gmx::test::c8 = 8.967213114754099
	test constant 8.0 + 59.0/61.0

const SimdReal	gmx::test::rSimd_c0c1c2 = setSimdRealFrom3R(c0, c1, c2)
	c0,c1,c2 repeated

const SimdReal	gmx::test::rSimd_c3c4c5 = setSimdRealFrom3R(c3, c4, c5)
	c3,c4,c5 repeated

const SimdReal	gmx::test::rSimd_c6c7c8 = setSimdRealFrom3R(c6, c7, c8)
	c6,c7,c8 repeated

const SimdReal	gmx::test::rSimd_c3c0c4 = setSimdRealFrom3R(c3, c0, c4)
	c3,c0,c4 repeated

const SimdReal	gmx::test::rSimd_c4c6c8 = setSimdRealFrom3R(c4, c6, c8)
	c4,c6,c8 repeated

const SimdReal	gmx::test::rSimd_c7c2c3 = setSimdRealFrom3R(c7, c2, c3)
	c7,c2,c3 repeated

const SimdReal	gmx::test::rSimd_m0m1m2 = setSimdRealFrom3R(-c0, -c1, -c2)
	-c0,-c1,-c2 repeated

const SimdReal	gmx::test::rSimd_m3m0m4 = setSimdRealFrom3R(-c3, -c0, -c4)
	-c3,-c0,-c4 repeated

const SimdReal	gmx::test::rSimd_2p25 = setSimdRealFrom1R(2.25)
	Value that rounds down.

const SimdReal	gmx::test::rSimd_3p25 = setSimdRealFrom1R(3.25)
	Value that rounds down.

const SimdReal	gmx::test::rSimd_3p75 = setSimdRealFrom1R(3.75)
	Value that rounds up.

const SimdReal	gmx::test::rSimd_m2p25 = setSimdRealFrom1R(-2.25)
	Negative value that rounds up.

const SimdReal	gmx::test::rSimd_m3p25 = setSimdRealFrom1R(-3.25)
	Negative value that rounds up.

const SimdReal	gmx::test::rSimd_m3p75 = setSimdRealFrom1R(-3.75)
	Negative value that rounds down. More...

const SimdReal	gmx::test::rSimd_Exp
	Three large floating-point values whose exponents are >32. More...

const SimdReal	gmx::test::rSimd_logicalA = setSimdRealFrom1R(1.3333282470703125)
	Bit pattern to test logical ops.

const SimdReal	gmx::test::rSimd_logicalB = setSimdRealFrom1R(1.79998779296875)
	Bit pattern to test logical ops.

const SimdReal	gmx::test::rSimd_logicalResultAnd = setSimdRealFrom1R(1.26666259765625)
	Result of bitwise 'and' of A and B.

const SimdReal	gmx::test::rSimd_logicalResultOr = setSimdRealFrom1R(1.8666534423828125)
	Result of bitwise 'or' of A and B.

const SimdInt32	gmx::test::iSimd_1_2_3 = setSimdIntFrom3I(1, 2, 3)
	Three generic ints.

const SimdInt32	gmx::test::iSimd_4_5_6 = setSimdIntFrom3I(4, 5, 6)
	Three generic ints.

const SimdInt32	gmx::test::iSimd_7_8_9 = setSimdIntFrom3I(7, 8, 9)
	Three generic ints.

const SimdInt32	gmx::test::iSimd_5_7_9 = setSimdIntFrom3I(5, 7, 9)
	iSimd_1_2_3 + iSimd_4_5_6.

const SimdInt32	gmx::test::iSimd_1M_2M_3M = setSimdIntFrom3I(1000000, 2000000, 3000000)
	Term1 for 32bit add/sub.

const SimdInt32	gmx::test::iSimd_4M_5M_6M = setSimdIntFrom3I(4000000, 5000000, 6000000)
	Term2 for 32bit add/sub.

const SimdInt32	gmx::test::iSimd_5M_7M_9M = setSimdIntFrom3I(5000000, 7000000, 9000000)
	iSimd_1M_2M_3M + iSimd_4M_5M_6M.

const SimdInt32	gmx::test::iSimd_0xF0F0F0F0 = setSimdIntFrom1I(0xF0F0F0F0)
	Bitpattern to test integer logical operations.

const SimdInt32	gmx::test::iSimd_0xCCCCCCCC = setSimdIntFrom1I(0xCCCCCCCC)
	Bitpattern to test integer logical operations.

const SimdReal	gmx::test::rSimd_Bits1
	Pattern F0 repeated to fill single/double.

const SimdReal	gmx::test::rSimd_Bits2
	Pattern CC repeated to fill single/double.

const SimdReal	gmx::test::rSimd_Bits3
	Pattern C0 repeated to fill single/double.

const SimdReal	gmx::test::rSimd_Bits4
	Pattern 0C repeated to fill single/double.

const SimdReal	gmx::test::rSimd_Bits5
	Pattern FC repeated to fill single/double.

const SimdReal	gmx::test::rSimd_Bits6
	Pattern 3C repeated to fill single/double.

const Simd4Real	gmx::test::rSimd4_c0c1c2 = setSimd4RealFrom3R(c0, c1, c2)
	c0,c1,c2 repeated

const Simd4Real	gmx::test::rSimd4_c3c4c5 = setSimd4RealFrom3R(c3, c4, c5)
	c3,c4,c5 repeated

const Simd4Real	gmx::test::rSimd4_c6c7c8 = setSimd4RealFrom3R(c6, c7, c8)
	c6,c7,c8 repeated

const Simd4Real	gmx::test::rSimd4_c3c0c4 = setSimd4RealFrom3R(c3, c0, c4)
	c3,c0,c4 repeated

const Simd4Real	gmx::test::rSimd4_c4c6c8 = setSimd4RealFrom3R(c4, c6, c8)
	c4,c6,c8 repeated

const Simd4Real	gmx::test::rSimd4_c7c2c3 = setSimd4RealFrom3R(c7, c2, c3)
	c7,c2,c3 repeated

const Simd4Real	gmx::test::rSimd4_m0m1m2 = setSimd4RealFrom3R(-c0, -c1, -c2)
	-c0,-c1,-c2 repeated

const Simd4Real	gmx::test::rSimd4_m3m0m4 = setSimd4RealFrom3R(-c3, -c0, -c4)
	-c3,-c0,-c4 repeated

const Simd4Real	gmx::test::rSimd4_2p25 = setSimd4RealFrom1R(2.25)
	Value that rounds down.

const Simd4Real	gmx::test::rSimd4_3p75 = setSimd4RealFrom1R(3.75)
	Value that rounds up.

const Simd4Real	gmx::test::rSimd4_m2p25 = setSimd4RealFrom1R(-2.25)
	Negative value that rounds up.

const Simd4Real	gmx::test::rSimd4_m3p75 = setSimd4RealFrom1R(-3.75)
	Negative value that rounds down. More...

const Simd4Real	gmx::test::rSimd4_logicalA = setSimd4RealFrom1R(1.3333282470703125)
	Bit pattern to test logical ops.

const Simd4Real	gmx::test::rSimd4_logicalB = setSimd4RealFrom1R(1.79998779296875)
	Bit pattern to test logical ops.

const Simd4Real	gmx::test::rSimd4_logicalResultAnd = setSimd4RealFrom1R(1.26666259765625)
	Result of bitwise 'and' of A and B.

const Simd4Real	gmx::test::rSimd4_logicalResultOr = setSimd4RealFrom1R(1.8666534423828125)
	Result of bitwise 'or' of A and B.

const Simd4Real	gmx::test::rSimd4_Exp
	Three large floating-point values whose exponents are >32.

const Simd4Real	gmx::test::rSimd4_Bits1
	Pattern F0 repeated to fill single/double.

const Simd4Real	gmx::test::rSimd4_Bits2
	Pattern CC repeated to fill single/double.

const Simd4Real	gmx::test::rSimd4_Bits3
	Pattern C0 repeated to fill single/double.

const Simd4Real	gmx::test::rSimd4_Bits4
	Pattern 0C repeated to fill single/double.

const Simd4Real	gmx::test::rSimd4_Bits5
	Pattern FC repeated to fill single/double.

const Simd4Real	gmx::test::rSimd4_Bits6
	Pattern 3C repeated to fill single/double.

static int	gmx::test::SimdBaseTest::s_nPoints = 10000
	Number of test points to use, settable on command line. More...

Directories
directory	simd
	SIMD intrinsics interface (simd)

directory	tests
	Unit tests for SIMD intrinsics interface (simd).

Files
file	impl_reference.h
	Reference SIMD implementation, including SIMD documentation.

file	impl_reference_definitions.h
	Reference SIMD implementation, including SIMD documentation.

file	impl_reference_general.h
	Reference SIMD implementation, general utility functions.

file	impl_reference_simd4_double.h
	Reference implementation, SIMD4 double precision.

file	impl_reference_simd4_float.h
	Reference implementation, SIMD4 single precision.

file	impl_reference_simd_double.h
	Reference implementation, SIMD double precision.

file	impl_reference_simd_float.h
	Reference implementation, SIMD single precision.

file	impl_reference_util_double.h
	Reference impl., higher-level double prec. SIMD utility functions.

file	impl_reference_util_float.h
	Reference impl., higher-level single prec. SIMD utility functions.

file	scalar.h
	Scalar float functions corresponding to GROMACS SIMD functions.

file	scalar_math.h
	Scalar math functions mimicking GROMACS SIMD math functions.

file	scalar_util.h
	Scalar utility functions mimicking GROMACS SIMD utility functions.

file	simd.h
	Definitions, capabilities, and wrappers for SIMD module.

file	simd_math.h
	Math functions for SIMD datatypes.

file	simd_memory.h
	Declares SimdArrayRef.

file	support.cpp
	Implements SIMD architecture support query routines.

file	support.h
	Functions to query compiled and supported SIMD architectures.

file	base.h
	Declares common base class for testing SIMD and SIMD4.

file	bootstrap_loadstore.cpp
	Separate test of SIMD load/store, before we use them in the SIMD test classes.

file	data.h
	Common test data constants for SIMD, SIMD4 and scalar tests.

file	simd.h
	Declares fixture for testing of normal SIMD (not SIMD4) functionality.

file	simd4.h
	Declares fixture for testing of SIMD4 functionality.

file	simd_memory.cpp
	Tests for gmx::ArrayRef for SIMD types.

file	vector_operations.h
	SIMD operations corresponding to Gromacs rvec datatypes.

Macro Definition Documentation

#define GMX_EXPECT_SIMD_FUNC_NEAR	(	refFunc,
		tstFunc,
		compareSettings
	)	EXPECT_PRED_FORMAT3(compareSimdMathFunction, refFunc, tstFunc, compareSettings)

Test approximate equality of SIMD vs reference version of a function.

This macro takes vanilla C and SIMD flavors of a function and tests it with the number of points, range, and tolerances specified by the test fixture class.

The third option controls the range, tolerances, and match settings.

#define GMX_EXPECT_SIMD_INT_EQ	(	ref,
		tst
	)	EXPECT_PRED_FORMAT2(compareSimdEq, ref, tst)

Macro that checks SIMD integer expression against SIMD or reference int.

If the reference argument is a scalar integer it will be expanded into the width of the SIMD register and tested against all elements.
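
The scalar expansion described above can be illustrated with a small stand-alone sketch. It does not use the GROMACS test fixture: a `std::array` stands in for a SIMD integer register, and `c_width`, `IntLanes`, `broadcast` and `lanesEqual` are hypothetical names chosen for this illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed register width for the sketch; the real width depends on the architecture.
constexpr std::size_t c_width = 4;

// Plain array standing in for a SIMD integer register.
using IntLanes = std::array<std::int32_t, c_width>;

// Expand a scalar into every element of the register-like array.
inline IntLanes broadcast(std::int32_t value)
{
    IntLanes lanes;
    lanes.fill(value);
    return lanes;
}

// Element-wise exact comparison of two register-like arrays.
inline bool lanesEqual(const IntLanes& ref, const IntLanes& tst)
{
    return ref == tst;
}

// Scalar reference: expand first, then compare against all elements.
inline bool lanesEqual(std::int32_t ref, const IntLanes& tst)
{
    return lanesEqual(broadcast(ref), tst);
}
```

With this sketch, `lanesEqual(7, tst)` succeeds only when every element of `tst` equals 7, which mirrors the documented behaviour for a scalar reference argument.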

#define GMX_SIMD4_HAVE_REAL GMX_SIMD4_HAVE_FLOAT

1 if Simd4Real is available, otherwise 0.

GMX_SIMD4_HAVE_DOUBLE if GMX_DOUBLE is 1, otherwise GMX_SIMD4_HAVE_FLOAT.

#define GMX_SIMD_HAVE_GATHER_LOADU_BYSIMDINT_TRANSPOSE_REAL GMX_SIMD_HAVE_GATHER_LOADU_BYSIMDINT_TRANSPOSE_FLOAT

1 if gmx::simdGatherLoadUBySimdIntTranspose is present, otherwise 0

GMX_SIMD_HAVE_GATHER_LOADU_BYSIMDINT_TRANSPOSE_DOUBLE if GMX_DOUBLE is 1, otherwise GMX_SIMD_HAVE_GATHER_LOADU_BYSIMDINT_TRANSPOSE_FLOAT.

#define GMX_SIMD_HAVE_HSIMD_UTIL_REAL GMX_SIMD_HAVE_HSIMD_UTIL_FLOAT

1 if real half-register load/store/reduce utils present, otherwise 0

GMX_SIMD_HAVE_HSIMD_UTIL_DOUBLE if GMX_DOUBLE is 1, otherwise GMX_SIMD_HAVE_HSIMD_UTIL_FLOAT.

#define GMX_SIMD_HAVE_INT32_ARITHMETICS GMX_SIMD_HAVE_FINT32_ARITHMETICS

1 if arithmetic ops are supported on SimdInt32, otherwise 0.

GMX_SIMD_HAVE_DINT32_ARITHMETICS if GMX_DOUBLE is 1, otherwise GMX_SIMD_HAVE_FINT32_ARITHMETICS.

#define GMX_SIMD_HAVE_INT32_EXTRACT GMX_SIMD_HAVE_FINT32_EXTRACT

1 if support is available for extracting elements from SimdInt32, otherwise 0

GMX_SIMD_HAVE_DINT32_EXTRACT if GMX_DOUBLE is 1, otherwise GMX_SIMD_HAVE_FINT32_EXTRACT.

#define GMX_SIMD_HAVE_INT32_LOGICAL GMX_SIMD_HAVE_FINT32_LOGICAL

1 if logical ops are supported on SimdInt32, otherwise 0.

GMX_SIMD_HAVE_DINT32_LOGICAL if GMX_DOUBLE is 1, otherwise GMX_SIMD_HAVE_FINT32_LOGICAL.

#define GMX_SIMD_HAVE_REAL GMX_SIMD_HAVE_FLOAT

1 if SimdReal is available, otherwise 0.

GMX_SIMD_HAVE_DOUBLE if GMX_DOUBLE is 1, otherwise GMX_SIMD_HAVE_FLOAT.

#define GMX_SIMD_REAL_WIDTH GMX_SIMD_FLOAT_WIDTH

Width of SimdReal.

GMX_SIMD_DOUBLE_WIDTH if GMX_DOUBLE is 1, otherwise GMX_SIMD_FLOAT_WIDTH.
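
The precision-dependent macros above all follow one selection pattern, sketched below. The concrete values (both precisions supported, widths 8 and 4, `GMX_DOUBLE` set to 0) are assumptions for this example only, and `c_realWidth` is a hypothetical name. In a real build the values come from the build configuration and the selected SIMD implementation.

```cpp
// Assumed settings for this sketch only.
#define GMX_DOUBLE 0
#define GMX_SIMD_HAVE_FLOAT 1
#define GMX_SIMD_HAVE_DOUBLE 1
#define GMX_SIMD_FLOAT_WIDTH 8
#define GMX_SIMD_DOUBLE_WIDTH 4

// The "real" flavor aliases the double version when GMX_DOUBLE is 1,
// otherwise the float version.
#if GMX_DOUBLE
using real = double;
#    define GMX_SIMD_HAVE_REAL GMX_SIMD_HAVE_DOUBLE
#    define GMX_SIMD_REAL_WIDTH GMX_SIMD_DOUBLE_WIDTH
#else
using real = float;
#    define GMX_SIMD_HAVE_REAL GMX_SIMD_HAVE_FLOAT
#    define GMX_SIMD_REAL_WIDTH GMX_SIMD_FLOAT_WIDTH
#endif

// Precision-independent code can size its buffers from the "real" width.
constexpr int c_realWidth = GMX_SIMD_REAL_WIDTH;
```

Code written against the `REAL` macros then compiles unchanged for either precision.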

Typedef Documentation

typedef SimdTest gmx::test::anonymous_namespace{simd_vector_operations.cpp}::SimdVectorOperationsTest

Test fixture for vector operations tests (identical to the generic SimdTest)

Function Documentation

static Simd4Float gmx_simdcall gmx::abs ( Simd4Float a )

inlinestatic

SIMD4 Floating-point fabs().

Parameters

a	any floating point values

Returns: fabs(a) for each element.

static Simd4Double gmx_simdcall gmx::abs ( Simd4Double a )

inlinestatic

SIMD4 Floating-point abs().

Parameters

a	any floating point values

Returns: fabs(a) for each element.

static SimdFloat gmx_simdcall gmx::abs ( SimdFloat a )

inlinestatic

SIMD float floating-point abs().

Parameters

a	any floating point values

Returns: abs(a) for each element.

static SimdDouble gmx_simdcall gmx::abs ( SimdDouble a )

inlinestatic

SIMD double floating-point fabs().

Parameters

a	any floating point values

Returns: fabs(a) for each element.

static SimdFloat gmx_simdcall gmx::acos ( SimdFloat x )

inlinestatic

SIMD float acos(x).

Parameters

x	The argument to evaluate acos for

Returns: Acos(x)

static SimdDouble gmx_simdcall gmx::acos ( SimdDouble x )

inlinestatic

SIMD double acos(x).

Parameters

x	The argument to evaluate acos for

Returns: Acos(x)

static SimdDouble gmx_simdcall gmx::acosSingleAccuracy ( SimdDouble x )

inlinestatic

SIMD acos(x). Double precision SIMD data, single accuracy.

Parameters

x	The argument to evaluate acos for

Returns: Acos(x)

static SimdFloat gmx_simdcall gmx::acosSingleAccuracy ( SimdFloat x )

inlinestatic

SIMD float acos(x), only targeting single accuracy.

Parameters

x	The argument to evaluate acos for

Returns: Acos(x)

static Simd4Double gmx_simdcall gmx::andNot	(	Simd4Double	a,
		Simd4Double	b
	)

inlinestatic

Bitwise andnot for two SIMD4 double variables. c=(~a) & b.

Available if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: (~data1) & data2

static Simd4Float gmx_simdcall gmx::andNot	(	Simd4Float	a,
		Simd4Float	b
	)

inlinestatic

Bitwise andnot for two SIMD4 float variables. c=(~a) & b.

Available if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: (~data1) & data2

static SimdFloat gmx_simdcall gmx::andNot	(	SimdFloat	a,
		SimdFloat	b
	)

inlinestatic

Bitwise andnot for SIMD float.

Available if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: (~data1) & data2

static SimdDouble gmx_simdcall gmx::andNot	(	SimdDouble	a,
		SimdDouble	b
	)

inlinestatic

Bitwise andnot for SIMD double.

Available if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: (~data1) & data2
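
As a scalar illustration of the bitwise operations on floating-point data, the sketch below applies them to the raw 32-bit pattern of a single float. It is not the GROMACS implementation; `toBits`, `fromBits`, `bitAnd`, `bitOr` and `bitAndNot` are hypothetical helper names. The values used in the usage note are the logical test constants listed under Variables (rSimd_logicalA and rSimd_logicalB).

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a float as its 32-bit pattern; memcpy avoids aliasing problems.
inline std::uint32_t toBits(float f)
{
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    return u;
}

// Reinterpret a 32-bit pattern as a float.
inline float fromBits(std::uint32_t u)
{
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
}

inline float bitAnd(float a, float b)
{
    return fromBits(toBits(a) & toBits(b));
}

inline float bitOr(float a, float b)
{
    return fromBits(toBits(a) | toBits(b));
}

// Operand order matters: the FIRST argument is complemented, (~a) & b.
inline float bitAndNot(float a, float b)
{
    return fromBits(~toBits(a) & toBits(b));
}
```

In single precision, `bitAnd(1.3333282470703125f, 1.79998779296875f)` gives 1.26666259765625 and `bitOr` of the same operands gives 1.8666534423828125, matching the rSimd_logicalResultAnd and rSimd_logicalResultOr constants.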

static SimdFInt32 gmx_simdcall gmx::andNot	(	SimdFInt32	a,
		SimdFInt32	b
	)

inlinestatic

Integer SIMD bitwise andnot, (~a) & b.

Available if GMX_SIMD_HAVE_FINT32_LOGICAL is 1.

Note: You cannot use this operation directly to select based on a boolean SIMD variable, since booleans are separate from integer SIMD. If that is what you need, have a look at gmx::selectByMask instead.

Parameters

a	integer SIMD
b	integer SIMD

Returns: (~a) & b

static SimdDInt32 gmx_simdcall gmx::andNot	(	SimdDInt32	a,
		SimdDInt32	b
	)

inlinestatic

Integer SIMD bitwise andnot, (~a) & b.

Available if GMX_SIMD_HAVE_DINT32_LOGICAL is 1.

Note: You cannot use this operation directly to select based on a boolean SIMD variable, since booleans are separate from integer SIMD. If that is what you need, have a look at gmx::selectByMask instead.

Parameters

a	integer SIMD
b	integer SIMD

Returns: (~a) & b

static bool gmx_simdcall gmx::anyTrue ( Simd4FBool a )

inlinestatic

Returns non-zero if any of the booleans in SIMD4 a is true, otherwise 0.

Parameters

a	Logical variable.

Returns: true if any element in a is true, otherwise false.

The actual return value for truth will depend on the architecture, so any non-zero value is considered truth.

static bool gmx_simdcall gmx::anyTrue ( Simd4DBool a )

inlinestatic

Returns non-zero if any of the booleans in SIMD4 a is true, otherwise 0.

Parameters

a	Logical variable.

Returns: true if any element in a is true, otherwise false.

The actual return value for truth will depend on the architecture, so any non-zero value is considered truth.

static bool gmx_simdcall gmx::anyTrue ( SimdFBool a )

inlinestatic

Returns non-zero if any of the booleans in SIMD a is true, otherwise 0.

Parameters

a	Logical variable.

Returns: true if any element in a is true, otherwise false.

The actual return value for truth will depend on the architecture, so any non-zero value is considered truth.

static bool gmx_simdcall gmx::anyTrue ( SimdDBool a )

inlinestatic

Returns non-zero if any of the booleans in SIMD a is true, otherwise 0.

Parameters

a	Logical variable.

Returns: true if any element in a is true, otherwise false.

The actual return value for truth will depend on the architecture, so any non-zero value is considered truth.

static bool gmx_simdcall gmx::anyTrue ( SimdFIBool a )

inlinestatic

Returns non-zero if any of the booleans in a is true, otherwise 0.

Available if GMX_SIMD_HAVE_FINT32_ARITHMETICS is 1.

The actual return value for "any true" will depend on the architecture. Any non-zero value should be considered truth.

Parameters

a	SIMD boolean

Returns: True if any of the elements in a is true, otherwise 0.

static bool gmx_simdcall gmx::anyTrue ( SimdDIBool a )

inlinestatic

Returns non-zero if any of the booleans in a is true, otherwise 0.

Available if GMX_SIMD_HAVE_DINT32_ARITHMETICS is 1.

The actual return value for "any true" will depend on the architecture. Any non-zero value should be considered truth.

Parameters

a	SIMD boolean

Returns: True if any of the elements in a is true, otherwise 0.
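
The reduction performed by the anyTrue overloads can be sketched in scalar form as follows. A `std::array<bool, N>` stands in for a SIMD boolean here, and `anyTrueLanes` is a hypothetical name; the real types are opaque and architecture-specific.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Reduce a register-like array of booleans to a single truth value.
template<std::size_t N>
bool anyTrueLanes(const std::array<bool, N>& mask)
{
    return std::any_of(mask.begin(), mask.end(), [](bool lane) { return lane; });
}
```

A typical use of such a reduction is to skip an expensive code path when no element needs it, for example testing `anyTrueLanes(mask)` before handling the masked elements.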

static SimdFloat gmx_simdcall gmx::asin ( SimdFloat x )

inlinestatic

SIMD float asin(x).

Parameters

x	The argument to evaluate asin for

Returns: Asin(x)

static SimdDouble gmx_simdcall gmx::asin ( SimdDouble x )

inlinestatic

SIMD double asin(x).

Parameters

x	The argument to evaluate asin for

Returns: Asin(x)

static SimdDouble gmx_simdcall gmx::asinSingleAccuracy ( SimdDouble x )

inlinestatic

SIMD asin(x). Double precision SIMD data, single accuracy.

Parameters

x	The argument to evaluate asin for

Returns: Asin(x)

static SimdFloat gmx_simdcall gmx::asinSingleAccuracy ( SimdFloat x )

inlinestatic

SIMD float asin(x), only targeting single accuracy.

Parameters

x	The argument to evaluate asin for

Returns: Asin(x)

static SimdFloat gmx_simdcall gmx::atan ( SimdFloat x )

inlinestatic

SIMD float atan(x).

Parameters

x	The argument to evaluate atan for

Returns: Atan(x), same argument/value range as standard math library.

static SimdDouble gmx_simdcall gmx::atan ( SimdDouble x )

inlinestatic

SIMD double atan(x).

Parameters

x	The argument to evaluate atan for

Returns: Atan(x), same argument/value range as standard math library.

static SimdFloat gmx_simdcall gmx::atan2	(	SimdFloat	y,
		SimdFloat	x
	)

inlinestatic

SIMD float atan2(y,x).

Parameters

y	Y component of vector, any quadrant
x	X component of vector, any quadrant

Returns: Atan2(y,x), same argument/value range as standard math library.

Note: This routine should provide correct results for all finite non-zero or positive-zero arguments. However, negative zero arguments will be treated as positive zero, which means the return value will deviate from the standard math library atan2(y,x) for those cases. That should not be of any concern in Gromacs, and in particular it will not affect calculations of angles from vectors.

static SimdDouble gmx_simdcall gmx::atan2	(	SimdDouble	y,
		SimdDouble	x
	)

inlinestatic

SIMD double atan2(y,x).

Parameters

y	Y component of vector, any quadrant
x	X component of vector, any quadrant

Returns: Atan2(y,x), same argument/value range as standard math library.

Note: This routine should provide correct results for all finite non-zero or positive-zero arguments. However, negative zero arguments will be treated as positive zero, which means the return value will deviate from the standard math library atan2(y,x) for those cases. That should not be of any concern in Gromacs, and in particular it will not affect calculations of angles from vectors.

static SimdDouble gmx_simdcall gmx::atan2SingleAccuracy	(	SimdDouble	y,
		SimdDouble	x
	)

inlinestatic

SIMD atan2(y,x). Double precision SIMD data, single accuracy.

Parameters

y	Y component of vector, any quadrant
x	X component of vector, any quadrant

Returns: Atan2(y,x), same argument/value range as standard math library.

Note: This routine should provide correct results for all finite non-zero or positive-zero arguments. However, negative zero arguments will be treated as positive zero, which means the return value will deviate from the standard math library atan2(y,x) for those cases. That should not be of any concern in Gromacs, and in particular it will not affect calculations of angles from vectors.

static SimdFloat gmx_simdcall gmx::atan2SingleAccuracy	(	SimdFloat	y,
		SimdFloat	x
	)

inlinestatic

SIMD float atan2(y,x), only targeting single accuracy.

Parameters

y	Y component of vector, any quadrant
x	X component of vector, any quadrant

Returns: Atan2(y,x), same argument/value range as standard math library.

Note: This routine should provide correct results for all finite non-zero or positive-zero arguments. However, negative zero arguments will be treated as positive zero, which means the return value will deviate from the standard math library atan2(y,x) for those cases. That should not be of any concern in Gromacs, and in particular it will not affect calculations of angles from vectors.
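
The effect of treating negative zero as positive zero can be shown with a scalar sketch. This is an illustration of the documented behaviour, not the GROMACS implementation, and `atan2NegZeroAsPosZero` is a hypothetical name. On IEEE 754 conforming implementations, `std::atan2(-0.0, -1.0)` returns -pi, whereas the variant below returns +pi for the same arguments.

```cpp
#include <cmath>

// Scalar stand-in: replace a zero of either sign with +0.0 before calling std::atan2.
inline double atan2NegZeroAsPosZero(double y, double x)
{
    // -0.0 == 0.0 compares true, so both zeros take this branch.
    if (y == 0.0)
    {
        y = 0.0;
    }
    if (x == 0.0)
    {
        x = 0.0;
    }
    return std::atan2(y, x);
}
```

For all non-zero finite arguments the two functions agree, so the difference only shows up on the branch cut along the negative x axis and at the origin.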

static SimdDouble gmx_simdcall gmx::atanSingleAccuracy ( SimdDouble x )

inlinestatic

SIMD atan(x). Double precision SIMD data, single accuracy.

Parameters

x	The argument to evaluate atan for

Returns: Atan(x), same argument/value range as standard math library.

static SimdFloat gmx_simdcall gmx::atanSingleAccuracy ( SimdFloat x )

inlinestatic

SIMD float atan(x), only targeting single accuracy.

Parameters

x	The argument to evaluate atan for

Returns: Atan(x), same argument/value range as standard math library.

static Simd4Float gmx_simdcall gmx::blend	(	Simd4Float	a,
		Simd4Float	b,
		Simd4FBool	sel
	)

inlinestatic

Vector-blend SIMD4 selection.

Parameters

a	First source
b	Second source
sel	Boolean selector

Returns: For each element, select b if sel is true, a otherwise.

static Simd4Double gmx_simdcall gmx::blend	(	Simd4Double	a,
		Simd4Double	b,
		Simd4DBool	sel
	)

inlinestatic

Vector-blend SIMD4 selection.

Parameters

a	First source
b	Second source
sel	Boolean selector

Returns: For each element, select b if sel is true, a otherwise.

static SimdFloat gmx_simdcall gmx::blend	(	SimdFloat	a,
		SimdFloat	b,
		SimdFBool	sel
	)

inlinestatic

Vector-blend SIMD float selection.

Parameters

a	First source
b	Second source
sel	Boolean selector

Returns: For each element, select b if sel is true, a otherwise.

static SimdDouble gmx_simdcall gmx::blend	(	SimdDouble	a,
		SimdDouble	b,
		SimdDBool	sel
	)

inlinestatic

Vector-blend SIMD double selection.

Parameters

a	First source
b	Second source
sel	Boolean selector

Returns: For each element, select b if sel is true, a otherwise.

static SimdFInt32 gmx_simdcall gmx::blend	(	SimdFInt32	a,
		SimdFInt32	b,
		SimdFIBool	sel
	)

inlinestatic

Vector-blend SIMD integer selection.

Available if GMX_SIMD_HAVE_FINT32_ARITHMETICS is 1.

Parameters

a	First source
b	Second source
sel	Boolean selector

Returns: For each element, select b if sel is true, a otherwise.

static SimdDInt32 gmx_simdcall gmx::blend	(	SimdDInt32	a,
		SimdDInt32	b,
		SimdDIBool	sel
	)

inlinestatic

Vector-blend SIMD integer selection.

Available if GMX_SIMD_HAVE_DINT32_ARITHMETICS is 1.

Parameters

a	First source
b	Second source
sel	Boolean selector

Returns: For each element, select b if sel is true, a otherwise.
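
The selection rule shared by all blend overloads (b where sel is true, a otherwise) can be written as a scalar sketch. `blendLanes` is a hypothetical name, and `std::array` stands in for the SIMD data and boolean types.

```cpp
#include <array>
#include <cstddef>

// Element-wise selection: take b[i] where sel[i] is true, a[i] otherwise.
template<typename T, std::size_t N>
std::array<T, N> blendLanes(const std::array<T, N>&    a,
                            const std::array<T, N>&    b,
                            const std::array<bool, N>& sel)
{
    std::array<T, N> result{};
    for (std::size_t i = 0; i < N; ++i)
    {
        result[i] = sel[i] ? b[i] : a[i];
    }
    return result;
}
```

Note the argument order: the value chosen for a true selector is the second source, not the first.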

static SimdFloat gmx_simdcall gmx::cbrt ( SimdFloat x )

inlinestatic

Cube root for SIMD floats.

Parameters

x	Argument to calculate cube root of. Can be negative or zero, but NaN or Inf values are not supported. Denormal values will be treated as 0.0.

Returns: Cube root of x.

static SimdDouble gmx_simdcall gmx::cbrt ( SimdDouble x )

inlinestatic

Cube root for SIMD doubles.

Parameters

x	Argument to calculate cube root of. Can be negative or zero, but NaN or Inf values are not supported. Denormal values will be treated as 0.0.

Returns: Cube root of x.

static SimdDouble gmx_simdcall gmx::cbrtSingleAccuracy ( SimdDouble x )

inlinestatic

Cube root for SIMD doubles, single accuracy.

Parameters

x	Argument to calculate cube root of. Can be negative or zero, but NaN or Inf values are not supported. Denormal values will be treated as 0.0.

Returns: Cube root of x.

static SimdFloat gmx_simdcall gmx::cbrtSingleAccuracy ( SimdFloat x )

inlinestatic

Calculate cbrt(x) for SIMD float, always targeting single accuracy.

Parameters

x	Argument to calculate cube root of. Can be negative or zero, but NaN or Inf values are not supported. Denormal values will be treated as 0.0.

Returns: Cube root of x.
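
A scalar reference with the same documented input handling (negative and zero arguments accepted, denormals treated as 0.0) might look like the sketch below. `cbrtFlushDenormal` is a hypothetical name, and NaN or Inf inputs are deliberately not handled, as in the SIMD routines.

```cpp
#include <cmath>
#include <limits>

// Scalar cube root that treats denormal (subnormal) inputs as zero.
inline float cbrtFlushDenormal(float x)
{
    // numeric_limits<float>::min() is the smallest positive normal value.
    if (std::fabs(x) < std::numeric_limits<float>::min())
    {
        return 0.0f;
    }
    return std::cbrt(x);
}
```

Such a function could serve as the reference side when comparing a SIMD cube root over a test range that includes very small magnitudes.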

testing::AssertionResult gmx::test::Simd4Test::compareSimd4RealEq	(	const char *	refExpr,
		const char *	tstExpr,
		Simd4Real	ref,
		Simd4Real	tst
	)

Compare two real SIMD4 variables for exact equality.

This is an internal implementation routine. You should always use GMX_EXPECT_SIMD4_REAL_NEAR() instead.

This routine is designed according to the Google test specs, so the char strings will describe the arguments to the macro.

The comparison is applied to each element, and it returns true if each element in the SIMD4 test variable is within the class tolerances of the corresponding reference element.

testing::AssertionResult gmx::test::Simd4Test::compareSimd4RealUlp	(	const char *	refExpr,
		const char *	tstExpr,
		Simd4Real	ref,
		Simd4Real	tst
	)

Compare two real SIMD4 variables for approximate equality.

This is an internal implementation routine. You should always use GMX_EXPECT_SIMD4_REAL_NEAR() instead.

This routine is designed according to the Google test specs, so the char strings will describe the arguments to the macro.

The comparison is applied to each element, and it returns true if each element in the SIMD4 test variable is within the class tolerances of the corresponding reference element.

testing::AssertionResult gmx::test::SimdTest::compareSimdEq	(	const char *	refExpr,
		const char *	tstExpr,
		SimdReal	ref,
		SimdReal	tst
	)

Compare two real SIMD variables for exact equality.

This is an internal implementation routine. You should always use GMX_EXPECT_SIMD_REAL_NEAR() instead.

This routine is designed according to the Google test specs, so the char strings will describe the arguments to the macro.

The comparison is applied to each element, and it returns true if each element in the SIMD test variable is within the class tolerances of the corresponding reference element.

testing::AssertionResult gmx::test::SimdTest::compareSimdEq	(	const char *	refExpr,
		const char *	tstExpr,
		SimdInt32	ref,
		SimdInt32	tst
	)

Compare two 32-bit integer SIMD variables.

This is an internal implementation routine. You should always use GMX_EXPECT_SIMD_INT_EQ() instead.

This routine is designed according to the Google test specs, so the char strings will describe the arguments to the macro, while the SIMD and tolerance arguments are used to decide if the values are approximately equal.

The comparison is applied to each element, and it returns true if each element in the SIMD variable tst is identical to the corresponding reference element.

testing::AssertionResult gmx::test::SimdMathTest::compareSimdMathFunction	(	const char *	refFuncExpr,
		const char *	simdFuncExpr,
		const char *	compareSettingsExpr,
		real	refFunc(real x),
		SimdReal gmx_simdcall	simdFunc(SimdReal x),
		const CompareSettings &	compareSettings
	)

Implementation routine to compare SIMD vs reference functions.

Parameters

refFuncExpr	Description of reference function expression
simdFuncExpr	Description of SIMD function expression
compareSettingsExpr	Description of compareSettings
refFunc	Reference math function pointer
simdFunc	SIMD math function pointer
compareSettings	Structure with the range, tolerances, and matching rules to use for the comparison.

Note: You should never call this function directly, but use the macro GMX_EXPECT_SIMD_FUNC_NEAR(refFunc,tstFunc,matchRule) instead.

testing::AssertionResult gmx::test::SimdTest::compareSimdRealUlp	(	const char *	refExpr,
		const char *	tstExpr,
		SimdReal	ref,
		SimdReal	tst
	)

Compare two real SIMD variables for approximate equality.

This is an internal implementation routine. You should always use GMX_EXPECT_SIMD_REAL_NEAR() instead.

This routine is designed according to the Google test specs, so the char strings will describe the arguments to the macro.

The comparison is applied to each element, and it returns true if each element in the SIMD test variable is within the class tolerances of the corresponding reference element.

testing::AssertionResult gmx::test::SimdBaseTest::compareVectorRealUlp	(	const char *	refExpr,
		const char *	tstExpr,
		const std::vector< real > &	ref,
		const std::vector< real > &	tst
	)		const

Compare two std::vector<real> for approximate equality.

This is an internal implementation routine that will be used by routines in derived child classes that first convert SIMD or SIMD4 variables to std::vector<real>. Do not call it directly.

This routine is designed according to the Google test specs, so the char strings will describe the arguments to the macro.

The comparison is applied to each element, and it returns true if each element in the vector test variable is within the class tolerances of the corresponding reference elements.
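As an illustration of what an element-wise ulp comparison can look like, here is a minimal standalone sketch for single precision. The helper names (orderedBits, ulpDistance, vectorsWithinUlp) and the explicit maxUlps argument are assumptions made for this example; the test fixture itself takes its tolerances from the class settings and may differ in detail.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical helper: map a float onto an ordered integer scale so that
// adjacent representable values differ by exactly 1.
std::int64_t orderedBits(float value)
{
    std::int32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    // Negative floats are stored as sign-magnitude; mirror them below zero.
    return (bits < 0) ? -static_cast<std::int64_t>(bits & 0x7FFFFFFF) : bits;
}

// Hypothetical helper: distance between two finite floats in units in the last place.
std::int64_t ulpDistance(float a, float b)
{
    assert(std::isfinite(a) && std::isfinite(b));
    return std::llabs(orderedBits(a) - orderedBits(b));
}

// Returns true only if every element of tst is within maxUlps of the
// corresponding element of ref.
bool vectorsWithinUlp(const std::vector<float>& ref, const std::vector<float>& tst, std::int64_t maxUlps)
{
    if (ref.size() != tst.size())
    {
        return false;
    }
    for (std::size_t i = 0; i < ref.size(); i++)
    {
        if (ulpDistance(ref[i], tst[i]) > maxUlps)
        {
            return false;
        }
    }
    return true;
}
```

A single failing element makes the whole comparison fail, which matches the "each element" wording above.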

static SimdFloat gmx_simdcall gmx::copysign	(	SimdFloat	x,
		SimdFloat	y
	)

inline static

Composes floating point value with the magnitude of x and the sign of y.

Parameters

x	Values to set sign for
y	Values used to set sign

Returns: Magnitude of x, sign of y

static SimdDouble gmx_simdcall gmx::copysign	(	SimdDouble	x,
		SimdDouble	y
	)

inline static

Composes floating point value with the magnitude of x and the sign of y.

Parameters

x	Values to set sign for
y	Values used to set sign

Returns: Magnitude of x, sign of y

static SimdFloat gmx_simdcall gmx::cos ( SimdFloat x )

inline static

SIMD float cos(x).

Parameters

x	The argument to evaluate cos for

Returns: Cos(x)

Attention: Do NOT call both sin & cos if you need both results, since each of them will then call sincos and waste a factor 2 in performance.

static SimdDouble gmx_simdcall gmx::cos ( SimdDouble x )

inline static

SIMD double cos(x).

Parameters

x	The argument to evaluate cos for

Returns: Cos(x)

Attention: Do NOT call both sin & cos if you need both results, since each of them will then call sincos and waste a factor 2 in performance.

static SimdDouble gmx_simdcall gmx::cosSingleAccuracy ( SimdDouble x )

inline static

SIMD cos(x). Double precision SIMD data, single accuracy.

Parameters

x	The argument to evaluate cos for

Returns: Cos(x)

Attention: Do NOT call both sin & cos if you need both results, since each of them will then call sincos and waste a factor 2 in performance.

static SimdFloat gmx_simdcall gmx::cosSingleAccuracy ( SimdFloat x )

inline static

SIMD float cos(x), only targeting single accuracy.

Parameters

x	The argument to evaluate cos for

Returns: Cos(x)

Attention: Do NOT call both sin & cos if you need both results, since each of them will then call sincos and waste a factor 2 in performance.

static void gmx_simdcall gmx::cprod	(	SimdFloat	ax,
		SimdFloat	ay,
		SimdFloat	az,
		SimdFloat	bx,
		SimdFloat	by,
		SimdFloat	bz,
		SimdFloat *	cx,
		SimdFloat *	cy,
		SimdFloat *	cz
	)

inline static

SIMD float cross-product of multiple vectors.

Parameters

	ax	X components of first vectors
	ay	Y components of first vectors
	az	Z components of first vectors
	bx	X components of second vectors
	by	Y components of second vectors
	bz	Z components of second vectors
[out]	cx	X components of cross product vectors
[out]	cy	Y components of cross product vectors
[out]	cz	Z components of cross product vectors

Returns: void

This calculates C = A x B, where the cross denotes the cross product. The arguments x/y/z denote the different components, and each SIMD element corresponds to a separate vector.
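The component-wise layout can be sketched without any SIMD hardware by treating a SIMD variable as a small array of lanes. The names below (kWidth, Lanes, crossProductLanes) are hypothetical stand-ins for this example and are not part of the GROMACS API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Stand-in for a SIMD register: kWidth independent lanes (hypothetical width).
constexpr std::size_t kWidth = 4;
using Lanes = std::array<float, kWidth>;

// Lane-wise cross product C = A x B; lane i holds the components of vector i.
void crossProductLanes(const Lanes& ax, const Lanes& ay, const Lanes& az,
                       const Lanes& bx, const Lanes& by, const Lanes& bz,
                       Lanes* cx, Lanes* cy, Lanes* cz)
{
    assert(cx != nullptr && cy != nullptr && cz != nullptr);
    for (std::size_t i = 0; i < kWidth; i++)
    {
        (*cx)[i] = ay[i] * bz[i] - az[i] * by[i];
        (*cy)[i] = az[i] * bx[i] - ax[i] * bz[i];
        (*cz)[i] = ax[i] * by[i] - ay[i] * bx[i];
    }
}
```

Each loop iteration corresponds to one SIMD lane, so a real SIMD implementation would evaluate all lanes with the same three multiply-subtract expressions.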

static void gmx_simdcall gmx::cprod	(	SimdDouble	ax,
		SimdDouble	ay,
		SimdDouble	az,
		SimdDouble	bx,
		SimdDouble	by,
		SimdDouble	bz,
		SimdDouble *	cx,
		SimdDouble *	cy,
		SimdDouble *	cz
	)

inline static

SIMD double cross-product of multiple vectors.

Parameters

	ax	X components of first vectors
	ay	Y components of first vectors
	az	Z components of first vectors
	bx	X components of second vectors
	by	Y components of second vectors
	bz	Z components of second vectors
[out]	cx	X components of cross product vectors
[out]	cy	Y components of cross product vectors
[out]	cz	Z components of cross product vectors

Returns: void

This calculates C = A x B, where the cross denotes the cross product. The arguments x/y/z denote the different components, and each SIMD element corresponds to a separate vector.

static SimdFIBool gmx_simdcall gmx::cvtB2IB ( SimdFBool a )

inline static

Convert from single precision boolean to corresponding integer boolean.

Parameters

a	SIMD floating-point boolean

Returns: SIMD integer boolean

static SimdDIBool gmx_simdcall gmx::cvtB2IB ( SimdDBool a )

inline static

Convert from double precision boolean to corresponding integer boolean.

Parameters

a	SIMD floating-point boolean

Returns: SIMD integer boolean

static SimdFloat gmx_simdcall gmx::cvtD2F ( SimdDouble gmx_unused d )

inline static

Convert SIMD double to float.

This version is available if GMX_SIMD_FLOAT_WIDTH is identical to GMX_SIMD_DOUBLE_WIDTH.

Float/double conversions are complex since the SIMD width could either be different (e.g. on x86) or identical (e.g. IBM QPX). This means you will need to check for the width in the code, and have different code paths.

Parameters

d	Double-precision SIMD variable

Returns: Single-precision SIMD variable of the same width

static SimdFloat gmx_simdcall gmx::cvtDD2F	(	SimdDouble gmx_unused	d0,
		SimdDouble gmx_unused	d1
	)

inline static

Convert SIMD double to float.

This version is available if GMX_SIMD_FLOAT_WIDTH is twice as large as GMX_SIMD_DOUBLE_WIDTH.

Float/double conversions are complex since the SIMD width could either be different (e.g. on x86) or identical (e.g. IBM QPX). This means you will need to check for the width in the code, and have different code paths.

Parameters

d0	Double-precision SIMD variable, first half of values to put in f.
d1	Double-precision SIMD variable, second half of values to put in f.

Returns: Single-precision SIMD variable with all values.
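To make the "twice as wide" case concrete, the following sketch models SIMD variables as plain arrays. The constants kFloatWidth and kDoubleWidth are hypothetical stand-ins for GMX_SIMD_FLOAT_WIDTH and GMX_SIMD_DOUBLE_WIDTH, and narrowTwoDoubles is an illustrative name, not a GROMACS function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical widths standing in for GMX_SIMD_FLOAT_WIDTH and GMX_SIMD_DOUBLE_WIDTH.
constexpr std::size_t kFloatWidth  = 8;
constexpr std::size_t kDoubleWidth = 4;
static_assert(kFloatWidth == 2 * kDoubleWidth, "this sketch only covers the twice-as-wide case");

using FloatLanes  = std::array<float, kFloatWidth>;
using DoubleLanes = std::array<double, kDoubleWidth>;

// Narrow two double registers into one float register:
// d0 fills the first half of the result, d1 the second half.
FloatLanes narrowTwoDoubles(const DoubleLanes& d0, const DoubleLanes& d1)
{
    FloatLanes f{};
    for (std::size_t i = 0; i < kDoubleWidth; i++)
    {
        f[i]                = static_cast<float>(d0[i]);
        f[i + kDoubleWidth] = static_cast<float>(d1[i]);
    }
    return f;
}
```

When the two widths are identical, the single-argument conversion applies instead, so calling code needs one path per case.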

static SimdDouble gmx_simdcall gmx::cvtF2D ( SimdFloat gmx_unused f )

inline static

Convert SIMD float to double.

This version is available if GMX_SIMD_FLOAT_WIDTH is identical to GMX_SIMD_DOUBLE_WIDTH.

Float/double conversions are complex since the SIMD width could either be different (e.g. on x86) or identical (e.g. IBM QPX). This means you will need to check for the width in the code, and have different code paths.

Parameters

f	Single-precision SIMD variable

Returns: Double-precision SIMD variable of the same width

static void gmx_simdcall gmx::cvtF2DD	(	SimdFloat gmx_unused	f,
		SimdDouble gmx_unused *	d0,
		SimdDouble gmx_unused *	d1
	)

inline static

Convert SIMD float to double.

This version is available if GMX_SIMD_FLOAT_WIDTH is twice as large as GMX_SIMD_DOUBLE_WIDTH.

Float/double conversions are complex since the SIMD width could either be different (e.g. on x86) or identical (e.g. IBM QPX). This means you will need to check for the width in the code, and have different code paths.

Parameters

	f	Single-precision SIMD variable
[out]	d0	Double-precision SIMD variable, first half of values from f.
[out]	d1	Double-precision SIMD variable, second half of values from f.

static SimdFloat gmx_simdcall gmx::cvtI2R ( SimdFInt32 a )

inline static

Convert integer to single precision floating point.

Parameters

a	SIMD integer

Returns: SIMD floating-point

static SimdDouble gmx_simdcall gmx::cvtI2R ( SimdDInt32 a )

inline static

Convert integer to double precision floating point.

Parameters

a	SIMD integer

Returns: SIMD floating-point

static SimdFBool gmx_simdcall gmx::cvtIB2B ( SimdFIBool a )

inline static

Convert from integer boolean to corresponding single precision boolean.

Parameters

a	SIMD integer boolean

Returns: SIMD floating-point boolean

static SimdDBool gmx_simdcall gmx::cvtIB2B ( SimdDIBool a )

inline static

Convert from integer boolean to corresponding double precision boolean.

Parameters

a	SIMD integer boolean

Returns: SIMD floating-point boolean

static SimdFInt32 gmx_simdcall gmx::cvtR2I ( SimdFloat a )

inline static

Round single precision floating point to integer.

Parameters

a	SIMD floating-point

Returns: SIMD integer, rounded to nearest integer.

Note: Round mode is implementation defined. The only guarantee is that it is consistent between rounding functions (round, cvtR2I).

static SimdDInt32 gmx_simdcall gmx::cvtR2I ( SimdDouble a )

inline static

Round double precision floating point to integer.

Parameters

a	SIMD floating-point

Returns: SIMD integer, rounded to nearest integer.

Note: Round mode is implementation defined. The only guarantee is that it is consistent between rounding functions (round, cvtR2I).

static SimdFInt32 gmx_simdcall gmx::cvttR2I ( SimdFloat a )

inline static

Truncate single precision floating point to integer.

Parameters

a	SIMD floating-point

Returns: SIMD integer, truncated towards zero.
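The difference between the rounding and truncating conversions is easiest to see on scalars. The sketch below uses standard library calls as stand-ins; roundToInt and truncateToInt are hypothetical names, and std::nearbyint follows the current C++ rounding mode, which need not match the implementation-defined SIMD rounding mode exactly for halfway cases.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Scalar stand-in for the rounding conversion: round to an integer using the
// current floating-point rounding mode (round-to-nearest by default).
std::int32_t roundToInt(float a)
{
    return static_cast<std::int32_t>(std::nearbyint(a));
}

// Scalar stand-in for the truncating conversion: discard the fractional part,
// that is, round towards zero.
std::int32_t truncateToInt(float a)
{
    return static_cast<std::int32_t>(std::trunc(a));
}
```

For 2.7 the two functions give 3 and 2, and for -2.7 they give -3 and -2, so truncation never moves a value away from zero.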

static SimdDInt32 gmx_simdcall gmx::cvttR2I ( SimdDouble a )

inline static

Truncate double precision floating point to integer.

Parameters

a	SIMD floating-point

Returns: SIMD integer, truncated towards zero.

static void gmx_simdcall gmx::decr3Hsimd	(	double *	m,
		SimdDouble	a0,
		SimdDouble	a1,
		SimdDouble	a2
	)

inline static

Add the two halves of three SIMD doubles, subtract the sum from three half-SIMD-width consecutive doubles in memory.

Parameters

m	half-width aligned memory, from which sum of the halves will be subtracted.
a0	SIMD variable. The sum of its upper and lower halves is subtracted from the first half-width block of m.
a1	SIMD variable. The sum of its upper and lower halves is subtracted from the second half-width block of m.
a2	SIMD variable. The sum of its upper and lower halves is subtracted from the third half-width block of m.

If the SIMD width is 8 and the vectors contain [a0 b0 c0 d0 e0 f0 g0 h0], [a1 b1 c1 d1 e1 f1 g1 h1] and [a2 b2 c2 d2 e2 f2 g2 h2], the memory will be modified to [m[0]-(a0+e0) m[1]-(b0+f0) m[2]-(c0+g0) m[3]-(d0+h0) m[4]-(a1+e1) m[5]-(b1+f1) m[6]-(c1+g1) m[7]-(d1+h1) m[8]-(a2+e2) m[9]-(b2+f2) m[10]-(c2+g2) m[11]-(d2+h2)].

The memory must be aligned to half SIMD width.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_DOUBLE is 1.
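The memory update described above can be written as a scalar loop. This is only a sketch with a hypothetical width of 8; kWidth, kHalf, Lanes and decrementThreeHalves are names invented for the example, and the alignment requirement is not modelled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kWidth = 8;          // hypothetical SIMD width
constexpr std::size_t kHalf  = kWidth / 2; // half SIMD width
using Lanes = std::array<double, kWidth>;

// For each input, add its lower and upper halves and subtract the sum from the
// matching half-width block of m. m must hold at least 3 * kHalf values.
void decrementThreeHalves(double* m, const Lanes& a0, const Lanes& a1, const Lanes& a2)
{
    assert(m != nullptr);
    const Lanes* inputs[3] = { &a0, &a1, &a2 };
    for (std::size_t v = 0; v < 3; v++)
    {
        for (std::size_t i = 0; i < kHalf; i++)
        {
            m[v * kHalf + i] -= (*inputs[v])[i] + (*inputs[v])[i + kHalf];
        }
    }
}
```

With width 8 this touches m[0] through m[11], four consecutive values per input variable.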

static void gmx_simdcall gmx::decr3Hsimd	(	float *	m,
		SimdFloat	a0,
		SimdFloat	a1,
		SimdFloat	a2
	)

inline static

Add the two halves of three SIMD floats, subtract the sum from three half-SIMD-width consecutive floats in memory.

Parameters

m	half-width aligned memory, from which sum of the halves will be subtracted.
a0	SIMD variable. The sum of its upper and lower halves is subtracted from the first half-width block of m.
a1	SIMD variable. The sum of its upper and lower halves is subtracted from the second half-width block of m.
a2	SIMD variable. The sum of its upper and lower halves is subtracted from the third half-width block of m.

If the SIMD width is 8 and the vectors contain [a0 b0 c0 d0 e0 f0 g0 h0], [a1 b1 c1 d1 e1 f1 g1 h1] and [a2 b2 c2 d2 e2 f2 g2 h2], the memory will be modified to [m[0]-(a0+e0) m[1]-(b0+f0) m[2]-(c0+g0) m[3]-(d0+h0) m[4]-(a1+e1) m[5]-(b1+f1) m[6]-(c1+g1) m[7]-(d1+h1) m[8]-(a2+e2) m[9]-(b2+f2) m[10]-(c2+g2) m[11]-(d2+h2)].

The memory must be aligned to half SIMD width.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_FLOAT is 1.

static float gmx_simdcall gmx::dotProduct	(	Simd4Float	a,
		Simd4Float	b
	)

inline static

Return dot product of two single precision SIMD4 variables.

The dot product is calculated between the first three elements in the two vectors, while the fourth is ignored. The result is returned as a scalar.

Parameters

a	vector1
b	vector2

Returns: a[0]*b[0]+a[1]*b[1]+a[2]*b[2], returned as scalar. Last element is ignored.
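A scalar sketch of the same arithmetic, with a four-element array standing in for the SIMD4 type. Lanes4 and dotProduct3 are hypothetical names used only for this illustration.

```cpp
#include <array>
#include <cassert>

using Lanes4 = std::array<float, 4>; // stand-in for a SIMD4 single precision variable

// Dot product over the first three elements; the fourth element is ignored.
float dotProduct3(const Lanes4& a, const Lanes4& b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```

Because the fourth element never enters the sum, it can hold padding or unrelated data without affecting the result.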

static double gmx_simdcall gmx::dotProduct	(	Simd4Double	a,
		Simd4Double	b
	)

inline static

Return dot product of two double precision SIMD4 variables.

The dot product is calculated between the first three elements in the two vectors, while the fourth is ignored. The result is returned as a scalar.

Parameters

a	vector1
b	vector2

Returns: a[0]*b[0]+a[1]*b[1]+a[2]*b[2], returned as scalar. Last element is ignored.

static SimdFloat gmx_simdcall gmx::erf ( SimdFloat x )

inline static

SIMD float erf(x).

Parameters

x	The value to calculate erf(x) for.

Returns: erf(x)

This routine achieves very close to full precision, but we do not care about the last bit or the subnormal result range.

static SimdDouble gmx_simdcall gmx::erf ( SimdDouble x )

inline static

SIMD double erf(x).

Parameters

x	The value to calculate erf(x) for.

Returns: erf(x)

This routine achieves very close to full precision, but we do not care about the last bit or the subnormal result range.

static SimdFloat gmx_simdcall gmx::erfc ( SimdFloat x )

inline static

SIMD float erfc(x).

Parameters

x	The value to calculate erfc(x) for.

Returns: erfc(x)

This routine achieves full precision (bar the last bit) over most of the input range, but for large arguments where the result is getting close to the minimum representable numbers we accept slightly larger errors (think results that are in the ballpark of 10^-30 for single precision) since that is not relevant for MD.

static SimdDouble gmx_simdcall gmx::erfc ( SimdDouble x )

inline static

SIMD double erfc(x).

Parameters

x	The value to calculate erfc(x) for.

Returns: erfc(x)

This routine achieves full precision (bar the last bit) over most of the input range, but for large arguments where the result is getting close to the minimum representable numbers we accept slightly larger errors (think results that are in the ballpark of 10^-200 for double) since that is not relevant for MD.

static SimdDouble gmx_simdcall gmx::erfcSingleAccuracy ( SimdDouble x )

inline static

SIMD erfc(x). Double precision SIMD data, single accuracy.

Parameters

x	The value to calculate erfc(x) for.

Returns: erfc(x)

This routine achieves single precision (bar the last bit) over most of the input range, but for large arguments where the result is getting close to the minimum representable numbers we accept slightly larger errors (think results that are in the ballpark of 10^-30) since that is not relevant for MD.

static SimdFloat gmx_simdcall gmx::erfcSingleAccuracy ( SimdFloat x )

inline static

SIMD float erfc(x), only targeting single accuracy.

Parameters

x	The value to calculate erfc(x) for.

Returns: erfc(x)

This routine achieves single precision (bar the last bit) over most of the input range, but for large arguments where the result is getting close to the minimum representable numbers we accept slightly larger errors (think results that are in the ballpark of 10^-30) since that is not relevant for MD.

static SimdDouble gmx_simdcall gmx::erfSingleAccuracy ( SimdDouble x )

inline static

SIMD erf(x). Double precision SIMD data, single accuracy.

Parameters

x	The value to calculate erf(x) for.

Returns: erf(x)

This routine achieves very close to single precision, but we do not care about the last bit or the subnormal result range.

static SimdFloat gmx_simdcall gmx::erfSingleAccuracy ( SimdFloat x )

inline static

SIMD float erf(x), only targeting single accuracy.

Parameters

x	The value to calculate erf(x) for.

Returns: erf(x)

This routine achieves very close to single precision, but we do not care about the last bit or the subnormal result range.

template<MathOptimization opt = MathOptimization::Safe>

static SimdFloat gmx_simdcall gmx::exp ( SimdFloat x )

inline static

SIMD float exp(x).

In addition to scaling the argument for 2^x, this routine performs extended-precision arithmetic to improve accuracy.

Template Parameters

opt	If this is changed from the default (safe) into the unsafe option, input values that would otherwise lead to zero-clamped results are not allowed and will lead to undefined results.

Parameters

x	Argument. For the default (safe) function version this can be an arbitrarily small value, but the routine might clamp the result to zero for arguments that would produce subnormal IEEE754-2008 results. This corresponds to input arguments reaching -126*ln(2)=-87.3 in single, or -1022*ln(2)=-708.4 (double). Similarly, it might overflow for arguments reaching 127*ln(2)=88.0 (single) or 1023*ln(2)=709.1 (double). If the unsafe math optimizations are enabled, small input values that would result in zero-clamped output are not allowed.

Returns: exp(x). Overflowing arguments are likely to either return 0 or inf, depending on the underlying implementation. If unsafe optimizations are enabled, this is also true for very small values.

Note: The definition range of this function is just-so-slightly smaller than the allowed IEEE exponents for many architectures. This is due to the implementation, which will hopefully improve in the future.

Warning: You cannot rely on this implementation returning inf for arguments that cause overflow. If you have some very large values and need to rely on getting a valid numerical output, take the minimum of your variable and the largest valid argument before calling this routine.
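The clamping advice in the warning can be sketched on scalars as follows. std::exp is used as a stand-in for the SIMD routine, and kMaxExpArg is a hypothetical constant taken from the single precision overflow threshold 127*ln(2)=88.0 quoted above; the exact safe limit depends on the implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Largest argument treated as valid in this sketch (assumption based on the
// single precision threshold quoted in the parameter description).
constexpr float kMaxExpArg = 88.0f;

// Take the minimum of the input and the largest valid argument before
// evaluating exp, so that large inputs give a finite result.
float clampedExp(float x)
{
    const float safeArg = std::min(x, kMaxExpArg);
    return std::exp(safeArg); // scalar stand-in for the SIMD exp routine
}
```

With SIMD variables the same pattern would use a lane-wise minimum against a constant vector before the call. Note that the clamp does not sanitize NaN inputs.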

template<MathOptimization opt = MathOptimization::Safe>

static SimdDouble gmx_simdcall gmx::exp ( SimdDouble x )

inline static

SIMD double exp(x).

In addition to scaling the argument for 2^x, this routine performs extended-precision arithmetic to improve accuracy.

Template Parameters

opt	If this is changed from the default (safe) into the unsafe option, input values that would otherwise lead to zero-clamped results are not allowed and will lead to undefined results.

Parameters

x	Argument. For the default (safe) function version this can be an arbitrarily small value, but the routine might clamp the result to zero for arguments that would produce subnormal IEEE754-2008 results. This corresponds to input arguments reaching -126*ln(2)=-87.3 in single, or -1022*ln(2)=-708.4 (double). Similarly, it might overflow for arguments reaching 127*ln(2)=88.0 (single) or 1023*ln(2)=709.1 (double). If the unsafe math optimizations are enabled, small input values that would result in zero-clamped output are not allowed.

Returns: exp(x). Overflowing arguments are likely to either return 0 or inf, depending on the underlying implementation. If unsafe optimizations are enabled, this is also true for very small values.

Note: The definition range of this function is just-so-slightly smaller than the allowed IEEE exponents for many architectures. This is due to the implementation, which will hopefully improve in the future.

Warning: You cannot rely on this implementation returning inf for arguments that cause overflow. If you have some very large values and need to rely on getting a valid numerical output, take the minimum of your variable and the largest valid argument before calling this routine.

template<MathOptimization opt = MathOptimization::Safe>

static SimdFloat gmx_simdcall gmx::exp2 ( SimdFloat x )

inline static

SIMD float 2^x.

Template Parameters

opt	If this is changed from the default (safe) into the unsafe option, input values that would otherwise lead to zero-clamped results are not allowed and will lead to undefined results.

Parameters

x	Argument. For the default (safe) function version this can be an arbitrarily small value, but the routine might clamp the result to zero for arguments that would produce subnormal IEEE754-2008 results. This corresponds to inputs below -126 in single or -1022 in double, and it might overflow for arguments reaching 127 (single) or 1023 (double). If you enable the unsafe math optimization, very small arguments will not necessarily be zero-clamped, but can produce undefined results.

Returns: 2^x. The result is undefined for very large arguments that cause internal floating-point overflow. If unsafe optimizations are enabled, this is also true for very small values.

Note: The definition range of this function is just-so-slightly smaller than the allowed IEEE exponents for many architectures. This is due to the implementation, which will hopefully improve in the future.

Warning: You cannot rely on this implementation returning inf for arguments that cause overflow. If you have some very large values and need to rely on getting a valid numerical output, take the minimum of your variable and the largest valid argument before calling this routine.

template<MathOptimization opt = MathOptimization::Safe>

static SimdDouble gmx_simdcall gmx::exp2 ( SimdDouble x )

inline static

SIMD double 2^x.

Template Parameters

opt	If this is changed from the default (safe) into the unsafe option, input values that would otherwise lead to zero-clamped results are not allowed and will lead to undefined results.

Parameters

x	Argument. For the default (safe) function version this can be an arbitrarily small value, but the routine might clamp the result to zero for arguments that would produce subnormal IEEE754-2008 results. This corresponds to inputs below -126 in single or -1022 in double, and it might overflow for arguments reaching 127 (single) or 1023 (double). If you enable the unsafe math optimization, very small arguments will not necessarily be zero-clamped, but can produce undefined results.

Returns: 2^x. The result is undefined for very large arguments that cause internal floating-point overflow. If unsafe optimizations are enabled, this is also true for very small values.

Note: The definition range of this function is just-so-slightly smaller than the allowed IEEE exponents for many architectures. This is due to the implementation, which will hopefully improve in the future.

Warning: You cannot rely on this implementation returning inf for arguments that cause overflow. If you have some very large values and need to rely on getting a valid numerical output, take the minimum of your variable and the largest valid argument before calling this routine.

template<MathOptimization opt = MathOptimization::Safe>

static SimdDouble gmx_simdcall gmx::exp2SingleAccuracy ( SimdDouble x )

inline static

SIMD 2^x. Double precision SIMD, single accuracy.

Template Parameters

opt	If this is changed from the default (safe) into the unsafe option, input values that would otherwise lead to zero-clamped results are not allowed and will lead to undefined results.

Parameters

x	Argument. For the default (safe) function version this can be an arbitrarily small value, but the routine might clamp the result to zero for arguments that would produce subnormal IEEE754-2008 results. This corresponds to inputs below -126 in single or -1022 in double, and it might overflow for arguments reaching 127 (single) or 1023 (double). If you enable the unsafe math optimization, very small arguments will not necessarily be zero-clamped, but can produce undefined results.

Returns: 2^x. The result is undefined for very large arguments that cause internal floating-point overflow. If unsafe optimizations are enabled, this is also true for very small values.

Note: The definition range of this function is just-so-slightly smaller than the allowed IEEE exponents for many architectures. This is due to the implementation, which will hopefully improve in the future.

Warning: You cannot rely on this implementation returning inf for arguments that cause overflow. If you have some very large values and need to rely on getting a valid numerical output, take the minimum of your variable and the largest valid argument before calling this routine.

template<MathOptimization opt = MathOptimization::Safe>

static SimdFloat gmx_simdcall gmx::exp2SingleAccuracy ( SimdFloat x )

inline static

SIMD float 2^x, only targeting single accuracy.

Template Parameters

opt	If this is changed from the default (safe) into the unsafe option, input values that would otherwise lead to zero-clamped results are not allowed and will lead to undefined results.

Parameters

x	Argument. For the default (safe) function version this can be an arbitrarily small value, but the routine might clamp the result to zero for arguments that would produce subnormal IEEE754-2008 results. This corresponds to inputs below -126 in single or -1022 in double, and it might overflow for arguments reaching 127 (single) or 1023 (double). If you enable the unsafe math optimization, very small arguments will not necessarily be zero-clamped, but can produce undefined results.

Returns: 2^x. The result is undefined for very large arguments that cause internal floating-point overflow. If unsafe optimizations are enabled, this is also true for very small values.

Note: The definition range of this function is just-so-slightly smaller than the allowed IEEE exponents for many architectures. This is due to the implementation, which will hopefully improve in the future.

Warning: You cannot rely on this implementation returning inf for arguments that cause overflow. If you have some very large values and need to rely on getting a valid numerical output, take the minimum of your variable and the largest valid argument before calling this routine.

static void gmx_simdcall gmx::expandScalarsToTriplets	(	SimdDouble	scalar,
		SimdDouble *	triplets0,
		SimdDouble *	triplets1,
		SimdDouble *	triplets2
	)

inline static

Expand each element of double SIMD variable into three identical consecutive elements in three SIMD outputs.

Parameters

	scalar	Floating-point input, e.g. [s0 s1 s2 s3] if width=4.
[out]	triplets0	First output, e.g. [s0 s0 s0 s1] if width=4.
[out]	triplets1	Second output, e.g. [s1 s1 s2 s2] if width=4.
[out]	triplets2	Third output, e.g. [s2 s3 s3 s3] if width=4.

This routine is meant to be used for things like scalar-vector multiplication, where the vectors are stored in a merged format like [x0 y0 z0 x1 y1 z1 ...], while the scalars are stored as [s0 s1 s2...], and the data cannot easily be changed to a SIMD-friendly layout.

In this case, load 3 full-width SIMD variables from the vector array (This will always correspond to GMX_SIMD_DOUBLE_WIDTH triplets), load a single full-width variable from the scalar array, and call this routine to expand the data. You can then simply multiply the first, second and third pair of SIMD variables, and store the three results back into a suitable vector-format array.
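The expansion pattern from the parameter table can be reproduced with a short scalar loop. The sketch assumes a width of 4; kWidth, Lanes and expandToTriplets are hypothetical names for this example only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kWidth = 4; // hypothetical SIMD width
using Lanes = std::array<double, kWidth>;

// Repeat each scalar element three times across three consecutive outputs.
void expandToTriplets(const Lanes& scalar, Lanes* t0, Lanes* t1, Lanes* t2)
{
    assert(t0 != nullptr && t1 != nullptr && t2 != nullptr);
    Lanes* out[3] = { t0, t1, t2 };
    for (std::size_t i = 0; i < 3 * kWidth; i++)
    {
        // Output position i, counted across all three outputs, copies scalar element i/3.
        (*out[i / kWidth])[i % kWidth] = scalar[i / 3];
    }
}
```

For the input [s0 s1 s2 s3] this yields [s0 s0 s0 s1], [s1 s1 s2 s2] and [s2 s3 s3 s3], which line up element by element with three loads from an [x0 y0 z0 x1 y1 z1 ...] array.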

static void gmx_simdcall gmx::expandScalarsToTriplets	(	SimdFloat	scalar,
		SimdFloat *	triplets0,
		SimdFloat *	triplets1,
		SimdFloat *	triplets2
	)

inline static

Expand each element of float SIMD variable into three identical consecutive elements in three SIMD outputs.

Parameters

	scalar	Floating-point input, e.g. [s0 s1 s2 s3] if width=4.
[out]	triplets0	First output, e.g. [s0 s0 s0 s1] if width=4.
[out]	triplets1	Second output, e.g. [s1 s1 s2 s2] if width=4.
[out]	triplets2	Third output, e.g. [s2 s3 s3 s3] if width=4.

This routine is meant to be used for things like scalar-vector multiplication, where the vectors are stored in a merged format like [x0 y0 z0 x1 y1 z1 ...], while the scalars are stored as [s0 s1 s2...], and the data cannot easily be changed to a SIMD-friendly layout.

In this case, load 3 full-width SIMD variables from the vector array (This will always correspond to GMX_SIMD_FLOAT_WIDTH triplets), load a single full-width variable from the scalar array, and call this routine to expand the data. You can then simply multiply the first, second and third pair of SIMD variables, and store the three results back into a suitable vector-format array.

template<MathOptimization opt = MathOptimization::Safe>

static SimdDouble gmx_simdcall gmx::expSingleAccuracy ( SimdDouble x )

inline static

SIMD exp(x). Double precision SIMD, single accuracy.

In addition to scaling the argument for 2^x, this routine performs extended-precision arithmetic to improve accuracy.

Template Parameters

opt	If this is changed from the default (safe) into the unsafe option, input values that would otherwise lead to zero-clamped results are not allowed and will lead to undefined results.

Parameters

x	Argument. For the default (safe) function version this can be an arbitrarily small value, but the routine might clamp the result to zero for arguments that would produce subnormal IEEE754-2008 results. This corresponds to input arguments reaching -126*ln(2)=-87.3 in single, or -1022*ln(2)=-708.4 (double). Similarly, it might overflow for arguments reaching 127*ln(2)=88.0 (single) or 1023*ln(2)=709.1 (double). If the unsafe math optimizations are enabled, small input values that would result in zero-clamped output are not allowed.

Returns: exp(x). Overflowing arguments are likely to either return 0 or inf, depending on the underlying implementation. If unsafe optimizations are enabled, this is also true for very small values.

Note: The definition range of this function is just-so-slightly smaller than the allowed IEEE exponents for many architectures. This is due to the implementation, which will hopefully improve in the future.

Warning: You cannot rely on this implementation returning inf for arguments that cause overflow. If you have some very large values and need to rely on getting a valid numerical output, take the minimum of your variable and the largest valid argument before calling this routine.

template<MathOptimization opt = MathOptimization::Safe>

static SimdFloat gmx_simdcall gmx::expSingleAccuracy ( SimdFloat x )

inline static

SIMD float e^x, only targeting single accuracy.

In addition to scaling the argument for 2^x, this routine performs extended-precision arithmetic to improve accuracy.

Template Parameters

opt	If this is changed from the default (safe) into the unsafe option, input values that would otherwise lead to zero-clamped results are not allowed and will lead to undefined results.

Parameters

x	Argument. For the default (safe) function version this can be an arbitrarily small value, but the routine might clamp the result to zero for arguments that would produce subnormal IEEE754-2008 results. This corresponds to input arguments reaching -126*ln(2)=-87.3 in single, or -1022*ln(2)=-708.4 (double). Similarly, it might overflow for arguments reaching 127*ln(2)=88.0 (single) or 1023*ln(2)=709.1 (double). If the unsafe math optimizations are enabled, small input values that would result in zero-clamped output are not allowed.

Returns: exp(x). Overflowing arguments are likely to either return 0 or inf, depending on the underlying implementation. If unsafe optimizations are enabled, this is also true for very small values.

Note: The definition range of this function is just-so-slightly smaller than the allowed IEEE exponents for many architectures. This is due to the implementation, which will hopefully improve in the future.

Warning: You cannot rely on this implementation returning inf for arguments that cause overflow. If you have some very large values and need to rely on getting a valid numerical output, take the minimum of your variable and the largest valid argument before calling this routine.

template<int index>

static std::int32_t gmx_simdcall gmx::extract ( SimdFInt32 a )

inlinestatic

Extract the element at position index from gmx::SimdFInt32.

Available if GMX_SIMD_HAVE_FINT32_EXTRACT is 1.

Template Parameters

index Compile-time constant, position to extract (first position is 0)

Parameters

a	SIMD variable from which to extract value.

Returns: Single integer from position index in SIMD variable.

template<int index>

static std::int32_t gmx_simdcall gmx::extract ( SimdDInt32 a )

inlinestatic

Extract the element at position index from gmx::SimdDInt32.

Available if GMX_SIMD_HAVE_DINT32_EXTRACT is 1.

Template Parameters

index Compile-time constant, position to extract (first position is 0)

Parameters

a	SIMD variable from which to extract value.

Returns: Single integer from position index in SIMD variable.

static Simd4Float gmx_simdcall gmx::fma	(	Simd4Float	a,
		Simd4Float	b,
		Simd4Float	c
	)

inlinestatic

SIMD4 Fused-multiply-add. Result is a*b+c.

Parameters

a	factor1
b	factor2
c	term

Returns: a*b+c

static Simd4Double gmx_simdcall gmx::fma	(	Simd4Double	a,
		Simd4Double	b,
		Simd4Double	c
	)

inlinestatic

SIMD4 Fused-multiply-add. Result is a*b+c.

Parameters

a	factor1
b	factor2
c	term

Returns: a*b+c

static SimdFloat gmx_simdcall gmx::fma	(	SimdFloat	a,
		SimdFloat	b,
		SimdFloat	c
	)

inlinestatic

SIMD float Fused-multiply-add. Result is a*b+c.

Parameters

a	factor1
b	factor2
c	term

Returns: a*b+c

static SimdDouble gmx_simdcall gmx::fma	(	SimdDouble	a,
		SimdDouble	b,
		SimdDouble	c
	)

inlinestatic

SIMD double Fused-multiply-add. Result is a*b+c.

Parameters

a	factor1
b	factor2
c	term

Returns: a*b+c

static Simd4Float gmx_simdcall gmx::fms	(	Simd4Float	a,
		Simd4Float	b,
		Simd4Float	c
	)

inlinestatic

SIMD4 Fused-multiply-subtract. Result is a*b-c.

Parameters

a	factor1
b	factor2
c	term

Returns: a*b-c

static Simd4Double gmx_simdcall gmx::fms	(	Simd4Double	a,
		Simd4Double	b,
		Simd4Double	c
	)

inlinestatic

SIMD4 Fused-multiply-subtract. Result is a*b-c.

Parameters

a	factor1
b	factor2
c	term

Returns: a*b-c

static SimdFloat gmx_simdcall gmx::fms	(	SimdFloat	a,
		SimdFloat	b,
		SimdFloat	c
	)

inlinestatic

SIMD float Fused-multiply-subtract. Result is a*b-c.

Parameters

a	factor1
b	factor2
c	term

Returns: a*b-c

static SimdDouble gmx_simdcall gmx::fms	(	SimdDouble	a,
		SimdDouble	b,
		SimdDouble	c
	)

inlinestatic

SIMD double Fused-multiply-subtract. Result is a*b-c.

Parameters

a	factor1
b	factor2
c	term

Returns: a*b-c

static Simd4Double gmx_simdcall gmx::fnma	(	Simd4Double	a,
		Simd4Double	b,
		Simd4Double	c
	)

inlinestatic

SIMD4 Fused-negated-multiply-add. Result is -a*b+c.

Parameters

a	factor1
b	factor2
c	term

Returns: -a*b+c

static Simd4Float gmx_simdcall gmx::fnma	(	Simd4Float	a,
		Simd4Float	b,
		Simd4Float	c
	)

inlinestatic

SIMD4 Fused-negated-multiply-add. Result is -a*b+c.

Parameters

a	factor1
b	factor2
c	term

Returns: -a*b+c

static SimdFloat gmx_simdcall gmx::fnma	(	SimdFloat	a,
		SimdFloat	b,
		SimdFloat	c
	)

inlinestatic

SIMD float Fused-negated-multiply-add. Result is -a*b+c.

Parameters

a	factor1
b	factor2
c	term

Returns: -a*b+c

static SimdDouble gmx_simdcall gmx::fnma	(	SimdDouble	a,
		SimdDouble	b,
		SimdDouble	c
	)

inlinestatic

SIMD double Fused-negated-multiply-add. Result is -a*b+c.

Parameters

a	factor1
b	factor2
c	term

Returns: -a*b+c

static Simd4Double gmx_simdcall gmx::fnms	(	Simd4Double	a,
		Simd4Double	b,
		Simd4Double	c
	)

inlinestatic

SIMD4 Fused-negated-multiply-subtract. Result is -a*b-c.

Parameters

a	factor1
b	factor2
c	term

Returns: -a*b-c

static Simd4Float gmx_simdcall gmx::fnms	(	Simd4Float	a,
		Simd4Float	b,
		Simd4Float	c
	)

inlinestatic

SIMD4 Fused-negated-multiply-subtract. Result is -a*b-c.

Parameters

a	factor1
b	factor2
c	term

Returns: -a*b-c

static SimdFloat gmx_simdcall gmx::fnms	(	SimdFloat	a,
		SimdFloat	b,
		SimdFloat	c
	)

inlinestatic

SIMD float Fused-negated-multiply-subtract. Result is -a*b-c.

Parameters

a	factor1
b	factor2
c	term

Returns: -a*b-c

static SimdDouble gmx_simdcall gmx::fnms	(	SimdDouble	a,
		SimdDouble	b,
		SimdDouble	c
	)

inlinestatic

SIMD double Fused-negated-multiply-subtract. Result is -a*b-c.

Parameters

a	factor1
b	factor2
c	term

Returns: -a*b-c
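The sign conventions of the four fused operations documented above are easy to mix up, so a scalar summary may help. The names fmaRef, fmsRef, fnmaRef and fnmsRef are illustrative and not part of the SIMD module; the sketch only mirrors the documented formulas and says nothing about rounding, which on hardware with true fused instructions happens once rather than twice.

```cpp
#include <cassert>

// Scalar reference semantics for the four fused operations (illustrative names).
inline double fmaRef(double a, double b, double c)  { return a * b + c; }  //  a*b + c
inline double fmsRef(double a, double b, double c)  { return a * b - c; }  //  a*b - c
inline double fnmaRef(double a, double b, double c) { return -a * b + c; } // -a*b + c
inline double fnmsRef(double a, double b, double c) { return -a * b - c; } // -a*b - c

// Small self-check of the sign conventions with exactly representable values.
inline void checkFusedSigns()
{
    assert(fmaRef(2.0, 3.0, 1.0) == 7.0);
    assert(fmsRef(2.0, 3.0, 1.0) == 5.0);
    assert(fnmaRef(2.0, 3.0, 1.0) == -5.0);
    assert(fnmsRef(2.0, 3.0, 1.0) == -7.0);
}
```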

template<MathOptimization opt = MathOptimization::Safe>

static SimdFloat gmx_simdcall gmx::frexp	(	SimdFloat	value,
		SimdFInt32 *	exponent
	)

inlinestatic

Extract (integer) exponent and fraction from single precision SIMD.

Template Parameters

opt	By default this function behaves like the standard library such that frexp(+-0,exp) returns +-0 and stores 0 in the exponent when value is 0. If you know the argument is always nonzero, you can set the template parameter to MathOptimization::Unsafe to make it slightly faster.

Parameters

	value	Floating-point value to extract from
[out]	exponent	Returned exponent of value, integer SIMD format.

Returns: Fraction of value, floating-point SIMD format.

template<MathOptimization opt = MathOptimization::Safe>

static SimdDouble gmx_simdcall gmx::frexp	(	SimdDouble	value,
		SimdDInt32 *	exponent
	)

inlinestatic

Extract (integer) exponent and fraction from double precision SIMD.

Template Parameters

opt	By default this function behaves like the standard library such that frexp(+-0,exp) returns +-0 and stores 0 in the exponent when value is 0. If you know the argument is always nonzero, you can set the template parameter to MathOptimization::Unsafe to make it slightly faster.

Parameters

	value	Floating-point value to extract from
[out]	exponent	Returned exponent of value, integer SIMD format.

Returns: Fraction of value, floating-point SIMD format.
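Since the safe variant is documented to behave like the standard library, its per-element semantics can be sketched with std::frexp. This is a scalar emulation under stated assumptions: frexpRef and the width of 4 are hypothetical, and std::array stands in for the SIMD floating-point and integer types.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

constexpr std::size_t c_width = 4; // hypothetical SIMD width, for illustration only

// Element-wise emulation of the safe variant: value[i] == fraction[i] * 2^exponent[i],
// with the fraction magnitude in [0.5, 1), or zero when value[i] is zero.
inline std::array<double, c_width> frexpRef(const std::array<double, c_width>& value,
                                            std::array<std::int32_t, c_width>* exponent)
{
    assert(exponent != nullptr);
    std::array<double, c_width> fraction{};
    for (std::size_t i = 0; i < c_width; i++)
    {
        int e = 0;
        fraction[i]    = std::frexp(value[i], &e);
        (*exponent)[i] = e;
    }
    return fraction;
}
```

For instance, an element holding 8.0 yields fraction 0.5 and exponent 4, while an element holding 0.0 yields fraction 0.0 and exponent 0.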

template<int align>

static void gmx_simdcall gmx::gatherLoadBySimdIntTranspose	(	const double *	base,
		SimdDInt32	offset,
		SimdDouble *	v0,
		SimdDouble *	v1,
		SimdDouble *	v2,
		SimdDouble *	v3
	)

inlinestatic

Load 4 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH offsets specified by a SIMD integer, transpose into 4 SIMD double variables.

Template Parameters

align Alignment of the memory from which we read, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 4 for this routine) the input data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are loaded.

Parameters

	base	Aligned pointer to the start of the memory.
	offset	SIMD integer type with offsets to the start of each group of four values.
[out]	v0	First component, base[align*offset[i]] for each i.
[out]	v1	Second component, base[align*offset[i] + 1] for each i.
[out]	v2	Third component, base[align*offset[i] + 2] for each i.
[out]	v3	Fourth component, base[align*offset[i] + 3] for each i.

The floating-point memory locations must be aligned, but only to the smaller of four elements and the floating-point SIMD width.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This is a special routine primarily intended for loading GROMACS table data as efficiently as possible. This is the reason for using a SIMD offset index, since the result of the real-to-integer conversion is present in a SIMD register just before calling this routine.

template<int align>

static void gmx_simdcall gmx::gatherLoadBySimdIntTranspose	(	const float *	base,
		SimdFInt32	offset,
		SimdFloat *	v0,
		SimdFloat *	v1,
		SimdFloat *	v2,
		SimdFloat *	v3
	)

inlinestatic

Load 4 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH offsets specified by a SIMD integer, transpose into 4 SIMD float variables.

Template Parameters

align Alignment of the memory from which we read, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 4 for this routine) the input data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are loaded.

Parameters

	base	Aligned pointer to the start of the memory.
	offset	SIMD integer type with offsets to the start of each group of four values.
[out]	v0	First component, base[align*offset[i]] for each i.
[out]	v1	Second component, base[align*offset[i] + 1] for each i.
[out]	v2	Third component, base[align*offset[i] + 2] for each i.
[out]	v3	Fourth component, base[align*offset[i] + 3] for each i.

The floating-point memory locations must be aligned, but only to the smaller of four elements and the floating-point SIMD width.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This is a special routine primarily intended for loading GROMACS table data as efficiently as possible. This is the reason for using a SIMD offset index, since the result of the real-to-integer conversion is present in a SIMD register just before calling this routine.

template<int align>

static void gmx_simdcall gmx::gatherLoadBySimdIntTranspose	(	const double *	base,
		SimdDInt32	offset,
		SimdDouble *	v0,
		SimdDouble *	v1
	)

inlinestatic

Load 2 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH offsets specified by a SIMD integer, transpose into 2 SIMD double variables.

Template Parameters

align Alignment of the memory from which we read, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 2 for this routine) the input data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are loaded.

Parameters

	base	Aligned pointer to the start of the memory.
	offset	SIMD integer type with offsets to the start of each pair.
[out]	v0	First component, base[align*offset[i]] for each i.
[out]	v1	Second component, base[align*offset[i] + 1] for each i.

The floating-point memory locations must be aligned, but only to the smaller of two elements and the floating-point SIMD width.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This is a special routine primarily intended for loading GROMACS table data as efficiently as possible. This is the reason for using a SIMD offset index, since the result of the real-to-integer conversion is present in a SIMD register just before calling this routine.

template<int align>

static void gmx_simdcall gmx::gatherLoadBySimdIntTranspose	(	const float *	base,
		SimdFInt32	offset,
		SimdFloat *	v0,
		SimdFloat *	v1
	)

inlinestatic

Load 2 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH offsets specified by a SIMD integer, transpose into 2 SIMD float variables.

Template Parameters

align Alignment of the memory from which we read, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 2 for this routine) the input data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are loaded.

Parameters

	base	Aligned pointer to the start of the memory.
	offset	SIMD integer type with offsets to the start of each pair.
[out]	v0	First component, base[align*offset[i]] for each i.
[out]	v1	Second component, base[align*offset[i] + 1] for each i.

The floating-point memory locations must be aligned, but only to the smaller of two elements and the floating-point SIMD width.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This is a special routine primarily intended for loading GROMACS table data as efficiently as possible. This is the reason for using a SIMD offset index, since the result of the real-to-integer conversion is present in a SIMD register just before calling this routine.

template<int align>

static void gmx_simdcall gmx::gatherLoadTranspose	(	const double *	base,
		const std::int32_t	offset[],
		SimdDouble *	v0,
		SimdDouble *	v1,
		SimdDouble *	v2,
		SimdDouble *	v3
	)

inlinestatic

Load 4 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH offsets, and transpose into 4 SIMD double variables.

Template Parameters

align Alignment of the memory from which we read, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 4 for this routine) the input data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are loaded.

Parameters

	base	Pointer to the start of the memory area
	offset	Array with offsets to the start of each data point.
[out]	v0	1st component of data, base[align*offset[i]] for each i.
[out]	v1	2nd component of data, base[align*offset[i] + 1] for each i.
[out]	v2	3rd component of data, base[align*offset[i] + 2] for each i.
[out]	v3	4th component of data, base[align*offset[i] + 3] for each i.

The floating-point memory locations must be aligned, but only to the smaller of four elements and the floating-point SIMD width.

The offset memory must be aligned to GMX_SIMD_DINT32_WIDTH.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead.
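The indexing rule base[align*offset[i] + k] and the note about unscaled offsets can be illustrated with a scalar emulation. This is a sketch under assumptions, not the GROMACS implementation: gatherLoadTransposeRef, SimdRef and the width of 4 are hypothetical stand-ins, and no alignment requirement is modelled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t c_width = 4;           // hypothetical SIMD width, for illustration only
using SimdRef = std::array<double, c_width>; // scalar stand-in for a SIMD double

// Emulates the documented indexing: component k of lane i is base[align*offset[i] + k].
template<int align>
void gatherLoadTransposeRef(const double* base, const std::int32_t offset[],
                            SimdRef* v0, SimdRef* v1, SimdRef* v2, SimdRef* v3)
{
    static_assert(align >= 4, "four components need at least four elements per data point");
    assert(base != nullptr && offset != nullptr);
    for (std::size_t i = 0; i < c_width; i++)
    {
        // The caller passes raw indices; scaling by align happens here.
        const double* p = base + align * offset[i];
        (*v0)[i] = p[0];
        (*v1)[i] = p[1];
        (*v2)[i] = p[2];
        (*v3)[i] = p[3];
    }
}
```

With align equal to 4 and offsets {3, 0, 7, 1}, lane 0 reads elements 12 to 15 of base, lane 1 reads elements 0 to 3, and so on. Choosing align larger than 4 would skip padding between data points.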

template<int align>

static void gmx_simdcall gmx::gatherLoadTranspose	(	const float *	base,
		const std::int32_t	offset[],
		SimdFloat *	v0,
		SimdFloat *	v1,
		SimdFloat *	v2,
		SimdFloat *	v3
	)

inlinestatic

Load 4 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH offsets, and transpose into 4 SIMD float variables.

Template Parameters

align Alignment of the memory from which we read, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 4 for this routine) the input data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are loaded.

Parameters

	base	Pointer to the start of the memory area
	offset	Array with offsets to the start of each data point.
[out]	v0	1st component of data, base[align*offset[i]] for each i.
[out]	v1	2nd component of data, base[align*offset[i] + 1] for each i.
[out]	v2	3rd component of data, base[align*offset[i] + 2] for each i.
[out]	v3	4th component of data, base[align*offset[i] + 3] for each i.

The floating-point memory locations must be aligned, but only to the smaller of four elements and the floating-point SIMD width.

The offset memory must be aligned to GMX_SIMD_FINT32_WIDTH.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead.

template<int align>

static void gmx_simdcall gmx::gatherLoadTranspose	(	const double *	base,
		const std::int32_t	offset[],
		SimdDouble *	v0,
		SimdDouble *	v1
	)

inlinestatic

Load 2 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH offsets, and transpose into 2 SIMD double variables.

Template Parameters

align Alignment of the memory from which we read, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 2 for this routine) the input data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are loaded.

Parameters

	base	Pointer to the start of the memory area
	offset	Array with offsets to the start of each data point.
[out]	v0	1st component of data, base[align*offset[i]] for each i.
[out]	v1	2nd component of data, base[align*offset[i] + 1] for each i.

The floating-point memory locations must be aligned, but only to the smaller of two elements and the floating-point SIMD width.

The offset memory must be aligned to GMX_SIMD_DINT32_WIDTH.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead.

template<int align>

static void gmx_simdcall gmx::gatherLoadTranspose	(	const float *	base,
		const std::int32_t	offset[],
		SimdFloat *	v0,
		SimdFloat *	v1
	)

inlinestatic

Load 2 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH offsets, and transpose into 2 SIMD float variables.

Template Parameters

align Alignment of the memory from which we read, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 2 for this routine) the input data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are loaded.

Parameters

	base	Pointer to the start of the memory area
	offset	Array with offsets to the start of each data point.
[out]	v0	1st component of data, base[align*offset[i]] for each i.
[out]	v1	2nd component of data, base[align*offset[i] + 1] for each i.

The floating-point memory locations must be aligned, but only to the smaller of two elements and the floating-point SIMD width.

The offset memory must be aligned to GMX_SIMD_FINT32_WIDTH.

To achieve the best possible performance, you should store your data with alignment c_simdBestPairAlignmentFloat in single, or c_simdBestPairAlignmentDouble in double.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead.

template<int align>

static void gmx_simdcall gmx::gatherLoadTransposeHsimd	(	const double *	base0,
		const double *	base1,
		std::int32_t	offset[],
		SimdDouble *	v0,
		SimdDouble *	v1
	)

inlinestatic

Load 2 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH/2 offsets, transpose into SIMD double (low half from base0, high from base1).

Template Parameters

align Alignment of the storage, i.e. the distance (measured in elements, not bytes) between index points. When this is identical to the number of output components the data is packed without padding. This must be a multiple of the alignment to keep all data aligned.

Parameters

	base0	Pointer to base of first aligned memory
	base1	Pointer to base of second aligned memory
	offset	Offset to the start of each pair
[out]	v0	1st element in each pair, base0 in low and base1 in high half.
[out]	v1	2nd element in each pair, base0 in low and base1 in high half.

The offset array should be of half the SIMD width length, so it corresponds to the half-SIMD-register operations. This also means it must be aligned to half the integer SIMD width (i.e., GMX_SIMD_DINT32_WIDTH/2).

The floating-point memory locations must be aligned, but only to the smaller of two elements and the floating-point SIMD width.

This routine is primarily designed to load nonbonded parameters in the kernels. It is the equivalent of the full-width routine gatherLoadTranspose(), but just as the other hsimd routines it will pick half-SIMD-width data from base0 and put in the lower half, while the upper half comes from base1.

For an example, assume the SIMD width is 8, align is 2, that base0 is [A0 A1 B0 B1 C0 C1 D0 D1 ...], and base1 [E0 E1 F0 F1 G0 G1 H0 H1...].

Then we will get v0 as [A0 B0 C0 D0 E0 F0 G0 H0] and v1 as [A1 B1 C1 D1 E1 F1 G1 H1].
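The example above can be reproduced with a scalar emulation. The sketch below is an illustration, not GROMACS code: gatherLoadTransposeHsimdRef and SimdRef are hypothetical names, the SIMD width is fixed at 8 as in the example, and the offsets {0, 1, 2, 3} are an assumption, since the example does not state them explicitly.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t c_width = 8;          // SIMD width assumed in the example
using SimdRef = std::array<float, c_width>; // scalar stand-in for a SIMD float

// Low half of each output comes from base0, high half from base1,
// both indexed with the same half-width offset array.
template<int align>
void gatherLoadTransposeHsimdRef(const float* base0, const float* base1,
                                 const std::int32_t offset[], SimdRef* v0, SimdRef* v1)
{
    static_assert(align >= 2, "two components need at least two elements per pair");
    assert(base0 != nullptr && base1 != nullptr && offset != nullptr);
    constexpr std::size_t half = c_width / 2;
    for (std::size_t i = 0; i < half; i++)
    {
        (*v0)[i]        = base0[align * offset[i]];
        (*v1)[i]        = base0[align * offset[i] + 1];
        (*v0)[half + i] = base1[align * offset[i]];
        (*v1)[half + i] = base1[align * offset[i] + 1];
    }
}
```

Encoding A0, A1, B0, ... as 10, 11, 20, ... in base0 and E0, E1, F0, ... as 50, 51, 60, ... in base1, the call with align equal to 2 fills v0 with the "0" elements of A to H and v1 with the "1" elements.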

Available if GMX_SIMD_HAVE_HSIMD_UTIL_DOUBLE is 1.

template<int align>

static void gmx_simdcall gmx::gatherLoadTransposeHsimd	(	const float *	base0,
		const float *	base1,
		const std::int32_t	offset[],
		SimdFloat *	v0,
		SimdFloat *	v1
	)

inlinestatic

Load 2 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH/2 offsets, transpose into SIMD float (low half from base0, high from base1).

Template Parameters

align Alignment of the storage, i.e. the distance (measured in elements, not bytes) between index points. When this is identical to the number of output components the data is packed without padding. This must be a multiple of the alignment to keep all data aligned.

Parameters

	base0	Pointer to base of first aligned memory
	base1	Pointer to base of second aligned memory
	offset	Offset to the start of each pair
[out]	v0	1st element in each pair, base0 in low and base1 in high half.
[out]	v1	2nd element in each pair, base0 in low and base1 in high half.

The offset array should be of half the SIMD width length, so it corresponds to the half-SIMD-register operations. This also means it must be aligned to half the integer SIMD width (i.e., GMX_SIMD_FINT32_WIDTH/2).

The floating-point memory locations must be aligned, but only to the smaller of two elements and the floating-point SIMD width.

This routine is primarily designed to load nonbonded parameters in the kernels. It is the equivalent of the full-width routine gatherLoadTranspose(), but just as the other hsimd routines it will pick half-SIMD-width data from base0 and put in the lower half, while the upper half comes from base1.

For an example, assume the SIMD width is 8, align is 2, that base0 is [A0 A1 B0 B1 C0 C1 D0 D1 ...], and base1 [E0 E1 F0 F1 G0 G1 H0 H1...].

Then we will get v0 as [A0 B0 C0 D0 E0 F0 G0 H0] and v1 as [A1 B1 C1 D1 E1 F1 G1 H1].

Available if GMX_SIMD_HAVE_HSIMD_UTIL_FLOAT is 1.

template<int align>

static void gmx_simdcall gmx::gatherLoadUBySimdIntTranspose	(	const double *	base,
		SimdDInt32	offset,
		SimdDouble *	v0,
		SimdDouble *	v1
	)

inlinestatic

Load 2 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH offsets (unaligned) specified by SIMD integer, transpose into 2 SIMD doubles.

Template Parameters

align Alignment of the memory from which we read, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 2 for this routine) the input data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are loaded.

Parameters

	base	Pointer to the start of the memory.
	offset	SIMD integer type with offsets to the start of each pair.
[out]	v0	First component, base[align*offset[i]] for each i.
[out]	v1	Second component, base[align*offset[i] + 1] for each i.

Since some SIMD architectures cannot handle any unaligned loads, this routine is only available if GMX_SIMD_HAVE_GATHER_LOADU_BYSIMDINT_TRANSPOSE is 1.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This is a special routine primarily intended for loading GROMACS table data as efficiently as possible. This is the reason for using a SIMD offset index, since the result of the real-to-integer conversion is present in a SIMD register just before calling this routine.

template<int align>

static void gmx_simdcall gmx::gatherLoadUBySimdIntTranspose	(	const float *	base,
		SimdFInt32	offset,
		SimdFloat *	v0,
		SimdFloat *	v1
	)

inlinestatic

Load 2 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH offsets (unaligned) specified by SIMD integer, transpose into 2 SIMD floats.

Template Parameters

align Alignment of the memory from which we read, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 2 for this routine) the input data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are loaded.

Parameters

	base	Pointer to the start of the memory.
	offset	SIMD integer type with offsets to the start of each pair.
[out]	v0	First component, base[align*offset[i]] for each i.
[out]	v1	Second component, base[align*offset[i] + 1] for each i.

Since some SIMD architectures cannot handle any unaligned loads, this routine is only available if GMX_SIMD_HAVE_GATHER_LOADU_BYSIMDINT_TRANSPOSE is 1.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This is a special routine primarily intended for loading GROMACS table data as efficiently as possible. This is the reason for using a SIMD offset index, since the result of the real-to-integer conversion is present in a SIMD register just before calling this routine.

template<int align>

static void gmx_simdcall gmx::gatherLoadUTranspose	(	const double *	base,
		const std::int32_t	offset[],
		SimdDouble *	v0,
		SimdDouble *	v1,
		SimdDouble *	v2
	)

inlinestatic

Load 3 consecutive doubles from each of GMX_SIMD_DOUBLE_WIDTH offsets, and transpose into 3 SIMD double variables.

Template Parameters

align Alignment of the memory from which we read, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 3 for this routine) the input data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are loaded.

Parameters

	base	Pointer to the start of the memory area
	offset	Array with offsets to the start of each data point.
[out]	v0	1st component of data, base[align*offset[i]] for each i.
[out]	v1	2nd component of data, base[align*offset[i] + 1] for each i.
[out]	v2	3rd component of data, base[align*offset[i] + 2] for each i.

This function can work with both aligned (better performance) and unaligned memory. When the align parameter is not a power-of-two (align==3 would be normal for packed atomic coordinates) the memory obviously cannot be aligned, and we account for this. However, in the case where align is a power-of-two, we assume the base pointer also has the same alignment, which will enable many platforms to use faster aligned memory load operations. An easy way to think of this is that each triplet of data in memory must be aligned to the align parameter you specify when it's a power-of-two.

The offset memory must always be aligned to GMX_SIMD_DINT32_WIDTH, since this enables us to use SIMD loads and gather operations on platforms that support it.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This routine uses a normal array for the offsets, since we typically load this data from memory. On the architectures we have tested this is faster even when a SIMD integer datatype is present. To improve performance, this function might use full-SIMD-width unaligned loads. This means you need to ensure the memory is padded at the end, so we always can load GMX_SIMD_REAL_WIDTH elements starting at the last offset. If you use the GROMACS aligned memory allocation routines this will always be the case.

template<int align>

static void gmx_simdcall gmx::gatherLoadUTranspose	(	const float *	base,
		const std::int32_t	offset[],
		SimdFloat *	v0,
		SimdFloat *	v1,
		SimdFloat *	v2
	)

inlinestatic

Load 3 consecutive floats from each of GMX_SIMD_FLOAT_WIDTH offsets, and transpose into 3 SIMD float variables.

Template Parameters

align Alignment of the memory from which we read, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 3 for this routine) the input data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are loaded.

Parameters

	base	Pointer to the start of the memory area
	offset	Array with offsets to the start of each data point.
[out]	v0	1st component of data, base[align*offset[i]] for each i.
[out]	v1	2nd component of data, base[align*offset[i] + 1] for each i.
[out]	v2	3rd component of data, base[align*offset[i] + 2] for each i.

This function can work with both aligned (better performance) and unaligned memory. When the align parameter is not a power-of-two (align==3 would be normal for packed atomic coordinates) the memory obviously cannot be aligned, and we account for this. However, in the case where align is a power-of-two, we assume the base pointer also has the same alignment, which will enable many platforms to use faster aligned memory load operations. An easy way to think of this is that each triplet of data in memory must be aligned to the align parameter you specify when it's a power-of-two.

The offset memory must always be aligned to GMX_SIMD_FINT32_WIDTH, since this enables us to use SIMD loads and gather operations on platforms that support it.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This routine uses a normal array for the offsets, since we typically load this data from memory. On the architectures we have tested this is faster even when a SIMD integer datatype is present. To improve performance, this function might use full-SIMD-width unaligned loads. This means you need to ensure the memory is padded at the end, so we always can load GMX_SIMD_REAL_WIDTH elements starting at the last offset. If you use the GROMACS aligned memory allocation routines this will always be the case.

std::vector< real > gmx::test::SimdMathTest::generateTestPoints	(	Range	range,
		std::size_t	points
	)

static

Generate test point vector.

Parameters

range	The test interval, half open. Upper limit is not included. Passed by value, since the method needs to modify it anyway.
points	Number of points to generate. This might be increased slightly to account both for extra special values like 0.0 and the SIMD width.

This routine generates a vector with test points separated by constant multiplicative factors, based on the range and number of points in the class. If the range includes both negative and positive values, points will be generated separately for the negative/positive intervals down to the smallest real number that can be represented, and we also include 0.0 explicitly.

This is highly useful for large test ranges. For example, with a linear 1000-point division of the range (1,1e10) the first three values to test would be 1, 10000000.999, and 20000000.998. For large values we would commonly hit the point where adding the small delta has no effect due to limited numerical precision. When we instead use this routine, the first three values will be 1, 1.0233, and 1.0471, since the constant factor is (1e10)^(1/1000) = 10^0.01 = 1.0233. This spreads the entropy over all bits in the IEEE754 representation, and is a much better test of all potential input values.
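The constant-factor spacing can be sketched in a few lines. This is a simplified illustration, not the test-suite code: geometricPoints is a hypothetical helper that handles only a strictly positive range, and it omits the separate negative interval, the explicit 0.0 and the rounding of the point count to the SIMD width described above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: points spaced by a constant multiplicative factor over [low, high).
inline std::vector<double> geometricPoints(double low, double high, std::size_t points)
{
    assert(low > 0.0 && high > low && points > 0);
    // After `points` multiplications we would reach `high`, which is excluded.
    const double factor = std::pow(high / low, 1.0 / static_cast<double>(points));
    std::vector<double> result;
    result.reserve(points);
    double x = low;
    for (std::size_t i = 0; i < points; i++)
    {
        result.push_back(x);
        x *= factor;
    }
    return result;
}
```

Calling geometricPoints(1.0, 1e10, 1000) gives a sequence starting at 1 whose neighbours differ by a factor of about 1.0233, in contrast to the linear division where neighbours differ by a fixed delta of roughly 1e7.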

Note: We do not use the static variable s_nPoints in the parent class to avoid altering any value the user has set on the command line; since it's a static member, changing it would have permanent effect.

static void gmx_simdcall gmx::incrDualHsimd	(	double *	m0,
		double *	m1,
		SimdDouble	a
	)

inlinestatic

Add each half of SIMD variable to separate memory addresses.

Parameters

m0	Pointer to memory aligned to half SIMD width.
m1	Pointer to memory aligned to half SIMD width.
a	SIMD variable. Lower half will be added to m0, upper half to m1.

The memory must be aligned to half SIMD width.

Note: The updated m0 value is written before m1 is read from memory, so the result will be correct even if the memory regions overlap.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_DOUBLE is 1.
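The ordering guarantee in the note can be made concrete with a scalar emulation. The names incrDualHsimdRef and SimdRef and the width of 4 are hypothetical, and alignment is not modelled; the sketch only shows that m0 is fully updated before m1 is read.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t c_width = 4;           // hypothetical SIMD width, for illustration only
using SimdRef = std::array<double, c_width>; // scalar stand-in for a SIMD double

// Adds the lower half of a to m0 and the upper half to m1.
inline void incrDualHsimdRef(double* m0, double* m1, const SimdRef& a)
{
    assert(m0 != nullptr && m1 != nullptr);
    constexpr std::size_t half = c_width / 2;
    // Finish updating m0 before reading m1, so overlapping regions see the first update.
    for (std::size_t i = 0; i < half; i++)
    {
        m0[i] += a[i];
    }
    for (std::size_t i = 0; i < half; i++)
    {
        m1[i] += a[half + i];
    }
}
```

When m0 and m1 point to the same location, both halves of a are accumulated into it, which is the behaviour the note describes for overlapping memory regions.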

static void gmx_simdcall gmx::incrDualHsimd	(	float *	m0,
		float *	m1,
		SimdFloat	a
	)

inlinestatic

Add each half of SIMD variable to separate memory addresses.

Parameters

m0	Pointer to memory aligned to half SIMD width.
m1	Pointer to memory aligned to half SIMD width.
a	SIMD variable. Lower half will be added to m0, upper half to m1.

The memory must be aligned to half SIMD width.

Note: The updated m0 value is written before m1 is read from memory, so the result will be correct even if the memory regions overlap.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_FLOAT is 1.

static SimdFloat gmx_simdcall gmx::inv ( SimdFloat x )

inlinestatic

Calculate 1/x for SIMD float.

Parameters

x Argument with magnitude larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: 1/x. Result is undefined if your argument was invalid.

static SimdDouble gmx_simdcall gmx::inv ( SimdDouble x )

inlinestatic

Calculate 1/x for SIMD double.

Parameters

x Argument with magnitude larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: 1/x. Result is undefined if your argument was invalid.
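The range restriction is easier to see in a scalar sketch that seeds the reciprocal in single precision and then refines it with Newton-Raphson iterations. The sketch assumes this seed-and-refine structure purely for illustration; `invSketch` is a hypothetical name, and the debug assertion is an addition that the documented routine does not perform.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Scalar sketch: single-precision seed followed by Newton-Raphson refinement.
inline double invSketch(double x)
{
    // Debug-only precondition check; the real routine does not check arguments.
    assert(std::fabs(x) > FLT_MIN && std::fabs(x) < FLT_MAX);

    // Seed computed in float, mimicking a lookup step done in single precision.
    double y = static_cast<double>(1.0F / static_cast<float>(x));

    // Each iteration y <- y*(2 - x*y) roughly doubles the number of correct bits.
    for (int iter = 0; iter < 2; ++iter)
    {
        y = y * (2.0 - x * y);
    }
    return y;
}
```

An argument outside the single-precision range would already be lost when it is narrowed to float for the seed, so no amount of refinement could recover it.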

static SimdFloat gmx_simdcall gmx::invcbrt ( SimdFloat x )

inlinestatic

Inverse cube root for SIMD floats.

Parameters

x	Argument to calculate inverse cube root of. Can be positive or negative, but the magnitude cannot be lower than the smallest normal number.

Returns: Inverse cube root of x. Undefined for values that don't fulfill the restriction of abs(x) > minFloat.

static SimdDouble gmx_simdcall gmx::invcbrt ( SimdDouble x )

inlinestatic

Inverse cube root for SIMD doubles.

Parameters

x	Argument to calculate inverse cube root of. Can be positive or negative, but the magnitude cannot be lower than the smallest normal number.

Returns: Inverse cube root of x. Undefined for values that don't fulfill the restriction of abs(x) > minDouble.

static SimdDouble gmx_simdcall gmx::invcbrtSingleAccuracy ( SimdDouble x )

inlinestatic

Inverse cube root for SIMD doubles, single accuracy.

Parameters

x	Argument to calculate inverse cube root of. Can be positive or negative, but the magnitude cannot be lower than the smallest normal number.

Returns: Inverse cube root of x. Undefined for values that don't fulfill the restriction of abs(x) > minDouble.

static SimdFloat gmx_simdcall gmx::invcbrtSingleAccuracy ( SimdFloat x )

inlinestatic

Calculate 1/cbrt(x) for SIMD float, always targeting single accuracy.

Parameters

x	Argument to calculate inverse cube root of. Can be negative or zero, but NaN or Inf values are not supported. Denormal values will be treated as 0.0.

Returns: Inverse cube root of x.

static SimdDouble gmx_simdcall gmx::invSingleAccuracy ( SimdDouble x )

inlinestatic

Calculate 1/x for SIMD double, but in single accuracy.

Parameters

x Argument with magnitude larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: 1/x. Result is undefined if your argument was invalid.

static SimdFloat gmx_simdcall gmx::invSingleAccuracy ( SimdFloat x )

inlinestatic

Calculate 1/x for SIMD float, only targeting single accuracy.

Parameters

x Argument with magnitude larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: 1/x. Result is undefined if your argument was invalid.

static SimdFloat gmx_simdcall gmx::invsqrt ( SimdFloat x )

inlinestatic

Calculate 1/sqrt(x) for SIMD float.

Parameters

x Argument that must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: 1/sqrt(x). Result is undefined if your argument was invalid.

static SimdDouble gmx_simdcall gmx::invsqrt ( SimdDouble x )

inlinestatic

Calculate 1/sqrt(x) for SIMD double.

Parameters

x Argument that must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: 1/sqrt(x). Result is undefined if your argument was invalid.

static Simd4Float gmx_simdcall gmx::invsqrt ( Simd4Float x )

inlinestatic

Calculate 1/sqrt(x) for SIMD4 float.

Parameters

x Argument that must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: 1/sqrt(x). Result is undefined if your argument was invalid.

static Simd4Double gmx_simdcall gmx::invsqrt ( Simd4Double x )

inlinestatic

Calculate 1/sqrt(x) for SIMD4 double.

Parameters

x Argument that must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: 1/sqrt(x). Result is undefined if your argument was invalid.

static void gmx_simdcall gmx::invsqrtPair	(	SimdFloat	x0,
		SimdFloat	x1,
		SimdFloat *	out0,
		SimdFloat *	out1
	)

inlinestatic

Calculate 1/sqrt(x) for two SIMD floats.

Parameters

	x0	First set of arguments, x0 must be in single range (see below).
	x1	Second set of arguments, x1 must be in single range (see below).
[out]	out0	Result 1/sqrt(x0)
[out]	out1	Result 1/sqrt(x1)

In particular for double precision we can sometimes calculate square root pairs slightly faster by using single precision until the very last step.

Note: Both arguments must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

static void gmx_simdcall gmx::invsqrtPair	(	SimdDouble	x0,
		SimdDouble	x1,
		SimdDouble *	out0,
		SimdDouble *	out1
	)

inlinestatic

Calculate 1/sqrt(x) for two SIMD doubles.

Parameters

	x0	First set of arguments, x0 must be in single range (see below).
	x1	Second set of arguments, x1 must be in single range (see below).
[out]	out0	Result 1/sqrt(x0)
[out]	out1	Result 1/sqrt(x1)

In particular for double precision we can sometimes calculate square root pairs slightly faster by using single precision until the very last step.

Note: Both arguments must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.
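A scalar sketch of the idea in the paragraph above: both inverse square roots are seeded in single precision, and only the refinement runs in double precision. The function name `invsqrtPairSketch`, the iteration count, and the debug assertions are assumptions made for illustration, not the GROMACS implementation.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Computes 1/sqrt(x0) and 1/sqrt(x1) from float seeds refined in double.
inline void invsqrtPairSketch(double x0, double x1, double* out0, double* out1)
{
    assert(out0 != nullptr && out1 != nullptr);
    // Debug-only range checks; the documented routine does not check arguments.
    assert(x0 > FLT_MIN && x0 < FLT_MAX);
    assert(x1 > FLT_MIN && x1 < FLT_MAX);

    // Single-precision seeds for both arguments.
    double y0 = static_cast<double>(1.0F / std::sqrt(static_cast<float>(x0)));
    double y1 = static_cast<double>(1.0F / std::sqrt(static_cast<float>(x1)));

    // Newton-Raphson for 1/sqrt(x): y <- y*(1.5 - 0.5*x*y*y).
    for (int iter = 0; iter < 2; ++iter)
    {
        y0 = y0 * (1.5 - 0.5 * x0 * y0 * y0);
        y1 = y1 * (1.5 - 0.5 * x1 * y1 * y1);
    }
    *out0 = y0;
    *out1 = y1;
}
```

In a real SIMD setting the two single-precision seeds could share one wider register, which is where a pair routine can gain over two separate calls.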

static void gmx_simdcall gmx::invsqrtPairSingleAccuracy	(	SimdDouble	x0,
		SimdDouble	x1,
		SimdDouble *	out0,
		SimdDouble *	out1
	)

inlinestatic

Calculate 1/sqrt(x) for two SIMD doubles, but single accuracy.

Parameters

	x0	First set of arguments, x0 must be in single range (see below).
	x1	Second set of arguments, x1 must be in single range (see below).
[out]	out0	Result 1/sqrt(x0)
[out]	out1	Result 1/sqrt(x1)

In particular for double precision we can sometimes calculate square root pairs slightly faster by using single precision until the very last step.

Note: Both arguments must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

static void gmx_simdcall gmx::invsqrtPairSingleAccuracy	(	SimdFloat	x0,
		SimdFloat	x1,
		SimdFloat *	out0,
		SimdFloat *	out1
	)

inlinestatic

Calculate 1/sqrt(x) for two SIMD floats, only targeting single accuracy.

Parameters

	x0	First set of arguments, x0 must be in single range (see below).
	x1	Second set of arguments, x1 must be in single range (see below).
[out]	out0	Result 1/sqrt(x0)
[out]	out1	Result 1/sqrt(x1)

In particular for double precision we can sometimes calculate square root pairs slightly faster by using single precision until the very last step.

Note: Both arguments must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

static SimdDouble gmx_simdcall gmx::invsqrtSingleAccuracy ( SimdDouble x )

inlinestatic

Calculate 1/sqrt(x) for SIMD double, but in single accuracy.

Parameters

x Argument that must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: 1/sqrt(x). Result is undefined if your argument was invalid.

static Simd4Double gmx_simdcall gmx::invsqrtSingleAccuracy ( Simd4Double x )

inlinestatic

Calculate 1/sqrt(x) for SIMD4 double, but in single accuracy.

Parameters

x Argument that must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: 1/sqrt(x). Result is undefined if your argument was invalid.

static SimdFloat gmx_simdcall gmx::invsqrtSingleAccuracy ( SimdFloat x )

inlinestatic

Calculate 1/sqrt(x) for SIMD float, only targeting single accuracy.

Parameters

x Argument that must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: 1/sqrt(x). Result is undefined if your argument was invalid.

static Simd4Float gmx_simdcall gmx::invsqrtSingleAccuracy ( Simd4Float x )

inlinestatic

Calculate 1/sqrt(x) for SIMD4 float, only targeting single accuracy.

Parameters

x Argument that must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: 1/sqrt(x). Result is undefined if your argument was invalid.

static SimdFloat gmx_simdcall gmx::iprod	(	SimdFloat	ax,
		SimdFloat	ay,
		SimdFloat	az,
		SimdFloat	bx,
		SimdFloat	by,
		SimdFloat	bz
	)

inlinestatic

SIMD float inner product of multiple float vectors.

Parameters

ax	X components of first vectors
ay	Y components of first vectors
az	Z components of first vectors
bx	X components of second vectors
by	Y components of second vectors
bz	Z components of second vectors

Returns: Element i will be res[i] = ax[i]*bx[i]+ay[i]*by[i]+az[i]*bz[i].

Note: The SIMD part is that we calculate many scalar products in one call.
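The per-element formula can be written out as a scalar loop. In this sketch `VecF`, `kLanes`, and `iprodSketch` are hypothetical names, with a `std::array` standing in for a SIMD register.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical lane count; a real SIMD width depends on the architecture.
constexpr std::size_t kLanes = 4;
using VecF = std::array<float, kLanes>;

// Lane i holds the dot product of (ax[i], ay[i], az[i]) and (bx[i], by[i], bz[i]).
inline VecF iprodSketch(const VecF& ax, const VecF& ay, const VecF& az,
                        const VecF& bx, const VecF& by, const VecF& bz)
{
    VecF res{};
    for (std::size_t i = 0; i < kLanes; ++i)
    {
        res[i] = ax[i] * bx[i] + ay[i] * by[i] + az[i] * bz[i];
    }
    return res;
}
```

Each argument holds one coordinate of several vectors, so the data is laid out as structure-of-arrays rather than one vector per register.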

static SimdDouble gmx_simdcall gmx::iprod	(	SimdDouble	ax,
		SimdDouble	ay,
		SimdDouble	az,
		SimdDouble	bx,
		SimdDouble	by,
		SimdDouble	bz
	)

inlinestatic

SIMD double inner product of multiple double vectors.

Parameters

ax	X components of first vectors
ay	Y components of first vectors
az	Z components of first vectors
bx	X components of second vectors
by	Y components of second vectors
bz	Z components of second vectors

Returns: Element i will be res[i] = ax[i]*bx[i]+ay[i]*by[i]+az[i]*bz[i].

Note: The SIMD part is that we calculate many scalar products in one call.

template<MathOptimization opt = MathOptimization::Safe>

static SimdFloat gmx_simdcall gmx::ldexp	(	SimdFloat	value,
		SimdFInt32	exponent
	)

inlinestatic

Multiply a SIMD float value by 2 raised to the power of exponent.

Template Parameters

opt By default, this routine will return zero for input arguments that are so small they cannot be represented in the current precision. If the unsafe math optimization template parameter setting is used, these tests are skipped, and the result will be undefined (possibly even NaN). This might happen for exponents below -127 in single precision or -1023 in double, although some architectures might use denormal support to extend the range.

Parameters

value	Floating-point number to multiply with new exponent
exponent	Integer that will not overflow as 2^exponent.

Returns: value*2^exponent
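One way to picture the safe and unsafe variants is a scalar sketch that builds 2^exponent directly from the IEEE-754 exponent field. It assumes a 32-bit binary32 float; `MathOptSketch` and `ldexpSketch` are hypothetical names, and the clamping strategy is an illustrative assumption rather than a statement about the library's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes IEEE-754 binary32 float");

// Hypothetical stand-in for the MathOptimization template parameter.
enum class MathOptSketch { Safe, Unsafe };

template<MathOptSketch opt = MathOptSketch::Safe>
inline float ldexpSketch(float value, std::int32_t exponent)
{
    // binary32 stores the exponent with a bias of 127 above 23 mantissa bits.
    std::int32_t biased = exponent + 127;
    if (opt == MathOptSketch::Safe)
    {
        // Clamp so that too-small exponents produce the bit pattern of 0.0f.
        biased = std::max<std::int32_t>(biased, 0);
    }
    // The caller guarantees that 2^exponent does not overflow (exponent <= 127).
    const std::uint32_t bits = static_cast<std::uint32_t>(biased) << 23;
    float scale;
    std::memcpy(&scale, &bits, sizeof(scale));
    return value * scale;
}
```

With the unsafe setting the clamp is skipped, so a very negative exponent leaves arbitrary bits in the scale factor, matching the undefined result described for that mode.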

template<MathOptimization opt = MathOptimization::Safe>

static SimdDouble gmx_simdcall gmx::ldexp	(	SimdDouble	value,
		SimdDInt32	exponent
	)

inlinestatic

Multiply a SIMD double value by 2 raised to the power of exponent.

Template Parameters

opt By default, this routine will return zero for input arguments that are so small they cannot be represented in the current precision. If the unsafe math optimization template parameter setting is used, these tests are skipped, and the result will be undefined (possibly even NaN). This might happen for exponents below -127 in single precision or -1023 in double, although some architectures might use denormal support to extend the range.

Parameters

value	Floating-point number to multiply with new exponent
exponent	Integer that will not overflow as 2^exponent.

Returns: value*2^exponent

static Simd4Float gmx_simdcall gmx::load4 ( const float * m )

inlinestatic

Load 4 float values from aligned memory into SIMD4 variable.

Parameters

m	Pointer to memory aligned to 4 elements.

Returns: SIMD4 variable with data loaded.

static Simd4Double gmx_simdcall gmx::load4 ( const double * m )

inlinestatic

Load 4 double values from aligned memory into SIMD4 variable.

Parameters

m	Pointer to memory aligned to 4 elements.

Returns: SIMD4 variable with data loaded.

static SimdDouble gmx_simdcall gmx::load4DuplicateN ( const double * m )

inlinestatic

Load 4 doubles and duplicate them N times each.

Parameters

m	Pointer to memory aligned to 4 doubles

Returns: SIMD variable with 4 doubles from m duplicated Nx.

Available if GMX_SIMD_HAVE_4NSIMD_UTIL_DOUBLE is 1. N is GMX_SIMD_DOUBLE_WIDTH/4. Different values are contiguous and same values are 4 positions in SIMD apart.
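The resulting element layout can be shown with a scalar sketch. The width of 8 (so N is 2), the type `Vec8`, and the name `load4DuplicateNSketch` are assumptions for illustration only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical SIMD width of 8 doubles, giving N = 8/4 = 2.
constexpr std::size_t kSimdWidth = 8;
using Vec8 = std::array<double, kSimdWidth>;

// Element i receives m[i % 4]: the four values stay contiguous and the
// whole block of four repeats N times.
inline Vec8 load4DuplicateNSketch(const double* m)
{
    assert(m != nullptr);
    Vec8 r{};
    for (std::size_t i = 0; i < kSimdWidth; ++i)
    {
        r[i] = m[i % 4];
    }
    return r;
}
```

For input {m0, m1, m2, m3} this width gives {m0, m1, m2, m3, m0, m1, m2, m3}.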

static SimdFloat gmx_simdcall gmx::load4DuplicateN ( const float * m )

inlinestatic

Load 4 floats and duplicate them N times each.

Parameters

m	Pointer to memory aligned to 4 floats

Returns: SIMD variable with 4 floats from m duplicated Nx.

Available if GMX_SIMD_HAVE_4NSIMD_UTIL_FLOAT is 1. N is GMX_SIMD_FLOAT_WIDTH/4. Different values are contiguous and same values are 4 positions in SIMD apart.

static Simd4Float gmx_simdcall gmx::load4U ( const float * m )

inlinestatic

Load SIMD4 float from unaligned memory.

Available if GMX_SIMD_HAVE_LOADU is 1.

Parameters

m	Pointer to memory, no alignment requirement.

Returns: SIMD4 variable with data loaded.

static Simd4Double gmx_simdcall gmx::load4U ( const double * m )

inlinestatic

Load SIMD4 double from unaligned memory.

Available if GMX_SIMD_HAVE_LOADU is 1.

Parameters

m	Pointer to memory, no alignment requirement.

Returns: SIMD4 variable with data loaded.

static SimdDouble gmx_simdcall gmx::loadDualHsimd	(	const double *	m0,
		const double *	m1
	)

inlinestatic

Load low & high parts of SIMD double from different locations.

Parameters

m0	Pointer to memory aligned to half SIMD width.
m1	Pointer to memory aligned to half SIMD width.

Returns: SIMD variable with low part loaded from m0, high from m1.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_DOUBLE is 1.

static SimdFloat gmx_simdcall gmx::loadDualHsimd	(	const float *	m0,
		const float *	m1
	)

inlinestatic

Load low & high parts of SIMD float from different locations.

Parameters

m0	Pointer to memory aligned to half SIMD width.
m1	Pointer to memory aligned to half SIMD width.

Returns: SIMD variable with low part loaded from m0, high from m1.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_FLOAT is 1.

static SimdDouble gmx_simdcall gmx::loadDuplicateHsimd ( const double * m )

inlinestatic

Load half-SIMD-width double data, spread to both halves.

Parameters

m	Pointer to memory aligned to half SIMD width.

Returns: SIMD variable with both halves loaded from m.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_DOUBLE is 1.

static SimdFloat gmx_simdcall gmx::loadDuplicateHsimd ( const float * m )

inlinestatic

Load half-SIMD-width float data, spread to both halves.

Parameters

m	Pointer to memory aligned to half SIMD width.

Returns: SIMD variable with both halves loaded from m.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_FLOAT is 1.

template<typename T , typename TSimd , int simdWidth>

void gmx::test::anonymous_namespace{bootstrap_loadstore.cpp}::loadStoreTester	(	TSimd gmx_simdcall	loadFn(const T *mem),
		void gmx_simdcall	storeFn(T *mem, TSimd),
		const int	loadOffset,
		const int	storeOffset
	)

Generic routine to test load & store of SIMD, and check for side effects.

The tests for load, store, unaligned load and unaligned store, for both real and int, are very similar, so we use a template function with additional function pointers for the actual load/store calls.
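The pattern of a templated tester that takes the load and store routines as function pointers can be sketched as follows. Everything here is hypothetical: `Vec4Sketch`, `loadSketch`, `storeSketch`, and `loadStoreRoundTrip` are illustrative stand-ins, and the sentinel-based side-effect check is one plausible approach rather than the actual test code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 4-wide stand-in for a SIMD type.
struct Vec4Sketch
{
    std::array<float, 4> v;
};

inline Vec4Sketch loadSketch(const float* m)
{
    Vec4Sketch r{};
    for (std::size_t i = 0; i < 4; ++i) { r.v[i] = m[i]; }
    return r;
}

inline void storeSketch(float* m, Vec4Sketch a)
{
    for (std::size_t i = 0; i < 4; ++i) { m[i] = a.v[i]; }
}

// Loads simdWidth elements, stores them elsewhere, and verifies that only the
// targeted elements of the destination changed.
template<typename T, typename TSimd, int simdWidth>
bool loadStoreRoundTrip(TSimd loadFn(const T*), void storeFn(T*, TSimd),
                        const int loadOffset, const int storeOffset)
{
    assert(loadOffset >= 0 && loadOffset < simdWidth);
    assert(storeOffset >= 0 && storeOffset < simdWidth);

    const T        sentinel = T(-1);
    std::vector<T> src(3 * simdWidth);
    std::vector<T> dst(3 * simdWidth, sentinel);
    for (int i = 0; i < 3 * simdWidth; ++i)
    {
        src[i] = T(i + 1); // positive, so never equal to the sentinel
    }

    const int loadPos  = simdWidth + loadOffset;
    const int storePos = simdWidth + storeOffset;
    storeFn(dst.data() + storePos, loadFn(src.data() + loadPos));

    for (int i = 0; i < 3 * simdWidth; ++i)
    {
        const bool inside   = (i >= storePos && i < storePos + simdWidth);
        const T    expected = inside ? src[loadPos + (i - storePos)] : sentinel;
        if (dst[i] != expected)
        {
            return false; // wrong data, or a write outside the target range
        }
    }
    return true;
}
```

Padding the buffers on both sides lets the same routine cover aligned and offset cases without reading or writing out of bounds.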

static SimdDouble gmx_simdcall gmx::loadU1DualHsimd ( const double * m )

inlinestatic

Load two doubles, spread 1st in low half, 2nd in high half.

Parameters

m	Pointer to two adjacent double values.

Returns: SIMD variable where all elements in the low half have been set to m[0], and all elements in high half to m[1].

Note: This routine always loads two values and sets the halves separately. If you want to set all elements to the same value, simply use the standard (non-half-SIMD) operations.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_DOUBLE is 1.

static SimdFloat gmx_simdcall gmx::loadU1DualHsimd ( const float * m )

inlinestatic

Load two floats, spread 1st in low half, 2nd in high half.

Parameters

m	Pointer to two adjacent float values.

Returns: SIMD variable where all elements in the low half have been set to m[0], and all elements in high half to m[1].

Note: This routine always loads two values and sets the halves separately. If you want to set all elements to the same value, simply use the standard (non-half-SIMD) operations.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_FLOAT is 1.

static SimdDouble gmx_simdcall gmx::loadU4NOffset	(	const double *	m,
		int	offset
	)

inlinestatic

Load doubles in blocks of 4 at fixed offsets.

Parameters

m	Pointer to unaligned memory
offset	Offset in memory between input blocks of 4

Returns: SIMD variable with doubles from m.

Available if GMX_SIMD_HAVE_4NSIMD_UTIL_DOUBLE is 1. Blocks of 4 doubles are loaded from m+n*offset where n is the n-th block of 4 doubles.
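The addressing rule m+n*offset can be made concrete with a scalar sketch. The width of 8 (two blocks of 4), the type `Vec8D`, and the name `loadU4NOffsetSketch` are assumptions for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical SIMD width of 8 doubles, i.e. two blocks of 4.
constexpr std::size_t kWidthSketch = 8;
using Vec8D = std::array<double, kWidthSketch>;

// Block n of the result is read from m + n*offset; elements inside a block
// are adjacent in memory.
inline Vec8D loadU4NOffsetSketch(const double* m, int offset)
{
    assert(m != nullptr);
    Vec8D r{};
    for (std::size_t n = 0; n < kWidthSketch / 4; ++n)
    {
        const double* block = m + static_cast<std::ptrdiff_t>(n) * offset;
        for (std::size_t j = 0; j < 4; ++j)
        {
            r[4 * n + j] = block[j];
        }
    }
    return r;
}
```

With offset 4 the blocks are back to back, so the call reduces to a plain unaligned load of the full width.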

static SimdFloat gmx_simdcall gmx::loadU4NOffset	(	const float *	m,
		int	offset
	)

inlinestatic

Load floats in blocks of 4 at fixed offsets.

Parameters

m	Pointer to unaligned memory
offset	Offset in memory between input blocks of 4

Returns: SIMD variable with floats from m.

Available if GMX_SIMD_HAVE_4NSIMD_UTIL_FLOAT is 1. Blocks of 4 floats are loaded from m+n*offset where n is the n-th block of 4 floats.

static SimdDouble gmx_simdcall gmx::loadUNDuplicate4 ( const double * m )

inlinestatic

Load N doubles and duplicate them 4 times each.

Parameters

m	Pointer to unaligned memory

Returns: SIMD variable with N doubles from m duplicated 4x.

Available if GMX_SIMD_HAVE_4NSIMD_UTIL_DOUBLE is 1. N is GMX_SIMD_DOUBLE_WIDTH/4. Duplicated values are contiguous and different values are 4 positions in SIMD apart.
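A scalar sketch of this layout, in which each loaded value fills four adjacent elements. The width of 8 (so N is 2), the type `VecW8`, and the name `loadUNDuplicate4Sketch` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical SIMD width of 8 doubles, giving N = 8/4 = 2 loaded values.
constexpr std::size_t kLanes8 = 8;
using VecW8 = std::array<double, kLanes8>;

// Element i receives m[i / 4]: each of the N values is repeated in four
// adjacent positions.
inline VecW8 loadUNDuplicate4Sketch(const double* m)
{
    assert(m != nullptr);
    VecW8 r{};
    for (std::size_t i = 0; i < kLanes8; ++i)
    {
        r[i] = m[i / 4];
    }
    return r;
}
```

For input {m0, m1} this width gives {m0, m0, m0, m0, m1, m1, m1, m1}.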

static SimdFloat gmx_simdcall gmx::loadUNDuplicate4 ( const float * m )

inlinestatic

Load N floats and duplicate them 4 times each.

Parameters

m	Pointer to unaligned memory

Returns: SIMD variable with N floats from m duplicated 4x.

Available if GMX_SIMD_HAVE_4NSIMD_UTIL_FLOAT is 1. N is GMX_SIMD_FLOAT_WIDTH/4. Duplicated values are contiguous and different values are 4 positions in SIMD apart.

template<typename T , typename TSimd >

TSimd gmx_simdcall gmx::test::anonymous_namespace{bootstrap_loadstore.cpp}::loadUWrapper ( const T * m )

Wrapper to handle proxy objects returned by some loadU functions.

Template Parameters

T	Type of scalar object
TSimd	Corresponding SIMD type

Parameters

m	Memory address to load from

template<typename T , typename TSimd >

TSimd gmx_simdcall gmx::test::anonymous_namespace{bootstrap_loadstore.cpp}::loadWrapper ( const T * m )

Wrapper to handle proxy objects returned by some load functions.

Template Parameters

T	Type of scalar object
TSimd	Corresponding SIMD type

Parameters

m	Memory address to load from

static SimdFloat gmx_simdcall gmx::log ( SimdFloat x )

inlinestatic

SIMD float log(x). This is the natural logarithm.

Parameters

x	Argument, should be >0.

Returns: The natural logarithm of x. Undefined if argument is invalid.

static SimdDouble gmx_simdcall gmx::log ( SimdDouble x )

inlinestatic

SIMD double log(x). This is the natural logarithm.

Parameters

x	Argument, should be >0.

Returns: The natural logarithm of x. Undefined if argument is invalid.

static SimdFloat gmx_simdcall gmx::log2 ( SimdFloat x )

inlinestatic

SIMD float log2(x). This is the base-2 logarithm.

Parameters

x	Argument, should be >0.

Returns: The base-2 logarithm of x. Undefined if argument is invalid.

static SimdDouble gmx_simdcall gmx::log2 ( SimdDouble x )

inlinestatic

SIMD double log2(x). This is the base-2 logarithm.

Parameters

x	Argument, should be >0.

Returns: The base-2 logarithm of x. Undefined if argument is invalid.

static SimdDouble gmx_simdcall gmx::log2SingleAccuracy ( SimdDouble x )

inlinestatic

SIMD log2(x). Double precision SIMD data, single accuracy.

Parameters

x	Argument, should be >0.

Returns: The base-2 logarithm of x. Undefined if argument is invalid.

static SimdFloat gmx_simdcall gmx::log2SingleAccuracy ( SimdFloat x )

inlinestatic

SIMD float log2(x), only targeting single accuracy. This is the base-2 logarithm.

Parameters

x	Argument, should be >0.

Returns: The base-2 logarithm of x. Undefined if argument is invalid.

static SimdDouble gmx_simdcall gmx::logSingleAccuracy ( SimdDouble x )

inlinestatic

SIMD log(x). Double precision SIMD data, single accuracy.

Parameters

x	Argument, should be >0.

Returns: The natural logarithm of x. Undefined if argument is invalid.

static SimdFloat gmx_simdcall gmx::logSingleAccuracy ( SimdFloat x )

inlinestatic

SIMD float log(x), only targeting single accuracy. This is the natural logarithm.

Parameters

x	Argument, should be >0.

Returns: The natural logarithm of x. Undefined if argument is invalid.

static SimdFloat gmx_simdcall gmx::maskAdd	(	SimdFloat	a,
		SimdFloat	b,
		SimdFBool	m
	)

inlinestatic

Add two float SIMD variables, masked version.

Parameters

a	term1
b	term2
m	mask

Returns: a+b where mask is true, a otherwise.
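The masked add keeps a in lanes where the mask is false. A scalar sketch, with `VecF4`, `MaskF4`, and `maskAddSketch` as hypothetical stand-ins for the SIMD types and routine:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 4-lane stand-ins for SimdFloat and SimdFBool.
using VecF4  = std::array<float, 4>;
using MaskF4 = std::array<bool, 4>;

// Adds b to a only in lanes where the mask is true; other lanes keep a.
inline VecF4 maskAddSketch(const VecF4& a, const VecF4& b, const MaskF4& m)
{
    VecF4 r{};
    for (std::size_t i = 0; i < r.size(); ++i)
    {
        r[i] = a[i] + (m[i] ? b[i] : 0.0F);
    }
    return r;
}
```

This differs from the maskz routines, which return 0.0 in masked-out lanes instead of passing an input through.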

static SimdDouble gmx_simdcall gmx::maskAdd	(	SimdDouble	a,
		SimdDouble	b,
		SimdDBool	m
	)

inlinestatic

Add two double SIMD variables, masked version.

Parameters

a	term1
b	term2
m	mask

Returns: a+b where mask is true, a otherwise.

static SimdFloat gmx_simdcall gmx::maskzFma	(	SimdFloat	a,
		SimdFloat	b,
		SimdFloat	c,
		SimdFBool	m
	)

inlinestatic

SIMD float fused multiply-add, masked version.

Parameters

a	factor1
b	factor2
c	term
m	mask

Returns: a*b+c where mask is true, 0.0 otherwise.
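The zeroing behavior can be sketched per lane in scalar code. `VecFma4`, `MaskFma4`, and `maskzFmaSketch` are hypothetical names; `std::fma` from the standard library stands in for a hardware fused multiply-add.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical 4-lane stand-ins for SimdFloat and SimdFBool.
using VecFma4  = std::array<float, 4>;
using MaskFma4 = std::array<bool, 4>;

// Computes a*b+c in lanes where the mask is true and 0.0 elsewhere.
inline VecFma4 maskzFmaSketch(const VecFma4& a, const VecFma4& b,
                              const VecFma4& c, const MaskFma4& m)
{
    VecFma4 r{};
    for (std::size_t i = 0; i < r.size(); ++i)
    {
        r[i] = m[i] ? std::fma(a[i], b[i], c[i]) : 0.0F;
    }
    return r;
}
```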

static SimdDouble gmx_simdcall gmx::maskzFma	(	SimdDouble	a,
		SimdDouble	b,
		SimdDouble	c,
		SimdDBool	m
	)

inlinestatic

SIMD double fused multiply-add, masked version.

Parameters

a	factor1
b	factor2
c	term
m	mask

Returns: a*b+c where mask is true, 0.0 otherwise.

static SimdFloat gmx_simdcall gmx::maskzInv	(	SimdFloat	x,
		SimdFBool	m
	)

inlinestatic

Calculate 1/x for SIMD float, masked version.

Parameters

x	Argument with magnitude larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX for masked-in entries. See invsqrt for the discussion about argument restrictions.
m	Mask

Returns: 1/x for elements where m is true, or 0.0 for masked-out entries.

static SimdDouble gmx_simdcall gmx::maskzInv	(	SimdDouble	x,
		SimdDBool	m
	)

inlinestatic

Calculate 1/x for SIMD double, masked version.

Parameters

x	Argument with magnitude larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX for masked-in entries. See invsqrt for the discussion about argument restrictions.
m	Mask

Returns: 1/x for elements where m is true, or 0.0 for masked-out entries.

static SimdDouble gmx_simdcall gmx::maskzInvSingleAccuracy	(	SimdDouble	x,
		SimdDBool	m
	)

inlinestatic

1/x for masked entries of SIMD double, single accuracy.

Parameters

x Argument with magnitude larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

m Mask

Returns: 1/x for elements where m is true, or 0.0 for masked-out entries.

static SimdFloat gmx::maskzInvSingleAccuracy	(	SimdFloat	x,
		SimdFBool	m
	)

inlinestatic

Calculate 1/x for masked SIMD floats, only targeting single accuracy.

Parameters

x Argument with magnitude larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

m Mask

Returns: 1/x for elements where m is true, or 0.0 for masked-out entries.

static SimdFloat gmx::maskzInvsqrt	(	SimdFloat	x,
		SimdFBool	m
	)

inlinestatic

Calculate 1/sqrt(x) for masked entries of SIMD float.

This routine only evaluates 1/sqrt(x) for elements for which mask is true. Illegal values in the masked-out elements will not lead to floating-point exceptions.

Parameters

x	Argument that must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX for masked-in entries. See invsqrt for the discussion about argument restrictions.
m	Mask

Returns: 1/sqrt(x) for elements where m is true, and 0.0 for masked-out entries. The result is undefined if a masked-in argument was invalid.
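One way to keep masked-out lanes from ever reaching the square root or the division is to substitute a harmless argument first and zero the lane afterwards. The scalar sketch below assumes that approach for illustration; `VecIs4`, `MaskIs4`, and `maskzInvsqrtSketch` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical 4-lane stand-ins for SimdFloat and SimdFBool.
using VecIs4  = std::array<float, 4>;
using MaskIs4 = std::array<bool, 4>;

// Evaluates 1/sqrt(x) only on masked-in lanes; masked-out lanes yield 0.0.
inline VecIs4 maskzInvsqrtSketch(const VecIs4& x, const MaskIs4& m)
{
    VecIs4 r{};
    for (std::size_t i = 0; i < r.size(); ++i)
    {
        // Replace masked-out arguments by 1.0 so that zero or negative input
        // in those lanes is never passed to sqrt or used as a divisor.
        const float safeX = m[i] ? x[i] : 1.0F;
        const float y     = 1.0F / std::sqrt(safeX);
        r[i]              = m[i] ? y : 0.0F;
    }
    return r;
}
```

The masked-in lanes still carry the usual range restriction on x; only the masked-out lanes are exempt.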

static SimdDouble gmx::maskzInvsqrt	(	SimdDouble	x,
		SimdDBool	m
	)

inlinestatic

Calculate 1/sqrt(x) for masked entries of SIMD double.

This routine only evaluates 1/sqrt(x) for elements for which mask is true. Illegal values in the masked-out elements will not lead to floating-point exceptions.

Parameters

x	Argument that must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX for masked-in entries. See invsqrt for the discussion about argument restrictions.
m	Mask

Returns: 1/sqrt(x) for elements where m is true, and 0.0 for masked-out entries. The result is undefined if a masked-in argument was invalid.

static SimdDouble gmx::maskzInvsqrtSingleAccuracy	(	SimdDouble	x,
		SimdDBool	m
	)

inlinestatic

1/sqrt(x) for masked-in entries of SIMD double, but in single accuracy.

This routine only evaluates 1/sqrt(x) for elements for which mask is true. Illegal values in the masked-out elements will not lead to floating-point exceptions.

Parameters

x Argument that must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

m Mask

Returns: 1/sqrt(x) for elements where m is true, and 0.0 for masked-out entries. The result is undefined if a masked-in argument was invalid.

static SimdFloat gmx::maskzInvsqrtSingleAccuracy	(	SimdFloat	x,
		SimdFBool	m
	)

inlinestatic

Calculate 1/sqrt(x) for masked SIMD floats, only targeting single accuracy.

This routine only evaluates 1/sqrt(x) for elements for which mask is true. Illegal values in the masked-out elements will not lead to floating-point exceptions.

Parameters

x Argument that must be larger than GMX_FLOAT_MIN and smaller than GMX_FLOAT_MAX, i.e. within the range of single precision. For the single precision implementation this is obviously always true for positive values, but for double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

m Mask

Returns: 1/sqrt(x) for elements where m is true, and 0.0 for masked-out entries. The result is undefined if a masked-in argument was invalid.

static SimdFloat gmx_simdcall gmx::maskzMul	(	SimdFloat	a,
		SimdFloat	b,
		SimdFBool	m
	)

inlinestatic

Multiply two float SIMD variables, masked version.

Parameters

a	factor1
b	factor2
m	mask

Returns: a*b where mask is true, 0.0 otherwise.

static SimdDouble gmx_simdcall gmx::maskzMul	(	SimdDouble	a,
		SimdDouble	b,
		SimdDBool	m
	)

inlinestatic

Multiply two double SIMD variables, masked version.

Parameters

a	factor1
b	factor2
m	mask

Returns: a*b where mask is true, 0.0 otherwise.

static SimdFloat gmx_simdcall gmx::maskzRcp	(	SimdFloat	x,
		SimdFBool	m
	)

inlinestatic

SIMD float 1.0/x lookup, masked version.

This is a low-level instruction that should only be called from routines implementing the reciprocal in simd_math.h.

Parameters

x	Argument, x>0 for entries where mask is true.
m	Mask

Returns: Approximation of 1/x, accuracy is GMX_SIMD_RCP_BITS. The result for masked-out entries will be 0.0.

static SimdDouble gmx_simdcall gmx::maskzRcp	(	SimdDouble	x,
		SimdDBool	m
	)

inlinestatic

SIMD double 1.0/x lookup, masked version.

This is a low-level instruction that should only be called from routines implementing the reciprocal in simd_math.h.

Parameters

x	Argument, x>0 for entries where mask is true.
m	Mask

Returns: Approximation of 1/x, accuracy is GMX_SIMD_RCP_BITS. The result for masked-out entries will be 0.0.
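A limited-accuracy lookup such as this is typically refined by a higher-level routine. The sketch below shows one Newton-Raphson step for the reciprocal in scalar form. It is an assumption about how such a refinement can be done, not the GROMACS implementation; roughRcp and newtonRcpStep are hypothetical names, and the lookup is modeled simply by truncating the mantissa of 1/x (IEEE-754 single precision assumed).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical model of a hardware lookup: 1/x with only `bits` mantissa bits kept.
inline float roughRcp(float x, int bits)
{
    assert(x > 0.0f && bits >= 1 && bits <= 23);
    float         y = 1.0f / x;
    std::uint32_t u;
    std::memcpy(&u, &y, sizeof u);
    u &= ~((std::uint32_t{1} << (23 - bits)) - 1U); // zero the low mantissa bits
    std::memcpy(&y, &u, sizeof y);
    return y;
}

// One Newton-Raphson step for f(y) = 1/y - x, i.e. y' = y*(2 - x*y).
// A relative error e in y becomes roughly e*e in y'.
inline float newtonRcpStep(float y, float x)
{
    return y * (2.0f - x * y);
}
```

Since each step roughly doubles the number of accurate bits, the number of steps a caller needs depends on the accuracy of the initial lookup.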

static SimdFloat gmx_simdcall gmx::maskzRsqrt	(	SimdFloat	x,
		SimdFBool	m
	)

inlinestatic

SIMD float 1.0/sqrt(x) lookup, masked version.

This is a low-level instruction that should only be called from routines implementing the inverse square root in simd_math.h.

Parameters

x	Argument, x>0 for entries where mask is true.
m	Mask

Returns: Approximation of 1/sqrt(x), accuracy is GMX_SIMD_RSQRT_BITS. The result for masked-out entries will be 0.0.

static SimdDouble gmx_simdcall gmx::maskzRsqrt	(	SimdDouble	x,
		SimdDBool	m
	)

inlinestatic

SIMD double 1.0/sqrt(x) lookup, masked version.

This is a low-level instruction that should only be called from routines implementing the inverse square root in simd_math.h.

Parameters

x	Argument, x>0 for entries where mask is true.
m	Mask

Returns: Approximation of 1/sqrt(x), accuracy is GMX_SIMD_RSQRT_BITS. The result for masked-out entries will be 0.0.
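The following scalar sketch illustrates why a double precision argument may need to lie within the single precision range: if the initial estimate is computed in single precision, the argument has to survive the conversion to float before Newton-Raphson steps restore double accuracy. This is a sketch under that assumption, not the GROMACS code; invsqrtFromSingleLookup is a hypothetical name, and FLT_MIN/FLT_MAX from the standard library stand in for GMX_FLOAT_MIN/GMX_FLOAT_MAX.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Scalar model: single precision estimate of 1/sqrt(x), refined in double.
inline double invsqrtFromSingleLookup(double x)
{
    // The caller is responsible for the range; here we only assert it.
    assert(x > FLT_MIN && x < FLT_MAX);

    // Initial estimate in single precision (about 24 accurate bits).
    double y = 1.0f / std::sqrt(static_cast<float>(x));

    // Newton-Raphson for f(y) = 1/y^2 - x:  y' = 0.5*y*(3 - x*y*y).
    // Each step roughly doubles the number of accurate bits.
    for (int i = 0; i < 2; ++i)
    {
        y = 0.5 * y * (3.0 - x * y * y);
    }
    return y;
}
```

An argument outside the float range would overflow or underflow in the conversion, and no amount of later refinement could recover from that.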

static Simd4Float gmx_simdcall gmx::max	(	Simd4Float	a,
		Simd4Float	b
	)

inlinestatic

Set each SIMD4 element to the largest from two variables.

Parameters

a	Any floating-point value
b	Any floating-point value

Returns: max(a,b) for each element.

static Simd4Double gmx_simdcall gmx::max	(	Simd4Double	a,
		Simd4Double	b
	)

inlinestatic

Set each SIMD4 element to the largest from two variables.

Parameters

a	Any floating-point value
b	Any floating-point value

Returns: max(a,b) for each element.

static SimdFloat gmx_simdcall gmx::max	(	SimdFloat	a,
		SimdFloat	b
	)

inlinestatic

Set each SIMD float element to the largest from two variables.

Parameters

a	Any floating-point value
b	Any floating-point value

Returns: max(a,b) for each element.

static SimdDouble gmx_simdcall gmx::max	(	SimdDouble	a,
		SimdDouble	b
	)

inlinestatic

Set each SIMD double element to the largest from two variables.

Parameters

a	Any floating-point value
b	Any floating-point value

Returns: max(a,b) for each element.

static Simd4Float gmx_simdcall gmx::min	(	Simd4Float	a,
		Simd4Float	b
	)

inlinestatic

Set each SIMD4 element to the smallest from two variables.

Parameters

a	Any floating-point value
b	Any floating-point value

Returns: min(a,b) for each element.

static Simd4Double gmx_simdcall gmx::min	(	Simd4Double	a,
		Simd4Double	b
	)

inlinestatic

Set each SIMD4 element to the smallest from two variables.

Parameters

a	Any floating-point value
b	Any floating-point value

Returns: min(a,b) for each element.

static SimdFloat gmx_simdcall gmx::min	(	SimdFloat	a,
		SimdFloat	b
	)

inlinestatic

Set each SIMD float element to the smallest from two variables.

Parameters

a	Any floating-point value
b	Any floating-point value

Returns: min(a,b) for each element.

static SimdDouble gmx_simdcall gmx::min	(	SimdDouble	a,
		SimdDouble	b
	)

inlinestatic

Set each SIMD double element to the smallest from two variables.

Parameters

a	Any floating-point value
b	Any floating-point value

Returns: min(a,b) for each element.

static SimdFloat gmx_simdcall gmx::norm2	(	SimdFloat	ax,
		SimdFloat	ay,
		SimdFloat	az
	)

inlinestatic

SIMD float norm squared of multiple vectors.

Parameters

ax	X components of vectors
ay	Y components of vectors
az	Z components of vectors

Returns: Element i will be res[i] = ax[i]*ax[i]+ay[i]*ay[i]+az[i]*az[i].

Note: This corresponds to the scalar product of the vector with itself, but the compiler might be able to optimize it better with identical vectors.

static SimdDouble gmx_simdcall gmx::norm2	(	SimdDouble	ax,
		SimdDouble	ay,
		SimdDouble	az
	)

inlinestatic

SIMD double norm squared of multiple vectors.

Parameters

ax	X components of vectors
ay	Y components of vectors
az	Z components of vectors

Returns: Element i will be res[i] = ax[i]*ax[i]+ay[i]*ay[i]+az[i]*az[i].

Note: This corresponds to the scalar product of the vector with itself, but the compiler might be able to optimize it better with identical vectors.

static Simd4Float gmx_simdcall gmx::norm2	(	Simd4Float	ax,
		Simd4Float	ay,
		Simd4Float	az
	)

inlinestatic

SIMD4 float norm squared of multiple vectors.

Parameters

ax	X components of vectors
ay	Y components of vectors
az	Z components of vectors

Returns: Element i will be res[i] = ax[i]*ax[i]+ay[i]*ay[i]+az[i]*az[i].

Note: This corresponds to the scalar product of the vector with itself, but the compiler might be able to optimize it better with identical vectors.

static Simd4Double gmx_simdcall gmx::norm2	(	Simd4Double	ax,
		Simd4Double	ay,
		Simd4Double	az
	)

inlinestatic

SIMD4 double norm squared of multiple vectors.

Parameters

ax	X components of vectors
ay	Y components of vectors
az	Z components of vectors

Returns: Element i will be res[i] = ax[i]*ax[i]+ay[i]*ay[i]+az[i]*az[i].

Note: This corresponds to the scalar product of the vector with itself, but the compiler might be able to optimize it better with identical vectors.
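The norm2 routines take the X, Y and Z components in separate variables, so that element i of each argument belongs to vector i. A scalar sketch of that layout, using the hypothetical names kLanes, Lanes and norm2Model and an assumed width of four lanes:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a 4-lane SIMD register.
constexpr std::size_t kLanes = 4;
using Lanes = std::array<float, kLanes>;

// Scalar model of norm2: lane i holds the squared length of vector i,
// res[i] = ax[i]*ax[i] + ay[i]*ay[i] + az[i]*az[i].
inline Lanes norm2Model(const Lanes& ax, const Lanes& ay, const Lanes& az)
{
    Lanes res{};
    for (std::size_t i = 0; i < kLanes; ++i)
    {
        res[i] = ax[i] * ax[i] + ay[i] * ay[i] + az[i] * az[i];
    }
    return res;
}
```

With this layout, four squared lengths are obtained with the same instruction count as a single one in scalar code.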

static Simd4FBool gmx_simdcall gmx::operator!=	(	Simd4Float	a,
		Simd4Float	b
	)

inlinestatic

a!=b for SIMD4 float

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a!=b.

static Simd4DBool gmx_simdcall gmx::operator!=	(	Simd4Double	a,
		Simd4Double	b
	)

inlinestatic

a!=b for SIMD4 double

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a!=b.

static SimdFBool gmx_simdcall gmx::operator!=	(	SimdFloat	a,
		SimdFloat	b
	)

inlinestatic

SIMD a!=b for single SIMD.

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a!=b.

Beware that exact floating-point comparisons are difficult.

static SimdDBool gmx_simdcall gmx::operator!=	(	SimdDouble	a,
		SimdDouble	b
	)

inlinestatic

SIMD a!=b for double SIMD.

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a!=b.

Beware that exact floating-point comparisons are difficult.

static Simd4Float gmx_simdcall gmx::operator&	(	Simd4Float	a,
		Simd4Float	b
	)

inlinestatic

Bitwise and for two SIMD4 float variables.

Supported if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: data1 & data2

static Simd4Double gmx_simdcall gmx::operator&	(	Simd4Double	a,
		Simd4Double	b
	)

inlinestatic

Bitwise and for two SIMD4 double variables.

Supported if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: data1 & data2

static SimdFloat gmx_simdcall gmx::operator&	(	SimdFloat	a,
		SimdFloat	b
	)

inlinestatic

Bitwise and for two SIMD float variables.

Supported if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: data1 & data2

static SimdDouble gmx_simdcall gmx::operator&	(	SimdDouble	a,
		SimdDouble	b
	)

inlinestatic

Bitwise and for two SIMD double variables.

Supported if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: data1 & data2
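A common use of a bitwise and on floating-point data is to clear the sign bit, which yields the absolute value. The scalar sketch below shows the idea for one IEEE-754 single precision value; andWithBits and absViaAnd are hypothetical helpers, and whether any GROMACS routine is implemented this way is not stated here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");

// Bitwise and of the bit pattern of a float with a 32-bit mask.
inline float andWithBits(float a, std::uint32_t bits)
{
    std::uint32_t ua;
    std::memcpy(&ua, &a, sizeof ua); // reinterpret the float as raw bits
    ua &= bits;
    float r;
    std::memcpy(&r, &ua, sizeof r);
    return r;
}

// Clearing the top (sign) bit leaves exponent and mantissa intact, i.e. |x|.
inline float absViaAnd(float x)
{
    return andWithBits(x, 0x7FFFFFFFu);
}
```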

static SimdFInt32 gmx_simdcall gmx::operator&	(	SimdFInt32	a,
		SimdFInt32	b
	)

inlinestatic

Integer SIMD bitwise and.

Available if GMX_SIMD_HAVE_FINT32_LOGICAL is 1.

Note: You can not use this operation directly to select based on a boolean SIMD variable, since booleans are separate from integer SIMD. If that is what you need, have a look at gmx::selectByMask instead.

Parameters

a	first integer SIMD
b	second integer SIMD

Returns: a & b (bitwise and)

static SimdDInt32 gmx_simdcall gmx::operator&	(	SimdDInt32	a,
		SimdDInt32	b
	)

inlinestatic

Integer SIMD bitwise and.

Available if GMX_SIMD_HAVE_DINT32_LOGICAL is 1.

Note: You can not use this operation directly to select based on a boolean SIMD variable, since booleans are separate from integer SIMD. If that is what you need, have a look at gmx::selectByMask instead.

Parameters

a	first integer SIMD
b	second integer SIMD

Returns: a & b (bitwise and)

static Simd4FBool gmx_simdcall gmx::operator&&	(	Simd4FBool	a,
		Simd4FBool	b
	)

inlinestatic

Logical and on single precision SIMD4 booleans.

Parameters

a	logical vars 1
b	logical vars 2

Returns: For each element, the result boolean is true if a & b are true.

Note: This is not necessarily a bitwise operation - the storage format of booleans is implementation-dependent.

static Simd4DBool gmx_simdcall gmx::operator&&	(	Simd4DBool	a,
		Simd4DBool	b
	)

inlinestatic

Logical and on double precision SIMD4 booleans.

Parameters

a	logical vars 1
b	logical vars 2

Returns: For each element, the result boolean is true if a & b are true.

Note: This is not necessarily a bitwise operation - the storage format of booleans is implementation-dependent.

static SimdFBool gmx_simdcall gmx::operator&&	(	SimdFBool	a,
		SimdFBool	b
	)

inlinestatic

Logical and on single precision SIMD booleans.

Parameters

a	logical vars 1
b	logical vars 2

Returns: For each element, the result boolean is true if a & b are true.

Note: This is not necessarily a bitwise operation - the storage format of booleans is implementation-dependent.

static SimdDBool gmx_simdcall gmx::operator&&	(	SimdDBool	a,
		SimdDBool	b
	)

inlinestatic

Logical and on double precision SIMD booleans.

Parameters

a	logical vars 1
b	logical vars 2

Returns: For each element, the result boolean is true if a & b are true.

Note: This is not necessarily a bitwise operation - the storage format of booleans is implementation-dependent.

static SimdFIBool gmx_simdcall gmx::operator&&	(	SimdFIBool	a,
		SimdFIBool	b
	)

inlinestatic

Logical AND on SimdFIBool.

Available if GMX_SIMD_HAVE_FINT32_ARITHMETICS is 1.

Parameters

a	SIMD boolean 1
b	SIMD boolean 2

Returns: True for elements where both a and b are true.

static SimdDIBool gmx_simdcall gmx::operator&&	(	SimdDIBool	a,
		SimdDIBool	b
	)

inlinestatic

Logical AND on SimdDIBool.

Available if GMX_SIMD_HAVE_DINT32_ARITHMETICS is 1.

Parameters

a	SIMD boolean 1
b	SIMD boolean 2

Returns: True for elements where both a and b are true.

static Simd4Float gmx_simdcall gmx::operator*	(	Simd4Float	a,
		Simd4Float	b
	)

inlinestatic

Multiply two SIMD4 variables.

Parameters

a	factor1
b	factor2

Returns: a*b.

static Simd4Double gmx_simdcall gmx::operator*	(	Simd4Double	a,
		Simd4Double	b
	)

inlinestatic

Multiply two SIMD4 variables.

Parameters

a	factor1
b	factor2

Returns: a*b.

static SimdFloat gmx_simdcall gmx::operator*	(	SimdFloat	a,
		SimdFloat	b
	)

inlinestatic

Multiply two float SIMD variables.

Parameters

a	factor1
b	factor2

Returns: a*b.

static SimdDouble gmx_simdcall gmx::operator*	(	SimdDouble	a,
		SimdDouble	b
	)

inlinestatic

Multiply two double SIMD variables.

Parameters

a	factor1
b	factor2

Returns: a*b.

static SimdFInt32 gmx_simdcall gmx::operator*	(	SimdFInt32	a,
		SimdFInt32	b
	)

inlinestatic

Multiply SIMD integers.

This routine is only available if GMX_SIMD_HAVE_FINT32_ARITHMETICS (single) or GMX_SIMD_HAVE_DINT32_ARITHMETICS (double) is 1.

Parameters

a	factor1
b	factor2

Returns: a*b.

Note: Only the low 32 bits are retained, so this can overflow.

static SimdDInt32 gmx_simdcall gmx::operator*	(	SimdDInt32	a,
		SimdDInt32	b
	)

inlinestatic

Multiply SIMD integers.

Available if GMX_SIMD_HAVE_DINT32_ARITHMETICS is 1.

Parameters

a	factor1
b	factor2

Returns: a*b.

Note: Only the low 32 bits are retained, so this can overflow.
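To make the overflow note concrete, the scalar sketch below keeps only the low 32 bits of a product, in the way the note describes for each SIMD element. The helper name mulLow32 is hypothetical. The final conversion back to a signed value assumes two's complement, which is guaranteed from C++20 and universal in practice before that.

```cpp
#include <cassert>
#include <cstdint>

// Multiply two 32-bit integers and keep only the low 32 bits of the result.
inline std::int32_t mulLow32(std::int32_t a, std::int32_t b)
{
    // Do the arithmetic in unsigned 64-bit so that wraparound is well defined.
    const std::uint64_t wide = static_cast<std::uint64_t>(static_cast<std::uint32_t>(a))
                               * static_cast<std::uint32_t>(b);
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(wide));
}
```

For example, 65536*65536 equals 2^32, whose low 32 bits are all zero, so the retained result is 0.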

static Simd4Double gmx_simdcall gmx::operator+	(	Simd4Double	a,
		Simd4Double	b
	)

inlinestatic

Add two double SIMD4 variables.

Parameters

a	term1
b	term2

Returns: a+b

static Simd4Float gmx_simdcall gmx::operator+	(	Simd4Float	a,
		Simd4Float	b
	)

inlinestatic

Add two float SIMD4 variables.

Parameters

a	term1
b	term2

Returns: a+b

static SimdFloat gmx_simdcall gmx::operator+	(	SimdFloat	a,
		SimdFloat	b
	)

inlinestatic

Add two float SIMD variables.

Parameters

a	term1
b	term2

Returns: a+b

static SimdDouble gmx_simdcall gmx::operator+	(	SimdDouble	a,
		SimdDouble	b
	)

inlinestatic

Add two double SIMD variables.

Parameters

a	term1
b	term2

Returns: a+b

static SimdFInt32 gmx_simdcall gmx::operator+	(	SimdFInt32	a,
		SimdFInt32	b
	)

inlinestatic

Add SIMD integers.

This routine is only available if GMX_SIMD_HAVE_FINT32_ARITHMETICS (single) or GMX_SIMD_HAVE_DINT32_ARITHMETICS (double) is 1.

Parameters

a	term1
b	term2

Returns: a+b

static SimdDInt32 gmx_simdcall gmx::operator+	(	SimdDInt32	a,
		SimdDInt32	b
	)

inlinestatic

Add SIMD integers.

Available if GMX_SIMD_HAVE_DINT32_ARITHMETICS is 1.

Parameters

a	term1
b	term2

Returns: a+b

static Simd4Double gmx_simdcall gmx::operator-	(	Simd4Double	a,
		Simd4Double	b
	)

inlinestatic

Subtract two SIMD4 variables.

Parameters

a	term1
b	term2

Returns: a-b

static Simd4Float gmx_simdcall gmx::operator-	(	Simd4Float	a,
		Simd4Float	b
	)

inlinestatic

Subtract two SIMD4 variables.

Parameters

a	term1
b	term2

Returns: a-b

static Simd4Double gmx_simdcall gmx::operator- ( Simd4Double a )

inlinestatic

SIMD4 floating-point negate.

Parameters

a	SIMD4 floating-point value

Returns: -a

static Simd4Float gmx_simdcall gmx::operator- ( Simd4Float a )

inlinestatic

SIMD4 floating-point negate.

Parameters

a	SIMD4 floating-point value

Returns: -a

static SimdFloat gmx_simdcall gmx::operator-	(	SimdFloat	a,
		SimdFloat	b
	)

inlinestatic

Subtract two float SIMD variables.

Parameters

a	term1
b	term2

Returns: a-b

static SimdDouble gmx_simdcall gmx::operator-	(	SimdDouble	a,
		SimdDouble	b
	)

inlinestatic

Subtract two double SIMD variables.

Parameters

a	term1
b	term2

Returns: a-b

static SimdFloat gmx_simdcall gmx::operator- ( SimdFloat a )

inlinestatic

SIMD single precision negate.

Parameters

a	SIMD single precision value

Returns: -a

static SimdDouble gmx_simdcall gmx::operator- ( SimdDouble a )

inlinestatic

SIMD double precision negate.

Parameters

a	SIMD double precision value

Returns: -a

static SimdFInt32 gmx_simdcall gmx::operator-	(	SimdFInt32	a,
		SimdFInt32	b
	)

inlinestatic

Subtract SIMD integers.

This routine is only available if GMX_SIMD_HAVE_FINT32_ARITHMETICS (single) or GMX_SIMD_HAVE_DINT32_ARITHMETICS (double) is 1.

Parameters

a	term1
b	term2

Returns: a-b

static SimdDInt32 gmx_simdcall gmx::operator-	(	SimdDInt32	a,
		SimdDInt32	b
	)

inlinestatic

Subtract SIMD integers.

Available if GMX_SIMD_HAVE_DINT32_ARITHMETICS is 1.

Parameters

a	term1
b	term2

Returns: a-b

static SimdFloat gmx_simdcall gmx::operator/	(	SimdFloat	nom,
		SimdFloat	denom
	)

inlinestatic

Division for SIMD floats.

Parameters

nom	Numerator
denom	Denominator, with magnitude in range (GMX_FLOAT_MIN,GMX_FLOAT_MAX). For single precision this is equivalent to a nonzero argument, but in double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: nom/denom

Note: This function does not use any masking to avoid problems with zero values in the denominator.

static SimdDouble gmx_simdcall gmx::operator/	(	SimdDouble	nom,
		SimdDouble	denom
	)

inlinestatic

Division for SIMD doubles.

Parameters

nom	Numerator
denom	Denominator, with magnitude in range (GMX_FLOAT_MIN,GMX_FLOAT_MAX). For single precision this is equivalent to a nonzero argument, but in double precision it adds an extra restriction since the first lookup step might have to be performed in single precision on some architectures. Note that the responsibility for checking falls on you - this routine does not check arguments.

Returns: nom/denom

Note: This function does not use any masking to avoid problems with zero values in the denominator.
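Because the division routines do not check their arguments, a caller that cannot guarantee the range may want to validate denominators in debug builds. A minimal scalar sketch of such a check follows; denominatorInRange is a hypothetical helper, and FLT_MIN/FLT_MAX from the standard library are used as stand-ins for GMX_FLOAT_MIN/GMX_FLOAT_MAX.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// True if the magnitude of denom lies strictly inside (FLT_MIN, FLT_MAX),
// which also excludes zero, infinities and NaN.
inline bool denominatorInRange(double denom)
{
    const double mag = std::fabs(denom);
    return mag > FLT_MIN && mag < FLT_MAX;
}
```

Note that a double such as 1e-300 is a perfectly valid nonzero number, yet it fails this check because it cannot be represented as a normalized single precision value.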

static Simd4FBool gmx_simdcall gmx::operator<	(	Simd4Float	a,
		Simd4Float	b
	)

inlinestatic

a<b for SIMD4 float

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a<b.

static Simd4DBool gmx_simdcall gmx::operator<	(	Simd4Double	a,
		Simd4Double	b
	)

inlinestatic

a<b for SIMD4 double

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a<b.

static SimdFBool gmx_simdcall gmx::operator<	(	SimdFloat	a,
		SimdFloat	b
	)

inlinestatic

SIMD a<b for single SIMD.

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a<b.

static SimdDBool gmx_simdcall gmx::operator<	(	SimdDouble	a,
		SimdDouble	b
	)

inlinestatic

SIMD a<b for double SIMD.

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a<b.

static SimdFIBool gmx_simdcall gmx::operator<	(	SimdFInt32	a,
		SimdFInt32	b
	)

inlinestatic

Less-than comparison of two SIMD integers corresponding to float values.

Available if GMX_SIMD_HAVE_FINT32_ARITHMETICS is 1.

Parameters

a	SIMD integer1
b	SIMD integer2

Returns: SIMD integer boolean with true for elements where a<b

static SimdDIBool gmx_simdcall gmx::operator<	(	SimdDInt32	a,
		SimdDInt32	b
	)

inlinestatic

Less-than comparison of two SIMD integers corresponding to double values.

Available if GMX_SIMD_HAVE_DINT32_ARITHMETICS is 1.

Parameters

a	SIMD integer1
b	SIMD integer2

Returns: SIMD integer boolean with true for elements where a<b

static Simd4FBool gmx_simdcall gmx::operator<=	(	Simd4Float	a,
		Simd4Float	b
	)

inlinestatic

a<=b for SIMD4 float.

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a<=b.

static Simd4DBool gmx_simdcall gmx::operator<=	(	Simd4Double	a,
		Simd4Double	b
	)

inlinestatic

a<=b for SIMD4 double.

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a<=b.

static SimdFBool gmx_simdcall gmx::operator<=	(	SimdFloat	a,
		SimdFloat	b
	)

inlinestatic

SIMD a<=b for single SIMD.

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a<=b.

static SimdDBool gmx_simdcall gmx::operator<=	(	SimdDouble	a,
		SimdDouble	b
	)

inlinestatic

SIMD a<=b for double SIMD.

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a<=b.

static Simd4FBool gmx_simdcall gmx::operator==	(	Simd4Float	a,
		Simd4Float	b
	)

inlinestatic

a==b for SIMD4 float

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a==b.

static Simd4DBool gmx_simdcall gmx::operator==	(	Simd4Double	a,
		Simd4Double	b
	)

inlinestatic

a==b for SIMD4 double

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a==b.

static SimdFBool gmx_simdcall gmx::operator==	(	SimdFloat	a,
		SimdFloat	b
	)

inlinestatic

SIMD a==b for single SIMD.

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a==b.

Beware that exact floating-point comparisons are difficult.

static SimdDBool gmx_simdcall gmx::operator==	(	SimdDouble	a,
		SimdDouble	b
	)

inlinestatic

SIMD a==b for double SIMD.

Parameters

a	value1
b	value2

Returns: Each element of the boolean will be set to true if a==b.

Beware that exact floating-point comparisons are difficult.
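The warning about exact comparisons can be illustrated in scalar code: two values that are mathematically equal often differ in the last bit after rounding, so a tolerance-based comparison is usually what is wanted. The helper nearlyEqual below is a hypothetical sketch, not a GROMACS function; in SIMD code the same idea can be expressed by comparing the absolute difference against a tolerance with operator<.

```cpp
#include <cassert>
#include <cmath>

// True if a and b differ by at most absTol, or by at most relTol relative
// to the larger magnitude of the two.
inline bool nearlyEqual(double a, double b, double relTol = 1e-12, double absTol = 0.0)
{
    const double diff = std::fabs(a - b);
    return diff <= absTol || diff <= relTol * std::fmax(std::fabs(a), std::fabs(b));
}
```

With IEEE-754 doubles, 0.1 + 0.2 and 0.3 are different bit patterns, so an exact comparison reports them as unequal while the tolerance-based one accepts them.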

static SimdFIBool gmx_simdcall gmx::operator==	(	SimdFInt32	a,
		SimdFInt32	b
	)

inlinestatic

Equality comparison of two integers corresponding to float values.

Available if GMX_SIMD_HAVE_FINT32_ARITHMETICS is 1.

Parameters

a	SIMD integer1
b	SIMD integer2

Returns: SIMD integer boolean with true for elements where a==b

static SimdDIBool gmx_simdcall gmx::operator==	(	SimdDInt32	a,
		SimdDInt32	b
	)

inlinestatic

Equality comparison of two integers corresponding to double values.

Available if GMX_SIMD_HAVE_DINT32_ARITHMETICS is 1.

Parameters

a	SIMD integer1
b	SIMD integer2

Returns: SIMD integer boolean with true for elements where a==b

static Simd4Double gmx_simdcall gmx::operator^	(	Simd4Double	a,
		Simd4Double	b
	)

inlinestatic

Bitwise xor for two SIMD4 double variables.

Available if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: data1 ^ data2

static Simd4Float gmx_simdcall gmx::operator^	(	Simd4Float	a,
		Simd4Float	b
	)

inlinestatic

Bitwise xor for two SIMD4 float variables.

Available if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: data1 ^ data2

static SimdFloat gmx_simdcall gmx::operator^	(	SimdFloat	a,
		SimdFloat	b
	)

inlinestatic

Bitwise xor for SIMD float.

Available if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: data1 ^ data2

static SimdDouble gmx_simdcall gmx::operator^	(	SimdDouble	a,
		SimdDouble	b
	)

inlinestatic

Bitwise xor for SIMD double.

Available if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: data1 ^ data2

static SimdFInt32 gmx_simdcall gmx::operator^	(	SimdFInt32	a,
		SimdFInt32	b
	)

inlinestatic

Integer SIMD bitwise xor.

Available if GMX_SIMD_HAVE_FINT32_LOGICAL is 1.

Parameters

a	first integer SIMD
b	second integer SIMD

Returns: a ^ b (bitwise xor)

static SimdDInt32 gmx_simdcall gmx::operator^	(	SimdDInt32	a,
		SimdDInt32	b
	)

inlinestatic

Integer SIMD bitwise xor.

Available if GMX_SIMD_HAVE_DINT32_LOGICAL is 1.

Parameters

a	first integer SIMD
b	second integer SIMD

Returns: a ^ b (bitwise xor)

static Simd4Double gmx_simdcall gmx::operator\|	(	Simd4Double	a,
		Simd4Double	b
	)

inlinestatic

Bitwise or for two SIMD4 doubles.

Available if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: data1 | data2

static Simd4Float gmx_simdcall gmx::operator\|	(	Simd4Float	a,
		Simd4Float	b
	)

inlinestatic

Bitwise or for two SIMD4 floats.

Available if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: data1 | data2

static SimdFloat gmx_simdcall gmx::operator\|	(	SimdFloat	a,
		SimdFloat	b
	)

inlinestatic

Bitwise or for SIMD float.

Available if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: data1 | data2

static SimdDouble gmx_simdcall gmx::operator\|	(	SimdDouble	a,
		SimdDouble	b
	)

inlinestatic

Bitwise or for SIMD double.

Available if GMX_SIMD_HAVE_LOGICAL is 1.

Parameters

a	data1
b	data2

Returns: data1 | data2

static SimdFInt32 gmx_simdcall gmx::operator\|	(	SimdFInt32	a,
		SimdFInt32	b
	)

inlinestatic

Integer SIMD bitwise or.

Available if GMX_SIMD_HAVE_FINT32_LOGICAL is 1.

Parameters

a	first integer SIMD
b	second integer SIMD

Returns: a | b (bitwise or)

static SimdDInt32 gmx_simdcall gmx::operator\|	(	SimdDInt32	a,
		SimdDInt32	b
	)

inlinestatic

Integer SIMD bitwise or.

Available if GMX_SIMD_HAVE_DINT32_LOGICAL is 1.

Parameters

a	first integer SIMD
b	second integer SIMD

Returns: a | b (bitwise or)

static Simd4FBool gmx_simdcall gmx::operator\|\|	(	Simd4FBool	a,
		Simd4FBool	b
	)

inlinestatic

Logical or on single precision SIMD4 booleans.

Parameters

a	logical vars 1
b	logical vars 2

Returns: For each element, the result boolean is true if a or b is true.

Note that this is not necessarily a bitwise operation - the storage format of booleans is implementation-dependent.

static Simd4DBool gmx_simdcall gmx::operator\|\|	(	Simd4DBool	a,
		Simd4DBool	b
	)

inlinestatic

Logical or on double precision SIMD4 booleans.

Parameters

a	logical vars 1
b	logical vars 2

Returns: For each element, the result boolean is true if a or b is true.

Note that this is not necessarily a bitwise operation - the storage format of booleans is implementation-dependent.

static SimdFBool gmx_simdcall gmx::operator\|\|	(	SimdFBool	a,
		SimdFBool	b
	)

inlinestatic

Logical or on single precision SIMD booleans.

Parameters

a	logical vars 1
b	logical vars 2

Returns: For each element, the result boolean is true if a or b is true.

Note that this is not necessarily a bitwise operation - the storage format of booleans is implementation-dependent.

static SimdDBool gmx_simdcall gmx::operator\|\|	(	SimdDBool	a,
		SimdDBool	b
	)

inlinestatic

Logical or on double precision SIMD booleans.

Parameters

a	logical vars 1
b	logical vars 2

Returns: For each element, the result boolean is true if a or b is true.

Note that this is not necessarily a bitwise operation - the storage format of booleans is implementation-dependent.

static SimdFIBool gmx_simdcall gmx::operator\|\|	(	SimdFIBool	a,
		SimdFIBool	b
	)

inlinestatic

Logical OR on SimdFIBool.

Available if GMX_SIMD_HAVE_FINT32_ARITHMETICS is 1.

Parameters

a	SIMD boolean 1
b	SIMD boolean 2

Returns: True for elements where either a or b is true.

static SimdDIBool gmx_simdcall gmx::operator\|\|	(	SimdDIBool	a,
		SimdDIBool	b
	)

inlinestatic

Logical OR on SimdDIBool.

Available if GMX_SIMD_HAVE_DINT32_ARITHMETICS is 1.

Parameters

a	SIMD boolean 1
b	SIMD boolean 2

Returns: True for elements where either a or b is true.

static SimdFloat gmx_simdcall gmx::pmeForceCorrection ( SimdFloat z2 )

inlinestatic

Calculate the force correction due to PME analytically in SIMD float.

Parameters

z2	$(r \beta)^2$ - see below for details.

Returns: Correction factor to coulomb force - see below for details.

This routine is meant to enable analytical evaluation of the direct-space PME electrostatic force to avoid tables.

The direct-space potential should be $\mbox{erfc}(\beta r)/r$ , but there are some problems evaluating that:

First, the error function is difficult (read: expensive) to approximate accurately for intermediate to large arguments, and this happens already in ranges of $(\beta r)$ that occur in simulations. Second, we now try to avoid calculating potentials in GROMACS but use forces directly.

We can simplify things slightly by noting that the PME part is really a correction to the normal Coulomb force since $\mbox{erfc}(z)=1-\mbox{erf}(z)$ , i.e.

$V = \frac{1}{r} - \frac{\mbox{erf}(\beta r)}{r}$

The first term we already have from the inverse square root, so we can leave it out of this routine.

For PME tolerances of 1e-3 to 1e-8 and cutoffs of 0.5nm to 1.8nm, the argument $\beta r$ will be in the range 0.15 to ~4, which is the range used for the minimax fit. Use your favorite plotting program to realize how well-behaved $\frac{\mbox{erf}(z)}{z}$ is in this range!

We approximate $f(z)=\mbox{erf}(z)/z$ with a rational minimax polynomial. However, it turns out it is more efficient to approximate $f(z)/z$ and then only use even powers. This is another minor optimization, since we actually want $f(z)/z$, because it is going to be multiplied by the vector between the two atoms to get the vectorial force. The fastest flops are the ones we can avoid calculating!

So, here's how it should be used:

1. Calculate $r^2$.
2. Multiply by $\beta^2$, so you get $z^2=(\beta r)^2$.
3. Evaluate this routine with $z^2$ as the argument.
4. The return value is the expression:

$\frac{2 \exp(-z^2)}{\sqrt{\pi} z^2}-\frac{\mbox{erf}(z)}{z^3}$

5. Multiply the entire expression by $\beta^3$. This will get you

$\frac{2 \beta^3 \exp(-z^2)}{\sqrt{\pi} z^2} - \frac{\beta^3 \mbox{erf}(z)}{z^3}$

or, switching back to $r$ (since $z=r \beta$):

$\frac{2 \beta \exp(-r^2 \beta^2)}{\sqrt{\pi} r^2} - \frac{\mbox{erf}(r \beta)}{r^3}$

With a bit of math exercise you should be able to confirm that this is exactly

$\frac{\frac{d}{dr}\left( \frac{\mbox{erf}(\beta r)}{r} \right)}{r}$

6. Add the result to $r^{-3}$, multiply by the product of the charges, and you have your force (divided by $r$). A final multiplication with the vector connecting the two particles and you have your vectorial force to add to the particles.

This approximation achieves an error slightly lower than 1e-6 in single precision and 1e-11 in double precision for arguments smaller than 16 ( $\beta r \leq 4$ ); when added to $1/r$ the error will be insignificant. For $\beta r \geq 7206$ the return value can be inf or NaN.
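The usage recipe for this routine can be cross-checked in scalar code. The sketch below evaluates the returned expression exactly with the standard library instead of the minimax fit, and then follows the recipe to obtain the force divided by $r$. All function names (pmeForceCorrectionRef, ewaldForceOverR, ewaldPotential) are hypothetical, and the code is a reference for the mathematics only, not the GROMACS implementation.

```cpp
#include <cassert>
#include <cmath>

// Exact value of the expression the routine approximates,
// 2*exp(-z^2)/(sqrt(pi)*z^2) - erf(z)/z^3, evaluated from z2 = z^2 > 0.
inline double pmeForceCorrectionRef(double z2)
{
    assert(z2 > 0.0);
    const double sqrtPi = std::sqrt(3.14159265358979323846);
    const double z      = std::sqrt(z2);
    return 2.0 * std::exp(-z2) / (sqrtPi * z2) - std::erf(z) / (z * z2);
}

// Direct-space force divided by r, for charge product qq.
inline double ewaldForceOverR(double r, double beta, double qq)
{
    const double r2    = r * r;                      // calculate r^2
    const double z2    = beta * beta * r2;           // z^2 = (beta*r)^2
    const double corr  = pmeForceCorrectionRef(z2);  // evaluate the correction
    const double corrR = beta * beta * beta * corr;  // multiply by beta^3
    return qq * (1.0 / (r2 * r) + corrR);            // add r^-3, scale by charges
}

// Direct-space potential qq*erfc(beta*r)/r, used only to cross-check the force.
inline double ewaldPotential(double r, double beta, double qq)
{
    return qq * std::erfc(beta * r) / r;
}
```

A central finite difference of ewaldPotential, negated and divided by $r$, should agree with ewaldForceOverR to within the truncation error of the difference scheme.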

static SimdDouble gmx_simdcall gmx::pmeForceCorrection ( SimdDouble z2 )

inlinestatic

Calculate the force correction due to PME analytically in SIMD double.

Parameters

z2	This should be the value $(r \beta)^2$ , where r is your interaction distance and $\beta$ is the Ewald splitting parameter.

Returns: Correction factor to coulomb force.

This routine is meant to enable analytical evaluation of the direct-space PME electrostatic force to avoid tables. For details, see the single precision function.

static SimdDouble gmx_simdcall gmx::pmeForceCorrectionSingleAccuracy ( SimdDouble z2 )

inlinestatic

Analytical PME force correction, double SIMD data, single accuracy.

Parameters

z2	$(r \beta)^2$ - see below for details.

Returns: Correction factor to coulomb force - see below for details.

This routine is meant to enable analytical evaluation of the direct-space PME electrostatic force to avoid tables.

The direct-space potential should be $\mbox{erfc}(\beta r)/r$ , but there are some problems evaluating that:

First, the error function is difficult (read: expensive) to approximate accurately for intermediate to large arguments, and this happens already in ranges of $(\beta r)$ that occur in simulations. Second, we now try to avoid calculating potentials in GROMACS but use forces directly.

We can simplify things slightly by noting that the PME part is really a correction to the normal Coulomb force since $\mbox{erfc}(z)=1-\mbox{erf}(z)$ , i.e.

$V = \frac{1}{r} - \frac{\mbox{erf}(\beta r)}{r}$

The first term we already have from the inverse square root, so we can leave it out of this routine.

For PME tolerances of 1e-3 to 1e-8 and cutoffs of 0.5nm to 1.8nm, the argument $\beta r$ will be in the range 0.15 to ~4. Use your favorite plotting program to realize how well-behaved $\frac{\mbox{erf}(z)}{z}$ is in this range!

We approximate $f(z)=\mbox{erf}(z)/z$ with a rational minimax polynomial. However, it turns out it is more efficient to approximate $f(z)/z$ and then only use even powers. This is another minor optimization, since we actually want $f(z)/z$, because it is going to be multiplied by the vector between the two atoms to get the vectorial force. The fastest flops are the ones we can avoid calculating!

So, here's how it should be used:

1. Calculate $r^2$.
2. Multiply by $\beta^2$, so you get $z^2=(\beta r)^2$.
3. Evaluate this routine with $z^2$ as the argument.
4. The return value is the expression:

$\frac{2 \exp(-z^2)}{\sqrt{\pi} z^2}-\frac{\mbox{erf}(z)}{z^3}$

5. Multiply the entire expression by $\beta^3$. This will get you

$\frac{2 \beta^3 \exp(-z^2)}{\sqrt{\pi} z^2} - \frac{\beta^3 \mbox{erf}(z)}{z^3}$

or, switching back to $r$ (since $z=r \beta$):

$\frac{2 \beta \exp(-r^2 \beta^2)}{\sqrt{\pi} r^2} - \frac{\mbox{erf}(r \beta)}{r^3}$

With a bit of math exercise you should be able to confirm that this is exactly

$\frac{\frac{d}{dr}\left( \frac{\mbox{erf}(\beta r)}{r} \right)}{r}$

6. Add the result to $r^{-3}$, multiply by the product of the charges, and you have your force (divided by $r$). A final multiplication with the vector connecting the two particles and you have your vectorial force to add to the particles.

This approximation achieves an error slightly lower than 1e-6; when added to $1/r$ the error will be insignificant.

static SimdFloat gmx_simdcall gmx::pmeForceCorrectionSingleAccuracy ( SimdFloat z2 )

inlinestatic

SIMD Analytic PME force correction, only targeting single accuracy.

Parameters

z2	$(r \beta)^2$ - see default single precision version for details.

Returns: Correction factor to coulomb force.

static SimdFloat gmx_simdcall gmx::pmePotentialCorrection ( SimdFloat z2 )

inlinestatic

Calculate the potential correction due to PME analytically in SIMD float.

Parameters

z2	$(r \beta)^2$ - see below for details.

Returns: Correction factor to Coulomb potential - see below for details.

See pmeForceCorrection for details about the approximation.

This routine calculates $\mbox{erf}(z)/z$ , although you should provide $z^2$ as the input argument.

Here's how it should be used:

1. Calculate $r^2$.
2. Multiply by $\beta^2$, so you get $z^2=\beta^2 r^2$.
3. Evaluate this routine with $z^2$ as the argument.
4. The return value is the expression:

$\frac{\mbox{erf}(z)}{z}$

5. Multiply the entire expression by $\beta$ and switch back to $r$ (since $z=r \beta$):

$\frac{\mbox{erf}(r \beta)}{r}$

6. Subtract the result from $1/r$, multiply by the product of the charges, and you have your potential.

This approximation achieves an error slightly lower than 1e-6 in single precision and 4e-11 in double precision for arguments smaller than 16 ( $0.15 \leq \beta r \leq 4$ ); for $\beta r \leq 0.15$ the error can be twice as high; when added to $1/r$ the error will be insignificant. For $\beta r \geq 7142$ the return value can be inf or NaN.
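
As a cross-check of the usage described above, here is a scalar sketch that uses std::erf in place of the polynomial approximation. It is not the GROMACS code; pmePotentialCorrectionRef and ewaldPotential are hypothetical names, and z2 > 0 is assumed.

```cpp
#include <cmath>

// Scalar reference for erf(z)/z, taking z^2 as input like the SIMD routine.
// Hypothetical helper, valid for z2 > 0 only.
inline double pmePotentialCorrectionRef(double z2)
{
    const double z = std::sqrt(z2);
    return std::erf(z) / z;
}

// Direct-space Ewald potential for charge product qq,
// squared distance r2 and splitting parameter beta.
inline double ewaldPotential(double qq, double r2, double beta)
{
    const double z2   = beta * beta * r2;                       // z^2 = beta^2 r^2
    const double corr = beta * pmePotentialCorrectionRef(z2);   // erf(beta r) / r
    const double rinv = 1.0 / std::sqrt(r2);
    return qq * (rinv - corr);                                  // qq * erfc(beta r) / r
}
```

The subtraction 1/r - erf(beta r)/r is simply erfc(beta r)/r, which is the quantity a tabulated direct-space kernel would otherwise look up.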

static SimdDouble gmx_simdcall gmx::pmePotentialCorrection ( SimdDouble z2 )

inlinestatic

Calculate the potential correction due to PME analytically in SIMD double.

Parameters

z2	This should be the value $(r \beta)^2$ , where $r$ is your interaction distance and $\beta$ the Ewald splitting parameter.

Returns: Correction factor to Coulomb potential.

This routine is meant to enable analytical evaluation of the direct-space PME electrostatic potential to avoid tables. For details, see the single precision function.

static SimdDouble gmx_simdcall gmx::pmePotentialCorrectionSingleAccuracy ( SimdDouble z2 )

inlinestatic

Analytical PME potential correction, double SIMD data, single accuracy.

Parameters

z2	$(r \beta)^2$ - see below for details.

Returns: Correction factor to Coulomb potential - see below for details.

This routine calculates $\mbox{erf}(z)/z$ , although you should provide $z^2$ as the input argument.

Here's how it should be used:

1. Calculate $r^2$.
2. Multiply by $\beta^2$, so you get $z^2=\beta^2 r^2$.
3. Evaluate this routine with $z^2$ as the argument.
4. The return value is the expression:

$\frac{\mbox{erf}(z)}{z}$

5. Multiply the entire expression by $\beta$ and switch back to $r$ (since $z=r \beta$):

$\frac{\mbox{erf}(r \beta)}{r}$

6. Subtract the result from $1/r$, multiply by the product of the charges, and you have your potential.

This approximation achieves an error slightly lower than 1e-6; when added to $1/r$ the error will be insignificant.

static SimdFloat gmx_simdcall gmx::pmePotentialCorrectionSingleAccuracy ( SimdFloat z2 )

inlinestatic

SIMD Analytic PME potential correction, only targeting single accuracy.

Parameters

z2	$(r \beta)^2$ - see default single precision version for details.

Returns: Correction factor to Coulomb potential.

template<MathOptimization opt = MathOptimization::Safe>

static SimdFloat gmx_simdcall gmx::pow	(	SimdFloat	x,
		SimdFloat	y
	)

inlinestatic

SIMD float pow(x,y)

This returns x^y for SIMD values.

Template Parameters

opt	If this is changed from the default (safe) into the unsafe option, there are no guarantees about correct results for x==0.

Parameters

x	Base.
y	Exponent.

Returns: x^y. Overflowing arguments are likely to either return 0 or inf, depending on the underlying implementation. If unsafe optimizations are enabled, this is also true for x==0.

Warning: You cannot rely on this implementation returning inf for arguments that cause overflow. If you have some very large values and need to rely on getting a valid numerical output, take the minimum of your variable and the largest valid argument before calling this routine.
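
The clamping advice in the warning can be illustrated with a scalar stand-in. In this sketch std::pow and std::min take the place of the SIMD routines, and both the helper name clampedPow and the bound maxBase are hypothetical; the caller has to choose a bound for which the result stays representable.

```cpp
#include <algorithm>
#include <cmath>

// Clamp the base to a caller-supplied bound before raising it to a power,
// so that very large inputs cannot overflow. std::pow stands in for the
// SIMD routine; maxBase is a hypothetical, application-specific limit.
inline float clampedPow(float x, float y, float maxBase)
{
    return std::pow(std::min(x, maxBase), y);
}
```

With SIMD data the same idea would be an element-wise minimum applied to the argument before the call.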

template<MathOptimization opt = MathOptimization::Safe>

static SimdDouble gmx_simdcall gmx::pow	(	SimdDouble	x,
		SimdDouble	y
	)

inlinestatic

SIMD double pow(x,y)

This returns x^y for SIMD values.

Template Parameters

opt	If this is changed from the default (safe) into the unsafe option, there are no guarantees about correct results for x==0.

Parameters

x	Base.
y	Exponent.

Returns: x^y. Overflowing arguments are likely to either return 0 or inf, depending on the underlying implementation. If unsafe optimizations are enabled, this is also true for x==0.

Warning: You cannot rely on this implementation returning inf for arguments that cause overflow. If you have some very large values and need to rely on getting a valid numerical output, take the minimum of your variable and the largest valid argument before calling this routine.

template<MathOptimization opt = MathOptimization::Safe>

static SimdDouble gmx_simdcall gmx::powSingleAccuracy	(	SimdDouble	x,
		SimdDouble	y
	)

inlinestatic

SIMD pow(x,y). Double precision SIMD data, single accuracy.

This returns x^y for SIMD values.

Template Parameters

opt	If this is changed from the default (safe) into the unsafe option, there are no guarantees about correct results for x==0.

Parameters

x	Base.
y	Exponent.

Returns: x^y. Overflowing arguments are likely to either return 0 or inf, depending on the underlying implementation. If unsafe optimizations are enabled, this is also true for x==0.

Warning: You cannot rely on this implementation returning inf for arguments that cause overflow. If you have some very large values and need to rely on getting a valid numerical output, take the minimum of your variable and the largest valid argument before calling this routine.

template<MathOptimization opt = MathOptimization::Safe>

static SimdFloat gmx_simdcall gmx::powSingleAccuracy	(	SimdFloat	x,
		SimdFloat	y
	)

inlinestatic

SIMD pow(x,y), only targeting single accuracy.

This returns x^y for SIMD values.

Template Parameters

opt	If this is changed from the default (safe) into the unsafe option, there are no guarantees about correct results for x==0.

Parameters

x	Base.
y	Exponent.

Returns: x^y. Overflowing arguments are likely to either return 0 or inf, depending on the underlying implementation. If unsafe optimizations are enabled, this is also true for x==0.

Warning: You cannot rely on this implementation returning inf for arguments that cause overflow. If you have some very large values and need to rely on getting a valid numerical output, take the minimum of your variable and the largest valid argument before calling this routine.

static SimdFloat gmx_simdcall gmx::rcp ( SimdFloat x )

inlinestatic

SIMD float 1.0/x lookup.

This is a low-level instruction that should only be called from routines implementing the reciprocal in simd_math.h.

Parameters

x	Argument, x!=0

Returns: Approximation of 1/x, accuracy is GMX_SIMD_RCP_BITS.

static SimdDouble gmx_simdcall gmx::rcp ( SimdDouble x )

inlinestatic

SIMD double 1.0/x lookup.

This is a low-level instruction that should only be called from routines implementing the reciprocal in simd_math.h.

Parameters

x	Argument, x!=0

Returns: Approximation of 1/x, accuracy is GMX_SIMD_RCP_BITS.

static SimdFloat gmx_simdcall gmx::rcpIter	(	SimdFloat	lu,
		SimdFloat	x
	)

inlinestatic

Perform one Newton-Raphson iteration to improve 1/x for SIMD float.

This is a low-level routine that should only be used by SIMD math routines that evaluate the reciprocal.

Parameters

lu	Approximation of 1/x, typically obtained from lookup.
x	The reference (starting) value x for which we want 1/x.

Returns: An improved approximation with roughly twice as many bits of accuracy.

static SimdDouble gmx_simdcall gmx::rcpIter	(	SimdDouble	lu,
		SimdDouble	x
	)

inlinestatic

Perform one Newton-Raphson iteration to improve 1/x for SIMD double.

This is a low-level routine that should only be used by SIMD math routines that evaluate the reciprocal.

Parameters

lu	Approximation of 1/x, typically obtained from lookup.
x	The reference (starting) value x for which we want 1/x.

Returns: An improved approximation with roughly twice as many bits of accuracy.

static float gmx_simdcall gmx::reduce ( Simd4Float a )

inlinestatic

Return sum of all elements in SIMD4 float variable.

Parameters

a	SIMD4 variable to reduce/sum.

Returns: The sum of all elements in the argument variable.

static double gmx_simdcall gmx::reduce ( Simd4Double a )

inlinestatic

Return sum of all elements in SIMD4 double variable.

Parameters

a	SIMD4 variable to reduce/sum.

Returns: The sum of all elements in the argument variable.

static float gmx_simdcall gmx::reduce ( SimdFloat a )

inlinestatic

Return sum of all elements in SIMD float variable.

Parameters

a	SIMD variable to reduce/sum.

Returns: The sum of all elements in the argument variable.

static double gmx_simdcall gmx::reduce ( SimdDouble a )

inlinestatic

Return sum of all elements in SIMD double variable.

Parameters

a	SIMD variable to reduce/sum.

Returns: The sum of all elements in the argument variable.

static double gmx_simdcall gmx::reduceIncr4ReturnSum	(	double *	m,
		SimdDouble	v0,
		SimdDouble	v1,
		SimdDouble	v2,
		SimdDouble	v3
	)

inlinestatic

Reduce each of four SIMD doubles, add those values to four consecutive doubles in memory, return sum.

Parameters

m	Pointer to memory where four doubles should be incremented
v0	SIMD variable whose sum should be added to m[0]
v1	SIMD variable whose sum should be added to m[1]
v2	SIMD variable whose sum should be added to m[2]
v3	SIMD variable whose sum should be added to m[3]

Returns: Sum of all elements in the four SIMD variables.

The pointer m must be aligned to the smaller of four elements and the floating-point SIMD width.

Note: This is a special routine intended for the GROMACS nonbonded kernels. It is used in the epilogue of the outer loop, where the variables will contain unrolled forces for one outer-loop particle each, corresponding to a single coordinate (for example, four x-coordinate force variables). These should be summed and added to the force array in memory. Since we always work with contiguous SIMD layout, we can use efficient aligned loads/stores. When calculating the virial, we also need the total sum of all forces for each coordinate. This is provided as the return value. For routines that do not need it, this extra code will be optimized away completely if you just ignore the return value (checked with gcc-4.9.1 and clang-3.6 for AVX).
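
The semantics can be written down as a scalar reference. In this sketch std::array stands in for a SIMD variable, the width of 4 is an arbitrary assumption, and the names Vec, c_width and reduceIncr4ReturnSumRef are hypothetical; a real implementation would use shuffles and a single aligned load/store instead of loops.

```cpp
#include <array>
#include <cstddef>
#include <numeric>

constexpr std::size_t c_width = 4;       // assumed SIMD width for this sketch
using Vec = std::array<double, c_width>; // stand-in for a SIMD variable

// Reduce each of the four inputs, add reduction i to m[i],
// and return the sum over all four inputs.
inline double reduceIncr4ReturnSumRef(double* m, const Vec& v0, const Vec& v1, const Vec& v2, const Vec& v3)
{
    const Vec* v[4]  = { &v0, &v1, &v2, &v3 };
    double     total = 0.0;
    for (int i = 0; i < 4; i++)
    {
        const double s = std::accumulate(v[i]->begin(), v[i]->end(), 0.0);
        m[i] += s;   // increment the value already in memory
        total += s;  // running total, e.g. for the virial
    }
    return total;
}
```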

static float gmx_simdcall gmx::reduceIncr4ReturnSum	(	float *	m,
		SimdFloat	v0,
		SimdFloat	v1,
		SimdFloat	v2,
		SimdFloat	v3
	)

inlinestatic

Reduce each of four SIMD floats, add those values to four consecutive floats in memory, return sum.

Parameters

m	Pointer to memory where four floats should be incremented
v0	SIMD variable whose sum should be added to m[0]
v1	SIMD variable whose sum should be added to m[1]
v2	SIMD variable whose sum should be added to m[2]
v3	SIMD variable whose sum should be added to m[3]

Returns: Sum of all elements in the four SIMD variables.

The pointer m must be aligned to the smaller of four elements and the floating-point SIMD width.

Note: This is a special routine intended for the GROMACS nonbonded kernels. It is used in the epilogue of the outer loop, where the variables will contain unrolled forces for one outer-loop particle each, corresponding to a single coordinate (for example, four x-coordinate force variables). These should be summed and added to the force array in memory. Since we always work with contiguous SIMD layout, we can use efficient aligned loads/stores. When calculating the virial, we also need the total sum of all forces for each coordinate. This is provided as the return value. For routines that do not need it, this extra code will be optimized away completely if you just ignore the return value (checked with gcc-4.9.1 and clang-3.6 for AVX).

static double gmx_simdcall gmx::reduceIncr4ReturnSumHsimd	(	double *	m,
		SimdDouble	v0,
		SimdDouble	v1
	)

inlinestatic

Reduce the 4 half-SIMD-width doubles in 2 SIMD variables (sum halves), increment four consecutive doubles in memory, return sum.

Parameters

m	Pointer to memory where the four values should be incremented
v0	Variable whose half-SIMD sums should be added to m[0]/m[1], respectively.
v1	Variable whose half-SIMD sums should be added to m[2]/m[3], respectively.

Returns: Sum of all elements in the two SIMD variables.

The pointer m must be aligned, but only to the smaller of four elements and the floating-point SIMD width.

Note: This is the half-SIMD-width version of reduceIncr4ReturnSum(). The only difference is that the four half-SIMD inputs needed are present in the low/high halves of the two SIMD arguments.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_DOUBLE is 1.
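
A scalar reference may help clarify which half goes where. The sketch assumes that the low half of each input maps to the lower memory index and the high half to the next one, uses an arbitrary full width of 8, and introduces the hypothetical names HVec, c_hsimdWidth and reduceIncr4ReturnSumHsimdRef.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t c_hsimdWidth = 8;        // assumed full SIMD width
using HVec = std::array<double, c_hsimdWidth>; // stand-in for a SIMD variable

// Sum the low and high halves of v0 and v1 separately, add the four
// half-sums to m[0..3] (v0 low, v0 high, v1 low, v1 high), return the total.
inline double reduceIncr4ReturnSumHsimdRef(double* m, const HVec& v0, const HVec& v1)
{
    const std::size_t half  = c_hsimdWidth / 2;
    const HVec*       v[2]  = { &v0, &v1 };
    double            total = 0.0;
    for (std::size_t i = 0; i < 2; i++)
    {
        for (std::size_t h = 0; h < 2; h++)
        {
            double s = 0.0;
            for (std::size_t j = 0; j < half; j++)
            {
                s += (*v[i])[h * half + j];
            }
            m[2 * i + h] += s;
            total += s;
        }
    }
    return total;
}
```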

static float gmx_simdcall gmx::reduceIncr4ReturnSumHsimd	(	float *	m,
		SimdFloat	v0,
		SimdFloat	v1
	)

inlinestatic

Reduce the 4 half-SIMD-width floats in 2 SIMD variables (sum halves), increment four consecutive floats in memory, return sum.

Parameters

m	Pointer to memory where the four values should be incremented
v0	Variable whose half-SIMD sums should be added to m[0]/m[1], respectively.
v1	Variable whose half-SIMD sums should be added to m[2]/m[3], respectively.

Returns: Sum of all elements in the two SIMD variables.

The pointer m must be aligned, but only to the smaller of four elements and the floating-point SIMD width.

Note: This is the half-SIMD-width version of reduceIncr4ReturnSum(). The only difference is that the four half-SIMD inputs needed are present in the low/high halves of the two SIMD arguments.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_FLOAT is 1.

static Simd4Float gmx_simdcall gmx::round ( Simd4Float a )

inlinestatic

SIMD4 Round to nearest integer value (in floating-point format).

Parameters

a	Any floating-point value

Returns: The nearest integer, represented in floating-point format.

static Simd4Double gmx_simdcall gmx::round ( Simd4Double a )

inlinestatic

SIMD4 Round to nearest integer value (in floating-point format).

Parameters

a	Any floating-point value

Returns: The nearest integer, represented in floating-point format.

static SimdFloat gmx_simdcall gmx::round ( SimdFloat a )

inlinestatic

SIMD float round to nearest integer value (in floating-point format).

Parameters

a	Any floating-point value

Returns: The nearest integer, represented in floating-point format.

Note: Round mode is implementation defined. The only guarantee is that it is consistent between rounding functions (round, cvtR2I).
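
The practical consequence is that code should not depend on how exact halfway cases are rounded. The scalar sketch below uses two standard library functions with different tie-breaking rules to show how results can differ; the wrapper names are hypothetical and nothing here says which rule a given SIMD implementation follows.

```cpp
#include <cmath>

// Ties are rounded away from zero.
inline double roundHalfAway(double x)
{
    return std::round(x);
}

// Ties follow the current floating-point rounding mode; with the default
// round-to-nearest mode, ties go to the nearest even integer.
inline double roundCurrentMode(double x)
{
    return std::nearbyint(x);
}
```

For 2.5 the first function returns 3.0 while the second returns 2.0 under the default rounding mode; for non-halfway values such as 2.4 both agree.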

static SimdDouble gmx_simdcall gmx::round ( SimdDouble a )

inlinestatic

SIMD double round to nearest integer value (in floating-point format).

Parameters

a	Any floating-point value

Returns: The nearest integer, represented in floating-point format.

Note: Round mode is implementation defined. The only guarantee is that it is consistent between rounding functions (round, cvtR2I).

static Simd4Double gmx_simdcall gmx::rsqrt ( Simd4Double x )

inlinestatic

SIMD4 1.0/sqrt(x) lookup.

This is a low-level instruction that should only be called from routines implementing the inverse square root in simd_math.h.

Parameters

x	Argument, x>0

Returns: Approximation of 1/sqrt(x), accuracy is GMX_SIMD_RSQRT_BITS.

static Simd4Float gmx_simdcall gmx::rsqrt ( Simd4Float x )

inlinestatic

SIMD4 1.0/sqrt(x) lookup.

This is a low-level instruction that should only be called from routines implementing the inverse square root in simd_math.h.

Parameters

x	Argument, x>0

Returns: Approximation of 1/sqrt(x), accuracy is GMX_SIMD_RSQRT_BITS.

static SimdFloat gmx_simdcall gmx::rsqrt ( SimdFloat x )

inlinestatic

SIMD float 1.0/sqrt(x) lookup.

This is a low-level instruction that should only be called from routines implementing the inverse square root in simd_math.h.

Parameters

x	Argument, x>0

Returns: Approximation of 1/sqrt(x), accuracy is GMX_SIMD_RSQRT_BITS.

static SimdDouble gmx_simdcall gmx::rsqrt ( SimdDouble x )

inlinestatic

SIMD double 1.0/sqrt(x) lookup.

This is a low-level instruction that should only be called from routines implementing the inverse square root in simd_math.h.

Parameters

x	Argument, x>0

Returns: Approximation of 1/sqrt(x), accuracy is GMX_SIMD_RSQRT_BITS.

static SimdFloat gmx_simdcall gmx::rsqrtIter	(	SimdFloat	lu,
		SimdFloat	x
	)

inlinestatic

Perform one Newton-Raphson iteration to improve 1/sqrt(x) for SIMD float.

This is a low-level routine that should only be used by SIMD math routines that evaluate the inverse square root.

Parameters

lu	Approximation of 1/sqrt(x), typically obtained from lookup.
x	The reference (starting) value x for which we want 1/sqrt(x).

Returns: An improved approximation with roughly twice as many bits of accuracy.

static SimdDouble gmx_simdcall gmx::rsqrtIter	(	SimdDouble	lu,
		SimdDouble	x
	)

inlinestatic

Perform one Newton-Raphson iteration to improve 1/sqrt(x) for SIMD double.

This is a low-level routine that should only be used by SIMD math routines that evaluate the inverse square root.

Parameters

lu	Approximation of 1/sqrt(x), typically obtained from lookup.
x	The reference (starting) value x for which we want 1/sqrt(x).

Returns: An improved approximation with roughly twice as many bits of accuracy.

static Simd4Float gmx_simdcall gmx::rsqrtIter	(	Simd4Float	lu,
		Simd4Float	x
	)

inlinestatic

Perform one Newton-Raphson iteration to improve 1/sqrt(x) for SIMD4 float.

This is a low-level routine that should only be used by SIMD math routines that evaluate the inverse square root.

Parameters

lu	Approximation of 1/sqrt(x), typically obtained from lookup.
x	The reference (starting) value x for which we want 1/sqrt(x).

Returns: An improved approximation with roughly twice as many bits of accuracy.

static Simd4Double gmx_simdcall gmx::rsqrtIter	(	Simd4Double	lu,
		Simd4Double	x
	)

inlinestatic

Perform one Newton-Raphson iteration to improve 1/sqrt(x) for SIMD4 double.

This is a low-level routine that should only be used by SIMD math routines that evaluate the inverse square root.

Parameters

lu	Approximation of 1/sqrt(x), typically obtained from lookup.
x	The reference (starting) value x for which we want 1/sqrt(x).

Returns: An improved approximation with roughly twice as many bits of accuracy.

static Simd4Float gmx_simdcall gmx::selectByMask	(	Simd4Float	a,
		Simd4FBool	mask
	)

inlinestatic

Select from single precision SIMD4 variable where boolean is true.

Parameters

a	Floating-point variable to select from
mask	Boolean selector

Returns: For each element, a is selected for true, 0 for false.

static Simd4Double gmx_simdcall gmx::selectByMask	(	Simd4Double	a,
		Simd4DBool	mask
	)

inlinestatic

Select from double precision SIMD4 variable where boolean is true.

Parameters

a	Floating-point variable to select from
mask	Boolean selector

Returns: For each element, a is selected for true, 0 for false.

static SimdFloat gmx_simdcall gmx::selectByMask	(	SimdFloat	a,
		SimdFBool	mask
	)

inlinestatic

Select from single precision SIMD variable where boolean is true.

Parameters

a	Floating-point variable to select from
mask	Boolean selector

Returns: For each element, a is selected for true, 0 for false.
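
The element-wise behaviour of the select operations can be summarised by a scalar reference. The sketch uses std::array as a stand-in for SIMD variables with an assumed width of 4; FVec, BVec and the two function names are hypothetical.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t c_selWidth = 4;        // assumed SIMD width for this sketch
using FVec = std::array<float, c_selWidth>;  // stand-in for a SIMD float variable
using BVec = std::array<bool, c_selWidth>;   // stand-in for a SIMD boolean variable

// Keep a[i] where mask[i] is true, otherwise produce 0.
inline FVec selectByMaskRef(const FVec& a, const BVec& mask)
{
    FVec r{};
    for (std::size_t i = 0; i < c_selWidth; i++)
    {
        r[i] = mask[i] ? a[i] : 0.0F;
    }
    return r;
}

// Keep a[i] where mask[i] is false, otherwise produce 0.
inline FVec selectByNotMaskRef(const FVec& a, const BVec& mask)
{
    FVec r{};
    for (std::size_t i = 0; i < c_selWidth; i++)
    {
        r[i] = mask[i] ? 0.0F : a[i];
    }
    return r;
}
```

Adding the two results for the same mask reproduces the original variable, which is a convenient sanity check when testing a new backend.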

static SimdDouble gmx_simdcall gmx::selectByMask	(	SimdDouble	a,
		SimdDBool	mask
	)

inlinestatic

Select from double precision SIMD variable where boolean is true.

Parameters

a	Floating-point variable to select from
mask	Boolean selector

Returns: For each element, a is selected for true, 0 for false.

static SimdFInt32 gmx_simdcall gmx::selectByMask	(	SimdFInt32	a,
		SimdFIBool	mask
	)

inlinestatic

Select from gmx::SimdFInt32 variable where boolean is true.

Available if GMX_SIMD_HAVE_FINT32_ARITHMETICS is 1.

Parameters

a	SIMD integer to select from
mask	Boolean selector

Returns: Elements from a where mask is true, 0 otherwise.

static SimdDInt32 gmx_simdcall gmx::selectByMask	(	SimdDInt32	a,
		SimdDIBool	mask
	)

inlinestatic

Select from gmx::SimdDInt32 variable where boolean is true.

Available if GMX_SIMD_HAVE_DINT32_ARITHMETICS is 1.

Parameters

a	SIMD integer to select from
mask	Boolean selector

Returns: Elements from a where mask is true, 0 otherwise.

static Simd4Float gmx_simdcall gmx::selectByNotMask	(	Simd4Float	a,
		Simd4FBool	mask
	)

inlinestatic

Select from single precision SIMD4 variable where boolean is false.

Parameters

a	Floating-point variable to select from
mask	Boolean selector

Returns: For each element, a is selected for false, 0 for true (sic).

static Simd4Double gmx_simdcall gmx::selectByNotMask	(	Simd4Double	a,
		Simd4DBool	mask
	)

inlinestatic

Select from double precision SIMD4 variable where boolean is false.

Parameters

a	Floating-point variable to select from
mask	Boolean selector

Returns: For each element, a is selected for false, 0 for true (sic).

static SimdFloat gmx_simdcall gmx::selectByNotMask	(	SimdFloat	a,
		SimdFBool	mask
	)

inlinestatic

Select from single precision SIMD variable where boolean is false.

Parameters

a	Floating-point variable to select from
mask	Boolean selector

Returns: For each element, a is selected for false, 0 for true (sic).

static SimdDouble gmx_simdcall gmx::selectByNotMask	(	SimdDouble	a,
		SimdDBool	mask
	)

inlinestatic

Select from double precision SIMD variable where boolean is false.

Parameters

a	Floating-point variable to select from
mask	Boolean selector

Returns: For each element, a is selected for false, 0 for true (sic).

static SimdFInt32 gmx_simdcall gmx::selectByNotMask	(	SimdFInt32	a,
		SimdFIBool	mask
	)

inlinestatic

Select from gmx::SimdFInt32 variable where boolean is false.

Available if GMX_SIMD_HAVE_FINT32_ARITHMETICS is 1.

Parameters

a	SIMD integer to select from
mask	Boolean selector

Returns: Elements from a where mask is false, 0 otherwise (sic).

static SimdDInt32 gmx_simdcall gmx::selectByNotMask	(	SimdDInt32	a,
		SimdDIBool	mask
	)

inlinestatic

Select from gmx::SimdDInt32 variable where boolean is false.

Available if GMX_SIMD_HAVE_DINT32_ARITHMETICS is 1.

Parameters

a	SIMD integer to select from
mask	Boolean selector

Returns: Elements from a where mask is false, 0 otherwise (sic).

Simd4Real gmx::test::setSimd4RealFrom1R ( real value )

Set SIMD4 register contents from single real value.

All elements are set from the given value. This is effectively the same operation as simd4Set1(), but is implemented using only load/store operations that have been tested separately in the bootstrapping tests.

Simd4Real gmx::test::setSimd4RealFrom3R	(	real	r0,
		real	r1,
		real	r2
	)

Set SIMD4 register contents from three real values.

It might seem stupid to use three values when we know that the SIMD4 width is 4, but it simplifies the test organization when the SIMD and SIMD4 tests are completely symmetric.

SimdInt32 gmx::test::setSimdIntFrom1I ( int value )

Set SIMD register contents from single integer value.

All elements are set from the given value. This is effectively the same operation as simdSet1I(), but is implemented using only load/store operations that have been tested separately in the bootstrapping tests.

SimdInt32 gmx::test::setSimdIntFrom3I	(	int	i0,
		int	i1,
		int	i2
	)

Set SIMD register contents from three int values.

Our reason for using three values is that 3 is not a factor in any known SIMD width, so this way there will not be any simple repeated patterns e.g. between the low/high 64/128/256 bits in the SIMD register, which could hide bugs.

SimdReal gmx::test::setSimdRealFrom1R ( real value )

Set SIMD register contents from single real value.

All elements are set from the given value. This is effectively the same operation as simdSet1(), but is implemented using only load/store operations that have been tested separately in the bootstrapping tests.

SimdReal gmx::test::setSimdRealFrom3R	(	real	r0,
		real	r1,
		real	r2
	)

Set SIMD register contents from three real values.

Our reason for using three values is that 3 is not a factor in any known SIMD width, so this way there will not be any simple repeated patterns e.g. between the low/high 64/128/256 bits in the SIMD register, which could hide bugs.
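
The effect of a period-3 fill can be seen with a small sketch. It assumes the register is filled by cycling through the three values, which is one natural reading of the description rather than a confirmed detail of the test code; fillFrom3 is a hypothetical helper and std::vector stands in for the register.

```cpp
#include <cstddef>
#include <vector>

// Fill `width` elements by cycling through r0, r1, r2 (assumed fill pattern).
inline std::vector<double> fillFrom3(std::size_t width, double r0, double r1, double r2)
{
    const double        r[3] = { r0, r1, r2 };
    std::vector<double> v(width);
    for (std::size_t i = 0; i < width; i++)
    {
        v[i] = r[i % 3];
    }
    return v;
}
```

For a width of 8 and the values 1, 2, 3 the low half is {1, 2, 3, 1} and the high half is {2, 3, 1, 2}. Because the halves differ, a bug that mixes up or duplicates register halves changes the test result instead of going unnoticed.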

static SimdDouble gmx_simdcall gmx::setZeroD ( )

inlinestatic

Set all SIMD double variable elements to 0.0.

You should typically just call gmx::setZero(), which uses proxy objects internally to handle all types rather than adding the suffix used here.

Returns: SIMD 0.0

static SimdDInt32 gmx_simdcall gmx::setZeroDI ( )

inlinestatic

Set all SIMD (double) integer variable elements to 0.

You should typically just call gmx::setZero(), which uses proxy objects internally to handle all types rather than adding the suffix used here.

Returns: SIMD 0

static SimdFloat gmx_simdcall gmx::setZeroF ( )

inlinestatic

Set all SIMD float variable elements to 0.0.

You should typically just call gmx::setZero(), which uses proxy objects internally to handle all types rather than adding the suffix used here.

Returns: SIMD 0.0F

static SimdFInt32 gmx_simdcall gmx::setZeroFI ( )

inlinestatic

Set all SIMD (float) integer variable elements to 0.

You should typically just call gmx::setZero(), which uses proxy objects internally to handle all types rather than adding the suffix used here.

Returns: SIMD 0

std::vector< real > gmx::test::simd4Real2Vector ( Simd4Real simd4 )

Convert SIMD4 real to std::vector<real>.

The returned vector will have the same length as the SIMD4 width.

static Simd4Double gmx_simdcall gmx::simd4SetZeroD ( )

inlinestatic

Set all SIMD4 double elements to 0.

You should typically just call gmx::setZero(), which uses proxy objects internally to handle all types rather than adding the suffix used here.

Returns: SIMD4 0.0

static Simd4Float gmx_simdcall gmx::simd4SetZeroF ( )

inlinestatic

Set all SIMD4 float elements to 0.

You should typically just call gmx::setZero(), which uses proxy objects internally to handle all types rather than adding the suffix used here.

Returns: SIMD4 0.0

std::vector< std::int32_t > gmx::test::simdInt2Vector ( SimdInt32 simd )

Convert SIMD integer to std::vector<int>.

The returned vector will have the same length as the SIMD width.

static SimdFloat gmx_simdcall gmx::simdLoad	(	const float *	m,
		SimdFloatTag	= `{}`
	)

inlinestatic

Load GMX_SIMD_FLOAT_WIDTH float numbers from aligned memory.

Parameters

m	Pointer to memory aligned to the SIMD width.

Returns: SIMD variable with data loaded.
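
Since the aligned load requires suitably aligned memory, callers need a way to obtain and verify it. The sketch below shows one standard C++ approach with alignas. It assumes a width of 8 floats and reads "aligned to the SIMD width" as width times sizeof(float) bytes; c_floatWidth, AlignedFloatBuffer and isAlignedTo are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t c_floatWidth = 8; // assumed stand-in for GMX_SIMD_FLOAT_WIDTH

// Buffer holding one SIMD register worth of floats, aligned to its own size.
struct AlignedFloatBuffer
{
    alignas(c_floatWidth * sizeof(float)) float data[c_floatWidth];
};

// Check whether a pointer is aligned to the given number of bytes.
inline bool isAlignedTo(const void* p, std::size_t bytes)
{
    return reinterpret_cast<std::uintptr_t>(p) % bytes == 0;
}
```

An assertion on isAlignedTo() before an aligned load is a cheap way to catch misaligned pointers in debug builds.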

static SimdDouble gmx_simdcall gmx::simdLoad	(	const double *	m,
		SimdDoubleTag	= `{}`
	)

inlinestatic

Load GMX_SIMD_DOUBLE_WIDTH numbers from aligned memory.

Parameters

m	Pointer to memory aligned to the SIMD width.

Returns: SIMD variable with data loaded.

static SimdFInt32 gmx_simdcall gmx::simdLoad	(	const std::int32_t *	m,
		SimdFInt32Tag
	)

inlinestatic

Load aligned SIMD integer data, width corresponds to gmx::SimdFloat.

You should typically just call gmx::load(), which uses proxy objects internally to handle all types rather than adding the suffix used here.

Parameters

m	Pointer to memory, aligned to (float) integer SIMD width.

Returns: SIMD integer variable.

static SimdDInt32 gmx_simdcall gmx::simdLoad	(	const std::int32_t *	m,
		SimdDInt32Tag
	)

inlinestatic

Load aligned SIMD integer data, width corresponds to gmx::SimdDouble.

You should typically just call gmx::load(), which uses proxy objects internally to handle all types rather than adding the suffix used here.

Parameters

m	Pointer to memory, aligned to (double) integer SIMD width.

Returns: SIMD integer variable.

static SimdFloat gmx_simdcall gmx::simdLoadU	(	const float *	m,
		SimdFloatTag	= `{}`
	)

inlinestatic

Load SIMD float from unaligned memory.

Available if GMX_SIMD_HAVE_LOADU is 1.

Parameters

m	Pointer to memory, no alignment requirement.

Returns: SIMD variable with data loaded.

static SimdDouble gmx_simdcall gmx::simdLoadU	(	const double *	m,
		SimdDoubleTag	= `{}`
	)

inlinestatic

Load SIMD double from unaligned memory.

Available if GMX_SIMD_HAVE_LOADU is 1.

Parameters

m	Pointer to memory, no alignment requirement.

Returns: SIMD variable with data loaded.

static SimdFInt32 gmx_simdcall gmx::simdLoadU	(	const std::int32_t *	m,
		SimdFInt32Tag
	)

inlinestatic

Load unaligned integer SIMD data, width corresponds to gmx::SimdFloat.

You should typically just call gmx::loadU(), which uses proxy objects internally to handle all types rather than adding the suffix used here.

Available if GMX_SIMD_HAVE_LOADU is 1.

Parameters

m	Pointer to memory, no alignment requirements.

Returns: SIMD integer variable.

static SimdDInt32 gmx_simdcall gmx::simdLoadU	(	const std::int32_t *	m,
		SimdDInt32Tag
	)

inlinestatic

Load unaligned integer SIMD data, width corresponds to gmx::SimdDouble.

You should typically just call gmx::loadU(), which uses proxy objects internally to handle all types rather than adding the suffix used here.

Available if GMX_SIMD_HAVE_LOADU is 1.

Parameters

m	Pointer to memory, no alignment requirements.

Returns: SIMD integer variable.

std::vector< real > gmx::test::simdReal2Vector ( SimdReal simd )

Convert SIMD real to std::vector<real>.

The returned vector will have the same length as the SIMD width.

static SimdFloat gmx_simdcall gmx::sin ( SimdFloat x )

inlinestatic

SIMD float sin(x).

Parameters

x	The argument to evaluate sin for

Returns: Sin(x)

Attention: Do NOT call both sin and cos if you need both results, since each of them will then call sincos and waste a factor of 2 in performance.

static SimdDouble gmx_simdcall gmx::sin ( SimdDouble x )

inlinestatic

SIMD double sin(x).

Parameters

x	The argument to evaluate sin for

Returns: Sin(x)

Attention: Do NOT call both sin and cos if you need both results, since each of them will then call sincos and waste a factor of 2 in performance.

static void gmx_simdcall gmx::sincos	(	SimdFloat	x,
		SimdFloat *	sinval,
		SimdFloat *	cosval
	)

inlinestatic

SIMD float sin & cos.

Parameters

	x	The argument to evaluate sin/cos for
[out]	sinval	Sin(x)
[out]	cosval	Cos(x)

This version achieves close to machine precision, but for very large magnitudes of the argument we inherently begin to lose accuracy due to the argument reduction, despite using extended precision arithmetic internally.

static void gmx_simdcall gmx::sincos	(	SimdDouble	x,
		SimdDouble *	sinval,
		SimdDouble *	cosval
	)

inlinestatic

SIMD double sin & cos.

Parameters

	x	The argument to evaluate sin/cos for
[out]	sinval	Sin(x)
[out]	cosval	Cos(x)

This version achieves close to machine precision, but for very large magnitudes of the argument we inherently begin to lose accuracy due to the argument reduction, despite using extended precision arithmetic internally.

static void gmx_simdcall gmx::sinCosSingleAccuracy	(	SimdDouble	x,
		SimdDouble *	sinval,
		SimdDouble *	cosval
	)

inlinestatic

SIMD sin & cos. Double precision SIMD data, single accuracy.

Parameters

	x	The argument to evaluate sin/cos for
[out]	sinval	Sin(x)
[out]	cosval	Cos(x)

static void gmx_simdcall gmx::sinCosSingleAccuracy	(	SimdFloat	x,
		SimdFloat *	sinval,
		SimdFloat *	cosval
	)

inlinestatic

SIMD float sin & cos, only targeting single accuracy.

Parameters

	x	The argument to evaluate sin/cos for
[out]	sinval	Sin(x)
[out]	cosval	Cos(x)

static SimdDouble gmx_simdcall gmx::sinSingleAccuracy ( SimdDouble x )

inlinestatic

SIMD sin(x). Double precision SIMD data, single accuracy.

Parameters

x	The argument to evaluate sin for

Returns: Sin(x)

Attention: Do NOT call both sin and cos if you need both results, since each of them will then call sincos and waste a factor of 2 in performance.

static SimdFloat gmx_simdcall gmx::sinSingleAccuracy ( SimdFloat x )

inlinestatic

SIMD float sin(x), only targeting single accuracy.

Parameters

x	The argument to evaluate sin for

Returns: Sin(x)

Attention: Do NOT call both sin and cos if you need both results, since each of them will then call sincos and waste a factor of 2 in performance.

template<MathOptimization opt = MathOptimization::Safe>

static SimdFloat gmx_simdcall gmx::sqrt ( SimdFloat x )

inlinestatic

Calculate sqrt(x) for SIMD floats.

Template Parameters

opt	By default, this function checks if the input value is 0.0 and masks this to return the correct result. If you are certain your argument will never be zero, and you know you need to save every single cycle you can, you can alternatively call the function as sqrt<MathOptimization::Unsafe>(x).

Parameters

x	Argument that must be in the range 0 <= x <= GMX_FLOAT_MAX, since the lookup step often has to be implemented in single precision. Arguments smaller than GMX_FLOAT_MIN will always lead to a zero result, even in double precision. If you are using the unsafe math optimization parameter, the argument must be in the range GMX_FLOAT_MIN <= x <= GMX_FLOAT_MAX.

Returns: sqrt(x). The result is undefined if the input value does not fall in the allowed range specified for the argument.
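
The zero check matters when a square root is built from a reciprocal square root, which is a common SIMD strategy. The scalar sketch below assumes that construction (sqrt(x) = x * rsqrt(x)); the helper names are hypothetical and this is not the GROMACS implementation, only an illustration of why an unmasked zero input gives a wrong result.

```cpp
#include <cmath>

// Hypothetical scalar stand-in for a hardware reciprocal-square-root step.
inline float rsqrtModel(float x)
{
    return 1.0F / std::sqrt(x);
}

// "Unsafe" model: x * rsqrt(x). For x == 0 this evaluates 0 * inf, which is NaN
// under IEEE-754 arithmetic.
inline float sqrtUnsafeModel(float x)
{
    return x * rsqrtModel(x);
}

// "Safe" model: mask the zero input so the result is 0 instead of NaN.
// A SIMD version would use a comparison mask and a select rather than a branch.
inline float sqrtSafeModel(float x)
{
    return (x == 0.0F) ? 0.0F : x * rsqrtModel(x);
}
```

In this model the only cost of the safe variant is the extra comparison and select, which is the cycle saving that the Unsafe option trades against correctness at zero.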

template<MathOptimization opt = MathOptimization::Safe>

static SimdDouble gmx_simdcall gmx::sqrt ( SimdDouble x )

inlinestatic

Calculate sqrt(x) for SIMD doubles.

Template Parameters

opt	By default, this function checks if the input value is 0.0 and masks this to return the correct result. If you are certain your argument will never be zero, and you know you need to save every single cycle you can, you can alternatively call the function as sqrt<MathOptimization::Unsafe>(x).

Parameters

x	Argument that must be in the range 0 <= x <= GMX_FLOAT_MAX, since the lookup step often has to be implemented in single precision. Arguments smaller than GMX_FLOAT_MIN will always lead to a zero result, even in double precision. If you are using the unsafe math optimization parameter, the argument must be in the range GMX_FLOAT_MIN <= x <= GMX_FLOAT_MAX.

Returns: sqrt(x). The result is undefined if the input value does not fall in the allowed range specified for the argument.

template<MathOptimization opt = MathOptimization::Safe>

static SimdDouble gmx_simdcall gmx::sqrtSingleAccuracy ( SimdDouble x )

inlinestatic

Calculate sqrt(x) (correct for 0.0) for SIMD double, with single accuracy.

Template Parameters

opt	By default, this function checks if the input value is 0.0 and masks this to return the correct result. If you are certain your argument will never be zero, and you know you need to save every single cycle you can, you can alternatively call the function as sqrtSingleAccuracy<MathOptimization::Unsafe>(x).

Parameters

x	Argument that must be in the range 0 <= x <= GMX_FLOAT_MAX, since the lookup step often has to be implemented in single precision. Arguments smaller than GMX_FLOAT_MIN will always lead to a zero result, even in double precision. If you are using the unsafe math optimization parameter, the argument must be in the range GMX_FLOAT_MIN <= x <= GMX_FLOAT_MAX.

Returns: sqrt(x). The result is undefined if the input value does not fall in the allowed range specified for the argument.

template<MathOptimization opt = MathOptimization::Safe>

static SimdFloat gmx_simdcall gmx::sqrtSingleAccuracy ( SimdFloat x )

inlinestatic

Calculate sqrt(x) for SIMD float, always targeting single accuracy.

Template Parameters

opt	By default, this function checks if the input value is 0.0 and masks this to return the correct result. If you are certain your argument will never be zero, and you know you need to save every single cycle you can, you can alternatively call the function as sqrtSingleAccuracy<MathOptimization::Unsafe>(x).

Parameters

x	Argument that must be in the range 0 <= x <= GMX_FLOAT_MAX, since the lookup step often has to be implemented in single precision. Arguments smaller than GMX_FLOAT_MIN will always lead to a zero result, even in double precision. If you are using the unsafe math optimization parameter, the argument must be in the range GMX_FLOAT_MIN <= x <= GMX_FLOAT_MAX.

Returns: sqrt(x). The result is undefined if the input value does not fall in the allowed range specified for the argument.

static void gmx_simdcall gmx::store	(	float *	m,
		SimdFloat	a
	)

inlinestatic

Store the contents of SIMD float variable to aligned memory m.

Parameters

[out]	m	Pointer to memory, aligned to SIMD width.
	a	SIMD variable to store

static void gmx_simdcall gmx::store	(	double *	m,
		SimdDouble	a
	)

inlinestatic

Store the contents of SIMD double variable to aligned memory m.

Parameters

[out]	m	Pointer to memory, aligned to SIMD width.
	a	SIMD variable to store

Examples: template.cpp.

static void gmx_simdcall gmx::store	(	std::int32_t *	m,
		SimdFInt32	a
	)

inlinestatic

Store aligned SIMD integer data, width corresponds to gmx::SimdFloat.

Parameters

m	Memory aligned to (float) integer SIMD width.
a	SIMD variable to store.

static void gmx_simdcall gmx::store	(	std::int32_t *	m,
		SimdDInt32	a
	)

inlinestatic

Store aligned SIMD integer data, width corresponds to gmx::SimdDouble.

Parameters

m	Memory aligned to (double) integer SIMD width.
a	SIMD (double) integer variable to store.

static void gmx_simdcall gmx::store4	(	float *	m,
		Simd4Float	a
	)

inlinestatic

Store the contents of SIMD4 float to aligned memory m.

Parameters

[out]	m	Pointer to memory, aligned to 4 elements.
	a	SIMD4 variable to store

static void gmx_simdcall gmx::store4	(	double *	m,
		Simd4Double	a
	)

inlinestatic

Store the contents of SIMD4 double to aligned memory m.

Parameters

[out]	m	Pointer to memory, aligned to 4 elements.
	a	SIMD4 variable to store

static void gmx_simdcall gmx::store4U	(	double *	m,
		Simd4Double	a
	)

inlinestatic

Store SIMD4 double to unaligned memory.

Available if GMX_SIMD_HAVE_STOREU is 1.

Parameters

[out]	m	Pointer to memory, no alignment requirement.
	a	SIMD4 variable to store.

static void gmx_simdcall gmx::store4U	(	float *	m,
		Simd4Float	a
	)

inlinestatic

Store SIMD4 float to unaligned memory.

Available if GMX_SIMD_HAVE_STOREU is 1.

Parameters

[out]	m	Pointer to memory, no alignment requirement.
	a	SIMD4 variable to store.

static void gmx_simdcall gmx::storeDualHsimd	(	double *	m0,
		double *	m1,
		SimdDouble	a
	)

inlinestatic

Store low & high parts of SIMD double to different locations.

Parameters

m0	Pointer to memory aligned to half SIMD width.
m1	Pointer to memory aligned to half SIMD width.
a	SIMD variable. Low half should be stored to m0, high to m1.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_DOUBLE is 1.
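
As a rough scalar picture of the half-register store described above, the sketch below copies the low half of a register to one location and the high half to another. The width constant, the std::array stand-in for SimdDouble and the function name are assumptions made for illustration; real widths depend on the SIMD implementation selected at build time.

```cpp
#include <array>
#include <cstddef>

// Hypothetical register width, for illustration only.
constexpr std::size_t c_width = 8;

// Scalar model: write the low half of 'a' to m0 and the high half to m1.
inline void storeDualHsimdModel(double* m0, double* m1, const std::array<double, c_width>& a)
{
    constexpr std::size_t half = c_width / 2;
    for (std::size_t i = 0; i < half; i++)
    {
        m0[i] = a[i];        // low half
        m1[i] = a[half + i]; // high half
    }
}
```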

static void gmx_simdcall gmx::storeDualHsimd	(	float *	m0,
		float *	m1,
		SimdFloat	a
	)

inlinestatic

Store low & high parts of SIMD float to different locations.

Parameters

m0	Pointer to memory aligned to half SIMD width.
m1	Pointer to memory aligned to half SIMD width.
a	SIMD variable. Low half should be stored to m0, high to m1.

Available if GMX_SIMD_HAVE_HSIMD_UTIL_FLOAT is 1.

static void gmx_simdcall gmx::storeU	(	float *	m,
		SimdFloat	a
	)

inlinestatic

Store SIMD float to unaligned memory.

Available if GMX_SIMD_HAVE_STOREU is 1.

Parameters

[out]	m	Pointer to memory, no alignment requirement.
	a	SIMD variable to store.

static void gmx_simdcall gmx::storeU	(	double *	m,
		SimdDouble	a
	)

inlinestatic

Store SIMD double to unaligned memory.

Available if GMX_SIMD_HAVE_STOREU is 1.

Parameters

[out]	m	Pointer to memory, no alignment requirement.
	a	SIMD variable to store.

static void gmx_simdcall gmx::storeU	(	std::int32_t *	m,
		SimdFInt32	a
	)

inlinestatic

Store unaligned SIMD integer data, width corresponds to gmx::SimdFloat.

Available if GMX_SIMD_HAVE_STOREU is 1.

Parameters

m	Memory pointer, no alignment requirements.
a	SIMD variable to store.

static void gmx_simdcall gmx::storeU	(	std::int32_t *	m,
		SimdDInt32	a
	)

inlinestatic

Store unaligned SIMD integer data, width corresponds to gmx::SimdDouble.

Available if GMX_SIMD_HAVE_STOREU is 1.

Parameters

m	Memory pointer, no alignment requirements.
a	SIMD (double) integer variable to store.

static SimdFloat gmx_simdcall gmx::tan ( SimdFloat x )

inlinestatic

SIMD float tan(x).

Parameters

x	The argument to evaluate tan for

Returns: Tan(x)

static SimdDouble gmx_simdcall gmx::tan ( SimdDouble x )

inlinestatic

SIMD double tan(x).

Parameters

x	The argument to evaluate tan for

Returns: Tan(x)

static SimdDouble gmx_simdcall gmx::tanSingleAccuracy ( SimdDouble x )

inlinestatic

SIMD tan(x). Double precision SIMD data, single accuracy.

Parameters

x	The argument to evaluate tan for

Returns: Tan(x)

static SimdFloat gmx_simdcall gmx::tanSingleAccuracy ( SimdFloat x )

inlinestatic

SIMD float tan(x), only targeting single accuracy.

Parameters

x	The argument to evaluate tan for

Returns: Tan(x)

static SimdFBool gmx_simdcall gmx::testBits ( SimdFloat a )

inlinestatic

Return true if any bits are set in the single precision SIMD.

This function is used to handle bitmasks, mainly for exclusions in the inner kernels. Note that it will return true even for -0.0F (sign bit set), so it is not identical to not-equal.

Parameters

a value

Returns: Each element of the boolean will be true if any bit in a is nonzero.
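
The difference between a bit test and a not-equal comparison can be seen in a scalar sketch. The function below is a hypothetical per-element model, not the GROMACS routine: it inspects the raw bit pattern of a float, so negative zero (sign bit set) tests true even though it compares equal to 0.0F.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "model assumes 32-bit float");

// Scalar model of the per-element test: true if any bit of the float is set.
inline bool testBitsModel(float a)
{
    std::uint32_t bits;
    std::memcpy(&bits, &a, sizeof(bits)); // copy out the bit pattern without aliasing issues
    return bits != 0U;
}
```

With this model, testBitsModel(-0.0F) is true while the comparison -0.0F != 0.0F is false, which is the case the description warns about.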

static SimdDBool gmx_simdcall gmx::testBits ( SimdDouble a )

inlinestatic

Return true if any bits are set in the double precision SIMD.

This function is used to handle bitmasks, mainly for exclusions in the inner kernels. Note that it will return true even for -0.0 (sign bit set), so it is not identical to not-equal.

Parameters

a value

Returns: Each element of the boolean will be true if any bit in a is nonzero.

static SimdFIBool gmx_simdcall gmx::testBits ( SimdFInt32 a )

inlinestatic

Check if any bit is set in each element.

Available if GMX_SIMD_HAVE_FINT32_ARITHMETICS is 1.

Parameters

a	SIMD integer

Returns: SIMD integer boolean with true for elements where any bit is set

static SimdDIBool gmx_simdcall gmx::testBits ( SimdDInt32 a )

inlinestatic

Check if any bit is set in each element.

Available if GMX_SIMD_HAVE_DINT32_ARITHMETICS is 1.

Parameters

a	SIMD integer

Returns: SIMD integer boolean with true for elements where any bit is set

static void gmx_simdcall gmx::transpose	(	Simd4Float *	v0,
		Simd4Float *	v1,
		Simd4Float *	v2,
		Simd4Float *	v3
	)

inlinestatic

SIMD4 float transpose.

Parameters

[in,out]	v0	Row 0 on input, column 0 on output
[in,out]	v1	Row 1 on input, column 1 on output
[in,out]	v2	Row 2 on input, column 2 on output
[in,out]	v3	Row 3 on input, column 3 on output
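
The row-to-column semantics can be sketched in scalar code as an in-place 4x4 transpose. Row4 is a hypothetical stand-in for Simd4Float and transposeModel is an illustrative name; hardware implementations typically use shuffle or unpack instructions instead of element swaps.

```cpp
#include <array>
#include <utility>

using Row4 = std::array<float, 4>; // hypothetical stand-in for Simd4Float

// In-place 4x4 transpose: element j of row i is exchanged with element i of row j.
inline void transposeModel(Row4* v0, Row4* v1, Row4* v2, Row4* v3)
{
    Row4* rows[4] = { v0, v1, v2, v3 };
    for (int i = 0; i < 4; i++)
    {
        for (int j = i + 1; j < 4; j++)
        {
            std::swap((*rows[i])[j], (*rows[j])[i]);
        }
    }
}
```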

static void gmx_simdcall gmx::transpose	(	Simd4Double *	v0,
		Simd4Double *	v1,
		Simd4Double *	v2,
		Simd4Double *	v3
	)

inlinestatic

SIMD4 double transpose.

Parameters

[in,out]	v0	Row 0 on input, column 0 on output
[in,out]	v1	Row 1 on input, column 1 on output
[in,out]	v2	Row 2 on input, column 2 on output
[in,out]	v3	Row 3 on input, column 3 on output

template<int align>

static void gmx_simdcall gmx::transposeScatterDecrU	(	double *	base,
		const std::int32_t	offset[],
		SimdDouble	v0,
		SimdDouble	v1,
		SimdDouble	v2
	)

inlinestatic

Transpose and subtract 3 SIMD doubles to 3 consecutive addresses at GMX_SIMD_DOUBLE_WIDTH offsets.

Template Parameters

align Alignment of the memory to which we write, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 3 for this routine) the output data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are decremented.

Parameters

[out]	base	Pointer to start of memory.
	offset	Aligned array with offsets to the start of each triplet.
	v0	1st component, subtracted from base[align*offset[i]]
	v1	2nd component, subtracted from base[align*offset[i]+1]
	v2	3rd component, subtracted from base[align*offset[i]+2]

This function can work with both aligned (better performance) and unaligned memory. When the align parameter is not a power-of-two (align==3 would be normal for packed atomic coordinates) the memory obviously cannot be aligned, and we account for this. However, in the case where align is a power-of-two, we assume the base pointer also has the same alignment, which will enable many platforms to use faster aligned memory load/store operations. An easy way to think of this is that each triplet of data in memory must be aligned to the align parameter you specify when it's a power-of-two.

The offset memory must always be aligned to GMX_SIMD_FINT32_WIDTH, since this enables us to use SIMD loads and gather operations on platforms that support it.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This routine uses a normal array for the offsets, since we typically load the data from memory. On the architectures we have tested this is faster even when a SIMD integer datatype is present. To improve performance, this function might use full-SIMD-width unaligned load/store, and subtract 0.0 from the extra elements. This means you need to ensure the memory is padded at the end, so we can always load GMX_SIMD_REAL_WIDTH elements starting at the last offset. If you use the GROMACS aligned memory allocation routines this will always be the case.

template<int align>

static void gmx_simdcall gmx::transposeScatterDecrU	(	float *	base,
		const std::int32_t	offset[],
		SimdFloat	v0,
		SimdFloat	v1,
		SimdFloat	v2
	)

inlinestatic

Transpose and subtract 3 SIMD floats to 3 consecutive addresses at GMX_SIMD_FLOAT_WIDTH offsets.

Template Parameters

align Alignment of the memory to which we write, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 3 for this routine) the output data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are decremented.

Parameters

[out]	base	Pointer to start of memory.
	offset	Aligned array with offsets to the start of each triplet.
	v0	1st component, subtracted from base[align*offset[i]]
	v1	2nd component, subtracted from base[align*offset[i]+1]
	v2	3rd component, subtracted from base[align*offset[i]+2]

This function can work with both aligned (better performance) and unaligned memory. When the align parameter is not a power-of-two (align==3 would be normal for packed atomic coordinates) the memory obviously cannot be aligned, and we account for this. However, in the case where align is a power-of-two, we assume the base pointer also has the same alignment, which will enable many platforms to use faster aligned memory load/store operations. An easy way to think of this is that each triplet of data in memory must be aligned to the align parameter you specify when it's a power-of-two.

The offset memory must always be aligned to GMX_SIMD_FINT32_WIDTH, since this enables us to use SIMD loads and gather operations on platforms that support it.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This routine uses a normal array for the offsets, since we typically load the data from memory. On the architectures we have tested this is faster even when a SIMD integer datatype is present. To improve performance, this function might use full-SIMD-width unaligned load/store, and subtract 0.0 from the extra elements. This means you need to ensure the memory is padded at the end, so we can always load GMX_SIMD_REAL_WIDTH elements starting at the last offset. If you use the GROMACS aligned memory allocation routines this will always be the case.

template<int align>

static void gmx_simdcall gmx::transposeScatterIncrU	(	double *	base,
		const std::int32_t	offset[],
		SimdDouble	v0,
		SimdDouble	v1,
		SimdDouble	v2
	)

inlinestatic

Transpose and add 3 SIMD doubles to 3 consecutive addresses at GMX_SIMD_DOUBLE_WIDTH offsets.

Template Parameters

align Alignment of the memory to which we write, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 3 for this routine) the output data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are incremented.

Parameters

[out]	base	Pointer to the start of the memory area
	offset	Aligned array with offsets to the start of each triplet.
	v0	1st component of triplets, added to base[align*offset[i]].
	v1	2nd component of triplets, added to base[align*offset[i] + 1].
	v2	3rd component of triplets, added to base[align*offset[i] + 2].

This function can work with both aligned (better performance) and unaligned memory. When the align parameter is not a power-of-two (align==3 would be normal for packed atomic coordinates) the memory obviously cannot be aligned, and we account for this. However, in the case where align is a power-of-two, we assume the base pointer also has the same alignment, which will enable many platforms to use faster aligned memory load/store operations. An easy way to think of this is that each triplet of data in memory must be aligned to the align parameter you specify when it's a power-of-two.

The offset memory must always be aligned to GMX_SIMD_FINT32_WIDTH, since this enables us to use SIMD loads and gather operations on platforms that support it.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This routine uses a normal array for the offsets, since we typically load the data from memory. On the architectures we have tested this is faster even when a SIMD integer datatype is present. To improve performance, this function might use full-SIMD-width unaligned load/store, and add 0.0 to the extra elements. This means you need to ensure the memory is padded at the end, so we can always load GMX_SIMD_REAL_WIDTH elements starting at the last offset. If you use the GROMACS aligned memory allocation routines this will always be the case.
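
The addressing rule base[align*offset[i] + k] can be written out as a scalar loop. The sketch below is only a model of the documented memory effect, under the assumptions that the SIMD width is 4 and that std::array stands in for SimdDouble; the names c_width, SimdModel and transposeScatterIncrModel are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t c_width = 4; // hypothetical SIMD width, for illustration only

using SimdModel = std::array<double, c_width>; // stand-in for SimdDouble

// Scalar model: element i of (v0, v1, v2) is added to the triplet that starts
// at base[align * offset[i]]. The offsets are passed unscaled.
template<int align>
inline void transposeScatterIncrModel(double*            base,
                                      const std::int32_t offset[],
                                      const SimdModel&   v0,
                                      const SimdModel&   v1,
                                      const SimdModel&   v2)
{
    for (std::size_t i = 0; i < c_width; i++)
    {
        double* triplet = base + align * offset[i];
        triplet[0] += v0[i];
        triplet[1] += v1[i];
        triplet[2] += v2[i];
    }
}
```

With align == 3 and offsets {0, 2, 3, 4}, the model updates elements 0..2, 6..8, 9..11 and 12..14 of a packed coordinate array and leaves the triplet at 3..5 unchanged.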

template<int align>

static void gmx_simdcall gmx::transposeScatterIncrU	(	float *	base,
		const std::int32_t	offset[],
		SimdFloat	v0,
		SimdFloat	v1,
		SimdFloat	v2
	)

inlinestatic

Transpose and add 3 SIMD floats to 3 consecutive addresses at GMX_SIMD_FLOAT_WIDTH offsets.

Template Parameters

align Alignment of the memory to which we write, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 3 for this routine) the output data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are incremented.

Parameters

[out]	base	Pointer to the start of the memory area
	offset	Aligned array with offsets to the start of each triplet.
	v0	1st component of triplets, added to base[align*offset[i]].
	v1	2nd component of triplets, added to base[align*offset[i] + 1].
	v2	3rd component of triplets, added to base[align*offset[i] + 2].

This function can work with both aligned (better performance) and unaligned memory. When the align parameter is not a power-of-two (align==3 would be normal for packed atomic coordinates) the memory obviously cannot be aligned, and we account for this. However, in the case where align is a power-of-two, we assume the base pointer also has the same alignment, which will enable many platforms to use faster aligned memory load/store operations. An easy way to think of this is that each triplet of data in memory must be aligned to the align parameter you specify when it's a power-of-two.

The offset memory must always be aligned to GMX_SIMD_FINT32_WIDTH, since this enables us to use SIMD loads and gather operations on platforms that support it.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This routine uses a normal array for the offsets, since we typically load the data from memory. On the architectures we have tested this is faster even when a SIMD integer datatype is present. To improve performance, this function might use full-SIMD-width unaligned load/store, and add 0.0 to the extra elements. This means you need to ensure the memory is padded at the end, so we can always load GMX_SIMD_REAL_WIDTH elements starting at the last offset. If you use the GROMACS aligned memory allocation routines this will always be the case.

template<int align>

static void gmx_simdcall gmx::transposeScatterStoreU	(	double *	base,
		const std::int32_t	offset[],
		SimdDouble	v0,
		SimdDouble	v1,
		SimdDouble	v2
	)

inlinestatic

Transpose and store 3 SIMD doubles to 3 consecutive addresses at GMX_SIMD_DOUBLE_WIDTH offsets.

Template Parameters

align Alignment of the memory to which we write, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 3 for this routine) the output data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are written.

Parameters

[out]	base	Pointer to the start of the memory area
	offset	Aligned array with offsets to the start of each triplet.
	v0	1st component of triplets, written to base[align*offset[i]].
	v1	2nd component of triplets, written to base[align*offset[i] + 1].
	v2	3rd component of triplets, written to base[align*offset[i] + 2].

This function can work with both aligned (better performance) and unaligned memory. When the align parameter is not a power-of-two (align==3 would be normal for packed atomic coordinates) the memory obviously cannot be aligned, and we account for this. However, in the case where align is a power-of-two, we assume the base pointer also has the same alignment, which will enable many platforms to use faster aligned memory store operations. An easy way to think of this is that each triplet of data in memory must be aligned to the align parameter you specify when it's a power-of-two.

The offset memory must always be aligned to GMX_SIMD_FINT32_WIDTH, since this enables us to use SIMD loads and gather operations on platforms that support it.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This routine uses a normal array for the offsets, since we typically load the data from memory. On the architectures we have tested this is faster even when a SIMD integer datatype is present.
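
For the store variant with a power-of-two alignment, a scalar model shows how each triplet occupies the first three slots of an align-sized cell. This is a sketch under assumptions: a SIMD width of 4, std::array as a stand-in for the SIMD type (float is used here for brevity), and hypothetical names throughout. In this model the padding slot is left untouched.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t c_width = 4; // hypothetical SIMD width, for illustration only

using SimdModel = std::array<float, c_width>; // stand-in for a SIMD register

// Scalar model: element i of (v0, v1, v2) overwrites the triplet that starts
// at base[align * offset[i]].
template<int align>
inline void transposeScatterStoreModel(float*             base,
                                       const std::int32_t offset[],
                                       const SimdModel&   v0,
                                       const SimdModel&   v1,
                                       const SimdModel&   v2)
{
    for (std::size_t i = 0; i < c_width; i++)
    {
        float* triplet = base + align * offset[i];
        triplet[0]     = v0[i];
        triplet[1]     = v1[i];
        triplet[2]     = v2[i];
    }
}
```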

template<int align>

static void gmx_simdcall gmx::transposeScatterStoreU	(	float *	base,
		const std::int32_t	offset[],
		SimdFloat	v0,
		SimdFloat	v1,
		SimdFloat	v2
	)

inlinestatic

Transpose and store 3 SIMD floats to 3 consecutive addresses at GMX_SIMD_FLOAT_WIDTH offsets.

Template Parameters

align Alignment of the memory to which we write, i.e. distance (measured in elements, not bytes) between index points. When this is identical to the number of SIMD variables (i.e., 3 for this routine) the output data is packed without padding in memory. See the SIMD parameters for exactly what memory positions are written.

Parameters

[out]	base	Pointer to the start of the memory area
	offset	Aligned array with offsets to the start of each triplet.
	v0	1st component of triplets, written to base[align*offset[i]].
	v1	2nd component of triplets, written to base[align*offset[i] + 1].
	v2	3rd component of triplets, written to base[align*offset[i] + 2].

This function can work with both aligned (better performance) and unaligned memory. When the align parameter is not a power-of-two (align==3 would be normal for packed atomic coordinates) the memory obviously cannot be aligned, and we account for this. However, in the case where align is a power-of-two, we assume the base pointer also has the same alignment, which will enable many platforms to use faster aligned memory store operations. An easy way to think of this is that each triplet of data in memory must be aligned to the align parameter you specify when it's a power-of-two.

The offset memory must always be aligned to GMX_SIMD_FINT32_WIDTH, since this enables us to use SIMD loads and gather operations on platforms that support it.

Note: You should NOT scale offsets before calling this routine; it is done internally by using the alignment template parameter instead. This routine uses a normal array for the offsets, since we typically load the data from memory. On the architectures we have tested this is faster even when a SIMD integer datatype is present.

static Simd4Float gmx_simdcall gmx::trunc ( Simd4Float a )

inlinestatic

Truncate SIMD4, i.e. round towards zero - common hardware instruction.

Parameters

a	Any floating-point value

Returns: Integer rounded towards zero, represented in floating-point format.

Note: This is truncation towards zero, not floor(). The reason for this is that truncation is virtually always present as a dedicated hardware instruction, but floor() frequently isn't.

static Simd4Double gmx_simdcall gmx::trunc ( Simd4Double a )

inlinestatic

Truncate SIMD4, i.e. round towards zero - common hardware instruction.

Parameters

a	Any floating-point value

Returns: Integer rounded towards zero, represented in floating-point format.

Note: This is truncation towards zero, not floor(). The reason for this is that truncation is virtually always present as a dedicated hardware instruction, but floor() frequently isn't.

static SimdFloat gmx_simdcall gmx::trunc ( SimdFloat a )

inlinestatic

Truncate SIMD float, i.e. round towards zero - common hardware instruction.

Parameters

a	Any floating-point value

Returns: Integer rounded towards zero, represented in floating-point format.

Note: This is truncation towards zero, not floor(). The reason for this is that truncation is virtually always present as a dedicated hardware instruction, but floor() frequently isn't.

static SimdDouble gmx_simdcall gmx::trunc ( SimdDouble a )

inlinestatic

Truncate SIMD double, i.e. round towards zero - common hardware instruction.

Parameters

a	Any floating-point value

Returns: Integer rounded towards zero, represented in floating-point format.

Note: This is truncation towards zero, not floor(). The reason for this is that truncation is virtually always present as a dedicated hardware instruction, but floor() frequently isn't.
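
Truncation and floor() differ only for negative non-integer values. The scalar sketch below uses the standard library rather than the SIMD routines, and floorFromTrunc is a hypothetical helper showing one way a floor could be derived from a truncation when no dedicated instruction exists.

```cpp
#include <cmath>

// floor() expressed through truncation: step down by one when truncation
// moved a negative non-integer value up towards zero.
inline double floorFromTrunc(double x)
{
    const double t = std::trunc(x);
    return (t > x) ? t - 1.0 : t;
}
```

For example, std::trunc(-3.75) gives -3.0, whereas floorFromTrunc(-3.75) gives -4.0; for positive inputs and exact integers the two agree.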

Simd4Real gmx::test::vector2Simd4Real ( const std::vector< real > & v )

Return floating-point SIMD4 value from std::vector<real>.

If the vector is longer than SIMD4 width, only the first elements will be used. If it is shorter, the contents will be repeated to fill the SIMD4 register.

SimdInt32 gmx::test::vector2SimdInt ( const std::vector< std::int32_t > & v )

Return 32-bit integer SIMD value from std::vector<int>.

If the vector is longer than SIMD width, only the first elements will be used. If it is shorter, the contents will be repeated to fill the SIMD register.

SimdReal gmx::test::vector2SimdReal ( const std::vector< real > & v )

Return floating-point SIMD value from std::vector<real>.

If the vector is longer than SIMD width, only the first elements will be used. If it is shorter, the contents will be repeated to fill the SIMD register.
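
The truncate-or-repeat rule used by these test helpers can be modelled with a cyclic index. The sketch assumes a width of 4 and uses std::array in place of a SIMD register; c_width and vectorToLanesModel are hypothetical names, and the input vector is assumed to be non-empty.

```cpp
#include <array>
#include <cstddef>
#include <vector>

constexpr std::size_t c_width = 4; // hypothetical SIMD width, for illustration only

// Fill each lane from the vector, wrapping around when the vector is shorter
// than the register and ignoring trailing elements when it is longer.
// Precondition: v is not empty.
inline std::array<float, c_width> vectorToLanesModel(const std::vector<float>& v)
{
    std::array<float, c_width> lanes{};
    for (std::size_t i = 0; i < c_width; i++)
    {
        lanes[i] = v[i % v.size()];
    }
    return lanes;
}
```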

Variable Documentation

const int gmx::c_simdBestPairAlignmentDouble = 2

static

Best alignment to use for aligned pairs of double data.

The routines to load and transpose data will work with a wide range of alignments, but some might be faster than others, depending on the load instructions available in the hardware. This specifies the best alignment for each implementation when working with pairs of data.

To allow each architecture to use its optimal layout, we provide a constant that code outside the SIMD module should use to store things properly. It must be at least 2. For example, a value of 2 means the two parameters A & B are stored as [A0 B0 A1 B1] while align-4 means [A0 B0 - - A1 B1 - -].

This alignment depends on the efficiency of partial-register load/store operations, and will depend on the architecture.

const int gmx::c_simdBestPairAlignmentFloat = 2

static

Best alignment to use for aligned pairs of float data.

The routines to load and transpose data will work with a wide range of alignments, but some might be faster than others, depending on the load instructions available in the hardware. This specifies the best alignment for each implementation when working with pairs of data.

To allow each architecture to use its optimal layout, we provide a constant that code outside the SIMD module should use to store things properly. It must be at least 2. For example, a value of 2 means the two parameters A & B are stored as [A0 B0 A1 B1] while align-4 means [A0 B0 - - A1 B1 - -].

This alignment depends on the efficiency of partial-register load/store operations, and will depend on the architecture.
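
The two layouts mentioned above can be produced by a small packing helper. This is a sketch with hypothetical names (packPairsModel is not part of the SIMD module); it assumes both inputs have the same length and that align is at least 2, and it fills unused slots with zeros.

```cpp
#include <cstddef>
#include <vector>

// Pack pairs (a[i], b[i]) with a stride of 'align' elements.
// Preconditions: a.size() == b.size() and align >= 2.
inline std::vector<float> packPairsModel(const std::vector<float>& a,
                                         const std::vector<float>& b,
                                         std::size_t               align)
{
    std::vector<float> out(a.size() * align, 0.0F); // zero-filled padding slots
    for (std::size_t i = 0; i < a.size(); i++)
    {
        out[i * align]     = a[i];
        out[i * align + 1] = b[i];
    }
    return out;
}
```

Calling code would pass the pair-alignment constant as the stride, so that the stored layout matches what the load-and-transpose routines expect on the target architecture.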

const Simd4Real gmx::test::rSimd4_m3p75 = setSimd4RealFrom1R(-3.75)

Negative value that rounds down.

const SimdReal gmx::test::rSimd_Exp

Initial value:

= setSimdRealFrom3R(1.4055235171027452623914516e+18,
                                             5.3057102734253445623914516e-13,
                                             -2.1057102745623934534514516e+16)

Three large floating-point values whose exponents are >32.

const SimdReal gmx::test::rSimd_m3p75 = setSimdRealFrom1R(-3.75)

Negative value that rounds down.

int gmx::test::SimdBaseTest::s_nPoints = 10000

static

Number of test points to use, settable on command line.

Note: While this has to be a static non-const variable for the command-line option to work, you should never change it manually in any of the tests, because the static storage class will make the value apply to all subsequent tests unless you remember to reset it.
