METAINFERENCE
This is part of the isdb module

Calculates the Metainference energy for a set of experimental data.

Metainference [21] is a Bayesian framework to model heterogeneous systems by integrating prior information with noisy, ensemble-averaged data. Metainference models a system and quantifies the level of noise in the data by considering a set of replicas of the system.

Calculated experimental data are given in input as ARG, while reference experimental values can be given either from fixed components of other actions using PARARG or as numbers using PARAMETERS. By default the data are averaged over the available replicas; the keyword NOENSEMBLE disables this averaging.
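As a minimal sketch of the NOENSEMBLE option, the input below reuses the RDC setup from the Examples section and compares each replica with the reference values on its own. All numerical values are illustrative placeholders, not recommendations.

```plumed
rdc: RDC ...
   SCALE=0.0001
   GYROM=-72.5388
   ATOMS1=22,23
   ATOMS2=25,27
   ATOMS3=29,31
   ATOMS4=33,34
...

# Reference values are passed as numbers with PARAMETERS (one per argument).
# NOENSEMBLE switches off the default replica averaging.
spe: METAINFERENCE ...
   ARG=rdc.*
   NOENSEMBLE
   NOISETYPE=MGAUSS
   PARAMETERS=1.9190,2.9190,3.9190,4.9190
   SIGMA0=0.01 SIGMA_MIN=0.00001 SIGMA_MAX=3 DSIGMA=0.01
   SIGMA_MEAN0=0.001
...
```

The number of entries in PARAMETERS matches the four RDCs selected by ARG.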

Metadynamics Metainference [22], or more generally biased Metainference, requires knowledge of the biasing potential in order to calculate the weighted average. In this case the value of the bias can be provided as the last argument in ARG, together with the keyword REWEIGHT. To avoid the noise resulting from the instantaneous value of the bias, the weight of each replica can be averaged over a given time using the keyword AVERAGING.
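A possible input combining REWEIGHT and AVERAGING is sketched below, assuming a multi-replica run with a metadynamics bias on two torsions. The AVERAGING stride of 200 is an arbitrary illustrative choice (here set equal to the hill deposition PACE), and the remaining values are placeholders.

```plumed
#SETTINGS NREPLICAS=2
rdc: RDC ...
   SCALE=0.0001
   GYROM=-72.5388
   ATOMS1=22,23
   ATOMS2=25,27
   ATOMS3=29,31
   ATOMS4=33,34
...

cv1: TORSION ATOMS=1,2,3,4
cv2: TORSION ATOMS=2,3,4,5
mm: METAD ARG=cv1,cv2 HEIGHT=0.5 SIGMA=0.3,0.3 PACE=200 BIASFACTOR=8 WALKERS_MPI

# The bias is the last entry of ARG; REWEIGHT uses it to weight the replicas,
# and AVERAGING averages those weights over time (stride value is an assumption).
spe: METAINFERENCE ...
   ARG=rdc.*,mm.bias
   REWEIGHT
   AVERAGING=200
   NOISETYPE=MGAUSS
   PARAMETERS=1.9190,2.9190,3.9190,4.9190
   SIGMA0=0.01 SIGMA_MIN=0.00001 SIGMA_MAX=3 DSIGMA=0.01
   SIGMA_MEAN0=0.001
...
```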

The data can be averaged by using multiple replicas and weighted for a bias if present. The functional form of Metainference can be chosen among five variants selected with NOISETYPE=GAUSS,MGAUSS,OUTLIERS,MOUTLIERS,GENERIC. These correspond to modelling the noise for the arguments as a single gaussian common to all the data points, a gaussian per data point, a single long-tailed gaussian common to all the data points, a long-tailed gaussian per data point, or two distinct noises as in the most general formulation of Metainference. In this last case the noise of the replica averaging is gaussian (one per data point) and the noise for the comparison with the experimental data can be chosen with the keyword LIKELIHOOD between gaussian and log-normal (one per data point). Furthermore, the evolution of the estimated average over an infinite number of replicas is driven by DFTILDE.
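The GENERIC variant could be selected as sketched below. The example keeps the default LIKELIHOOD=GAUSS and the default DFTILDE=0.1 but spells them out for clarity; LIKELIHOOD=LOGN would select the log-normal alternative. All other values are illustrative placeholders.

```plumed
rdc: RDC ...
   SCALE=0.0001
   GYROM=-72.5388
   ATOMS1=22,23
   ATOMS2=25,27
   ATOMS3=29,31
   ATOMS4=33,34
...

# GENERIC uses two distinct noises: a gaussian one for the replica averaging
# and the one chosen with LIKELIHOOD for the comparison with the data.
# DFTILDE is the fraction of sigma_mean used to evolve ftilde.
spe: METAINFERENCE ...
   ARG=rdc.*
   NOISETYPE=GENERIC
   LIKELIHOOD=GAUSS
   DFTILDE=0.1
   PARAMETERS=1.9190,2.9190,3.9190,4.9190
   SIGMA0=0.01 SIGMA_MIN=0.00001 SIGMA_MAX=3 DSIGMA=0.01
   SIGMA_MEAN0=0.001
...
```

With NOISETYPE=GENERIC the additional components acceptFT and ftilde listed in the glossary become available.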

As in Metainference theory, there are two sigma values. SIGMA_MEAN0 represents the error of calculating an average quantity using a finite set of replicas and should be set as small as possible, following the guidelines for replica-averaged simulations in the framework of the Maximum Entropy Principle. Alternatively, it can be obtained automatically using the internal sigma mean optimization introduced in [70] (OPTSIGMAMEAN=SEM); in this case sigma_mean is estimated from the maximum standard error of the mean, either over the whole simulation or over a defined time using the keyword AVERAGING. SIGMA_BIAS is an uncertainty parameter, sampled by a MC algorithm in the bounded interval defined by SIGMA_MIN and SIGMA_MAX. Its initial value is set with SIGMA0. The MC move is a random displacement of maximum value equal to DSIGMA. If the number of data points is too large and the acceptance rate drops, the MC move can be made over mutually exclusive, random subsets of size MC_CHUNKSIZE, running more than one move by setting MC_STEPS so that MC_CHUNKSIZE*MC_STEPS covers all the data points.
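The sigma-related keywords above might be combined as in the following sketch, which assumes a multi-replica run. With four data points, MC_CHUNKSIZE=2 and MC_STEPS=2 give MC_CHUNKSIZE*MC_STEPS=4, so all the data points are covered. The AVERAGING stride and the sigma values are illustrative assumptions, and SIGMA_MEAN0 is kept here only as a starting value.

```plumed
#SETTINGS NREPLICAS=2
rdc: RDC ...
   SCALE=0.0001
   GYROM=-72.5388
   ATOMS1=22,23
   ATOMS2=25,27
   ATOMS3=29,31
   ATOMS4=33,34
...

# OPTSIGMAMEAN=SEM estimates sigma_mean on the fly from the standard error of the mean,
# here over a window set with AVERAGING.
# The MC move on sigma is split into 2 moves over random subsets of 2 data points each.
spe: METAINFERENCE ...
   ARG=rdc.*
   NOISETYPE=MGAUSS
   PARAMETERS=1.9190,2.9190,3.9190,4.9190
   OPTSIGMAMEAN=SEM
   AVERAGING=200
   SIGMA_MEAN0=0.001
   SIGMA0=0.01 SIGMA_MIN=0.00001 SIGMA_MAX=3 DSIGMA=0.01
   MC_CHUNKSIZE=2 MC_STEPS=2
...
```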

Calculated and experimental data can be compared modulo a scaling factor and/or an offset using SCALEDATA and/or ADDOFFSET. Both are sampled by a MC algorithm using either a flat or a gaussian prior, selected with SCALE_PRIOR or OFFSET_PRIOR.
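A sketch of an input that samples both a scaling factor and an offset with flat priors is given below. The offset range and step are hypothetical values chosen only for illustration; FLAT is the default prior and is written out explicitly.

```plumed
rdc: RDC ...
   SCALE=0.0001
   GYROM=-72.5388
   ATOMS1=22,23
   ATOMS2=25,27
   ATOMS3=29,31
   ATOMS4=33,34
...

# SCALEDATA and ADDOFFSET each switch on MC sampling of one extra parameter,
# bounded by the corresponding *_MIN and *_MAX values.
spe: METAINFERENCE ...
   ARG=rdc.*
   NOISETYPE=MGAUSS
   PARAMETERS=1.9190,2.9190,3.9190,4.9190
   SCALEDATA SCALE_PRIOR=FLAT SCALE0=1 SCALE_MIN=0.1 SCALE_MAX=3 DSCALE=0.01
   ADDOFFSET OFFSET_PRIOR=FLAT OFFSET0=0 OFFSET_MIN=-1 OFFSET_MAX=1 DOFFSET=0.01
   SIGMA0=0.01 SIGMA_MIN=0.00001 SIGMA_MAX=3 DSIGMA=0.01
   SIGMA_MEAN0=0.001
...
```

The sampled values and their acceptance are then reported in the scale, offset and acceptScale components described in the glossary.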

Examples

In the following example we calculate a set of RDCs, take their replica average and compare it with a set of experimental values. The RDCs are compared with the experimental data up to a multiplication factor SCALE that is also sampled by MC on the fly.

rdc: RDC ...
   SCALE=0.0001
   GYROM=-72.5388
   ATOMS1=22,23
   ATOMS2=25,27
   ATOMS3=29,31
   ATOMS4=33,34
...

spe: METAINFERENCE ...
   ARG=rdc.*
   NOISETYPE=MGAUSS
   PARAMETERS=1.9190,2.9190,3.9190,4.9190
   SCALEDATA SCALE0=1 SCALE_MIN=0.1 SCALE_MAX=3 DSCALE=0.01
   SIGMA0=0.01 SIGMA_MIN=0.00001 SIGMA_MAX=3 DSIGMA=0.01
   SIGMA_MEAN0=0.001
...

In the following example, instead of using one uncertainty parameter per data point, we use a single uncertainty value in a long-tailed gaussian to account for outliers. Furthermore, the data are weighted for the bias applied to other variables of the system.

rdc: RDC ...
   SCALE=0.0001
   GYROM=-72.5388
   ATOMS1=22,23
   ATOMS2=25,27
   ATOMS3=29,31
   ATOMS4=33,34
...

cv1: TORSION ATOMS=1,2,3,4
cv2: TORSION ATOMS=2,3,4,5
mm: METAD ARG=cv1,cv2 HEIGHT=0.5 SIGMA=0.3,0.3 PACE=200 BIASFACTOR=8 WALKERS_MPI

spe: METAINFERENCE ...
   #SETTINGS NREPLICAS=2
   ARG=rdc.*,mm.bias
   REWEIGHT
   NOISETYPE=OUTLIERS
   PARAMETERS=1.9190,2.9190,3.9190,4.9190
   SCALEDATA SCALE0=1 SCALE_MIN=0.1 SCALE_MAX=3 DSCALE=0.01
   SIGMA0=0.01 SIGMA_MIN=0.00001 SIGMA_MAX=3 DSIGMA=0.01
   SIGMA_MEAN0=0.001
...

(See also RDC, PBMETAD).

Glossary of keywords and components
Description of components

By default this Action calculates the following quantities. These quantities can be referenced elsewhere in the input by using this Action's label followed by a dot and the name of the quantity required from the list below.

Quantity     Description
bias         the instantaneous value of the bias potential
sigma        uncertainty parameter
sigmaMean    uncertainty in the mean estimate
neff         effective number of replicas
acceptSigma  MC acceptance for sigma values

In addition the following quantities can be calculated by employing the keywords listed below

Quantity     Keyword    Description
acceptScale  SCALEDATA  MC acceptance for scale value
acceptFT     GENERIC    MC acceptance for general metainference f tilde value
weight       REWEIGHT   weights of the weighted average
biasDer      REWEIGHT   derivatives with respect to the bias
scale        SCALEDATA  scale parameter
offset       ADDOFFSET  offset parameter
ftilde       GENERIC    ensemble average estimator

Compulsory keywords
NOISETYPE ( default=MGAUSS ) functional form of the noise (GAUSS,MGAUSS,OUTLIERS,MOUTLIERS,GENERIC)
LIKELIHOOD ( default=GAUSS ) the likelihood for the GENERIC metainference model, GAUSS or LOGN
DFTILDE ( default=0.1 ) fraction of sigma_mean used to evolve ftilde
SCALE0 ( default=1.0 ) initial value of the scaling factor
SCALE_PRIOR ( default=FLAT ) either FLAT or GAUSSIAN
OFFSET0 ( default=0.0 ) initial value of the offset
OFFSET_PRIOR ( default=FLAT ) either FLAT or GAUSSIAN
SIGMA0 ( default=1.0 ) initial value of the uncertainty parameter
SIGMA_MIN ( default=0.0 ) minimum value of the uncertainty parameter
SIGMA_MAX ( default=10. ) maximum value of the uncertainty parameter
OPTSIGMAMEAN ( default=NONE ) Set to NONE/SEM to manually set sigma mean, or to estimate it on the fly
WRITE_STRIDE ( default=10000 ) write the status to a file every N steps, this can be used for restart/continuation

Options
NUMERICAL_DERIVATIVES ( default=off ) calculate the derivatives for these quantities numerically
NOENSEMBLE ( default=off ) don't perform any replica-averaging
REWEIGHT ( default=off ) simple REWEIGHT using the latest ARG as energy
SCALEDATA ( default=off ) Set to TRUE if you want to sample a scaling factor common to all values and replicas
ADDOFFSET ( default=off ) Set to TRUE if you want to sample an offset common to all values and replicas

ARG the input for this action is the scalar output from one or more other actions. The particular scalars that you will use are referenced using the label of the action. If the label appears on its own then it is assumed that the Action calculates a single scalar value. The value of this scalar is thus used as the input to this new action. If * or *.* appears, the scalars calculated by all the preceding actions in the input file are taken. Some actions have multi-component outputs and each component of the output has a specific label. For example a DISTANCE action labelled dist may have three components x, y and z. To take just the x component you should use dist.x; if you wish to take all three components then use dist.*. More information on the referencing of Actions can be found in the Getting Started section of the PLUMED manual. Scalar values can also be referenced using POSIX regular expressions as detailed in the section on Regular Expressions. To use this feature you must compile PLUMED with the appropriate flag. You can use multiple instances of this keyword, i.e. ARG1, ARG2, ARG3...
PARARG reference values for the experimental data, these can be provided as arguments without derivatives
PARAMETERS reference values for the experimental data
AVERAGING Stride for calculation of averaged weights and sigma_mean
SCALE_MIN minimum value of the scaling factor
SCALE_MAX maximum value of the scaling factor
DSCALE maximum MC move of the scaling factor
OFFSET_MIN minimum value of the offset
OFFSET_MAX maximum value of the offset
DOFFSET maximum MC move of the offset
REGRES_ZERO stride for regression with zero offset
DSIGMA maximum MC move of the uncertainty parameter
SIGMA_MEAN0 starting value for the uncertainty in the mean estimate
SIGMA_MAX_STEPS Number of steps used to optimise SIGMA_MAX, before that the SIGMA_MAX value is used
TEMP the system temperature - this is only needed if code doesn't pass the temperature to plumed
MC_STEPS number of MC steps
MC_CHUNKSIZE MC chunksize
STATUS_FILE write a file with all the data useful for restart/continuation of Metainference
SELECTOR name of selector
NSELECT range of values for selector [0, N-1]
RESTART allows per-action setting of restart (YES/NO/AUTO)